Title: | Cross-Validation for Model Selection |
Version: | 1.8.0 |
Description: | Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134). |
License: | MIT + file LICENSE |
URL: | https://github.com/ludvigolsen/cvms |
BugReports: | https://github.com/ludvigolsen/cvms/issues |
Depends: | R (≥ 3.5) |
Imports: | checkmate (≥ 2.0.0), data.table (≥ 1.12), dplyr (≥ 0.8.5), ggplot2, groupdata2 (≥ 2.0.5), lifecycle, lme4 (≥ 1.1-23), MuMIn (≥ 1.43.17), parameters (≥ 0.15.0), plyr, pROC (≥ 1.16.0), purrr, rearrr (≥ 0.3.4), recipes (≥ 0.1.13), rlang (≥ 0.4.7), stats, stringr, tibble (≥ 3.0.3), tidyr (≥ 1.1.2), utils |
Suggests: | AUC, covr (≥ 3.3.1), e1071 (≥ 1.7-2), furrr, ggimage (≥ 0.3.3), ggnewscale (≥ 0.5.0), knitr, merDeriv (≥ 0.2-4), nnet (≥ 7.3-20), randomForest (≥ 4.6-14), rmarkdown, rsvg, testthat (≥ 2.3.2), xpectr (≥ 0.4.3) |
VignetteBuilder: | knitr |
RdMacros: | lifecycle |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-07-07 19:36:41 UTC; au547627 |
Author: | Ludvig Renbo Olsen |
Maintainer: | Ludvig Renbo Olsen <r-pkgs@ludvigolsen.dk> |
Repository: | CRAN |
Date/Publication: | 2025-07-08 02:20:02 UTC |
cvms: A package for cross-validating regression and classification models
Description
Perform (repeated) cross-validation on a list of model formulas. Validate the best model on a validation set. Perform baseline evaluations on your test set. Generate model formulas by combining your fixed effects. Evaluate predictions from an external model.
Details
Returns results in a tibble for easy comparison, reporting and further analysis.
The main functions are: cross_validate(), cross_validate_fn(), validate(), validate_fn(), baseline(), and evaluate().
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Useful links:
https://github.com/ludvigolsen/cvms
Report bugs at https://github.com/ludvigolsen/cvms/issues
Create baseline evaluations
Description
Create a baseline evaluation of a test set.
In modelling, a baseline is a result that is meaningful to compare the results from our models to. For instance, in classification, we usually want our results to be better than random guessing. E.g. if we have three classes, we can expect an accuracy of 33.33%, as for every observation we have a 1/3 chance of guessing the correct class. So our model should achieve a higher accuracy than 33.33% before it is more useful to us than guessing.
While this expected value is often fairly straightforward to find analytically, it
only represents what we can expect on average. In reality, it's possible to get far better
results than that by guessing.
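As a quick illustration of this variance (a minimal sketch in base R, not part of the cvms API; the targets and counts are made up), we can simulate random guessing among three classes and inspect the spread of the obtained accuracies:
# Simulate random guessing among three classes
set.seed(1)
targets <- sample(c("a", "b", "c"), size = 30, replace = TRUE)
accuracies <- replicate(100, {
  guesses <- sample(c("a", "b", "c"), size = 30, replace = TRUE)
  mean(guesses == targets) # proportion of correct guesses
})
mean(accuracies) # close to 1/3 on average
max(accuracies)  # single runs can do considerably better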
baseline() (binomial, multinomial) finds the range of likely values by evaluating multiple sets of random predictions and summarizing them with a set of useful descriptors. If random guessing frequently obtains an accuracy of 40%, perhaps our model should perform better than that before we declare it better than guessing.
How
When `family` is binomial: evaluates `n` sets of random predictions against the dependent variable, along with a set of all 0 predictions and a set of all 1 predictions. See also baseline_binomial().
When `family` is multinomial: creates one-vs-all (binomial) baseline evaluations for `n` sets of random predictions against the dependent variable, along with sets of "all class x,y,z,..." predictions. See also baseline_multinomial().
When `family` is gaussian: fits baseline models (y ~ 1) on `n` random subsets of `train_data` and evaluates each model on `test_data`. Also evaluates a model fitted on all rows in `train_data`. See also baseline_gaussian().
Wrapper functions
Consider using one of the wrappers, as they are simpler to use and understand: baseline_gaussian(), baseline_multinomial(), and baseline_binomial().
Usage
baseline(
test_data,
dependent_col,
family,
train_data = NULL,
n = 100,
metrics = list(),
positive = 2,
cutoff = 0.5,
random_generator_fn = runif,
random_effects = NULL,
min_training_rows = 5,
min_training_rows_left_out = 3,
REML = FALSE,
parallel = FALSE
)
Arguments
test_data |
data.frame. |
dependent_col |
Name of dependent variable in the supplied test and training sets. |
family |
Name of family. (Character) Currently supports "gaussian", "binomial" and "multinomial". |
train_data |
data.frame. Only used when `family` is "gaussian". |
n |
Number of random samplings to perform. (Default is 100) For gaussian: the number of random subsets of `train_data` to fit baseline models on. For binomial and multinomial: the number of sets of random predictions to evaluate. |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the regression results, and list("Accuracy" = TRUE) would add the regular Accuracy metric to the classification results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "RMSE" = TRUE) would return only the RMSE metric. The list can be created with gaussian_metrics(), binomial_metrics(), or multinomial_metrics(). Also accepts the string "all". |
positive |
Level from dependent variable to predict. Either as character (preferable) or level index (1 or 2 - alphabetically). E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat". Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently. Used when calculating confusion matrix metrics and creating ROC curves. N.B. Only affects evaluation metrics, not the returned predictions. N.B. Binomial only. (Character or Integer) |
cutoff |
Threshold for predicted classes. (Numeric) N.B. Binomial only |
random_generator_fn |
Function for generating random numbers. The first argument must be the number of random numbers to generate, as no other arguments are supplied. To test the effect of using different functions, see multiclass_probability_tibble(). N.B. Multinomial only |
random_effects |
Random effects structure for the Gaussian baseline model. (Character) E.g. with "(1|session)", the baseline model becomes "y ~ 1 + (1|session)". N.B. Gaussian only |
min_training_rows |
Minimum number of rows in the random subsets of `train_data`. N.B. Gaussian only. (Integer) |
min_training_rows_left_out |
Minimum number of rows left out of the random subsets of `train_data`. I.e. a subset will maximally have the size: nrow(`train_data`) - `min_training_rows_left_out`. N.B. Gaussian only. (Integer) |
REML |
Whether to use Restricted Maximum Likelihood. (Logical) N.B. Gaussian only. |
parallel |
Whether to run the `n` evaluations in parallel. (Logical) Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
Details
Packages used:
Models
Gaussian: stats::lm, lme4::lmer
Results
Gaussian:
r2m : MuMIn::r.squaredGLMM
r2c : MuMIn::r.squaredGLMM
AIC : stats::AIC
AICc : MuMIn::AICc
BIC : stats::BIC
Binomial and Multinomial:
ROC and related metrics:
Binomial: pROC::roc
Multinomial: pROC::multiclass.roc
Value
list containing:
a tibble with summarized results (called summarized_metrics)
a tibble with random evaluations (random_evaluations)
a tibble with the summarized class level results (summarized_class_level_results) (Multinomial only)
—————————————————————-
Gaussian Results
—————————————————————-
The Summarized Results tibble contains:
Average RMSE, MAE, NRMSE(IQR), RRSE, RAE, RMSLE.
See the additional metrics (disabled by default) at ?gaussian_metrics.
The Measure column indicates the statistical descriptor used on the evaluations. The row where Measure == All_rows is the evaluation when the baseline model is trained on all rows in `train_data`.
The Training Rows column contains the aggregated number of rows used from `train_data` when fitting the baseline models.
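As a minimal sketch, assuming `bsl` holds the output of a Gaussian baseline() call, that row can be extracted like this:
# Extract the all-rows evaluation from a Gaussian baseline output
# `bsl` is assumed to be the output of baseline() with family = "gaussian"
bsl$summarized_metrics %>%
  dplyr::filter(Measure == "All_rows")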
....................................................................
The Random Evaluations tibble contains:
The non-aggregated metrics.
A nested tibble with the predictions and targets.
A nested tibble with the coefficients of the baseline models.
Number of training rows used when fitting the baseline model on the training set.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Name of fixed effect (bias term only).
Random effects structure (if specified).
—————————————————————-
Binomial Results
—————————————————————-
Based on the generated test set predictions, a confusion matrix and ROC curve are used to get the following:
ROC: AUC, Lower CI, and Upper CI
Note that the ROC curve is only computed when AUC is enabled.
Confusion Matrix: Balanced Accuracy, Accuracy, F1, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Kappa, Detection Rate, Detection Prevalence, Prevalence, and MCC (Matthews correlation coefficient).
....................................................................
The Summarized Results tibble contains:
The Measure column indicates the statistical descriptor used on the evaluations. The row where Measure == All_0 is the evaluation when all predictions are 0. The row where Measure == All_1 is the evaluation when all predictions are 1.
The aggregated metrics.
....................................................................
The Random Evaluations tibble contains:
The non-aggregated metrics.
A nested tibble with the predictions and targets.
A list of ROC curve objects (if computed).
A nested tibble with the confusion matrix. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. I.e. the level you wish to predict.
A nested Process information object with information about the evaluation.
Name of dependent variable.
—————————————————————-
Multinomial Results
—————————————————————-
Based on the generated test set predictions, one-vs-all (binomial) evaluations are performed and aggregated to get the same metrics as in the binomial results (excluding MCC, AUC, Lower CI and Upper CI), with the addition of Overall Accuracy and multiclass MCC in the summarized results. It is possible to enable multiclass AUC as well, which has been disabled by default as it is slow to calculate when there's a large set of classes.
Since we use macro-averaging, Balanced Accuracy is the macro-averaged metric, not the macro sensitivity as sometimes used.
Note: we also refer to the one-vs-all evaluations as the class level results.
....................................................................
The Summarized Results tibble contains:
Summary of the random evaluations.
How: First, the one-vs-all binomial evaluations are aggregated by repetition, then these aggregations are summarized. Besides the metrics from the binomial evaluations (see Binomial Results above), it also includes Overall Accuracy and multiclass MCC.
The Measure column indicates the statistical descriptor used on the evaluations. The Mean, Median, SD, IQR, Max, Min, NAs, and INFs measures describe the Random Evaluations tibble, while the CL_Max, CL_Min, CL_NAs, and CL_INFs measures describe the Class Level results.
The rows where Measure == All_<<class name>> are the evaluations when all the observations are predicted to be in that class.
....................................................................
The Summarized Class Level Results tibble contains:
The (nested) summarized results for each class, with the same metrics and descriptors as the Summarized Results tibble. Use tidyr::unnest on the tibble to inspect the results.
How: The one-vs-all evaluations are summarized by class.
The rows where Measure == All_0 are the evaluations when none of the observations are predicted to be in that class, while the rows where Measure == All_1 are the evaluations when all of the observations are predicted to be in that class.
....................................................................
The Random Evaluations tibble contains:
The repetition results with the same metrics as the Summarized Results tibble.
How: The one-vs-all evaluations are aggregated by repetition. If a metric contains one or more NAs in the one-vs-all evaluations, it will lead to an NA result for that repetition.
Also includes:
A nested tibble with the one-vs-all binomial evaluations (Class Level Results), including nested Confusion Matrices and the Support column, which is a count of how many observations from the class are in the test set.
A nested tibble with the predictions and targets.
A list of ROC curve objects.
A nested tibble with the multiclass confusion matrix.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other baseline functions:
baseline_binomial()
,
baseline_gaussian()
,
baseline_multinomial()
Examples
# Attach packages
library(cvms)
library(groupdata2) # partition()
library(dplyr) # %>% arrange()
library(tibble)
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(1)
# Partition data
partitions <- partition(data, p = 0.7, list_out = TRUE)
train_set <- partitions[[1]]
test_set <- partitions[[2]]
# Create baseline evaluations
# Note: usually n=100 is a good setting
# Gaussian
baseline(
test_data = test_set, train_data = train_set,
dependent_col = "score", random_effects = "(1|session)",
n = 2, family = "gaussian"
)
# Binomial
baseline(
test_data = test_set, dependent_col = "diagnosis",
n = 2, family = "binomial"
)
# Multinomial
# Create some data with multiple classes
multiclass_data <- tibble(
"target" = rep(paste0("class_", 1:5), each = 10)
) %>%
dplyr::sample_n(35)
baseline(
test_data = multiclass_data,
dependent_col = "target",
n = 4, family = "multinomial"
)
# Parallelize evaluations
# Attach doParallel and register four cores
# Uncomment:
# library(doParallel)
# registerDoParallel(4)
# Binomial
baseline(
test_data = test_set, dependent_col = "diagnosis",
n = 4, family = "binomial"
#, parallel = TRUE # Uncomment
)
# Gaussian
baseline(
test_data = test_set, train_data = train_set,
dependent_col = "score", random_effects = "(1|session)",
n = 4, family = "gaussian"
#, parallel = TRUE # Uncomment
)
# Multinomial
(mb <- baseline(
test_data = multiclass_data,
dependent_col = "target",
n = 6, family = "multinomial"
#, parallel = TRUE # Uncomment
))
# Inspect the summarized class level results
# for class_2
mb$summarized_class_level_results %>%
dplyr::filter(Class == "class_2") %>%
tidyr::unnest(Results)
# Multinomial with custom random generator function
# that creates very "certain" predictions
# (once softmax is applied)
rcertain <- function(n) {
(runif(n, min = 1, max = 100)^1.4) / 100
}
baseline(
test_data = multiclass_data,
dependent_col = "target",
n = 6, family = "multinomial",
random_generator_fn = rcertain
#, parallel = TRUE # Uncomment
)
Create baseline evaluations for binary classification
Description
Create a baseline evaluation of a test set.
In modelling, a baseline is a result that
is meaningful to compare the results from our models to. For instance, in
classification, we usually want our results to be better than random guessing.
E.g. if we have three classes, we can expect an accuracy of 33.33%, as for every observation we have a 1/3 chance of guessing the correct class. So our model should achieve a higher accuracy than 33.33% before it is more useful to us than guessing.
While this expected value is often fairly straightforward to find analytically, it
only represents what we can expect on average. In reality, it's possible to get far better
results than that by guessing.
baseline_binomial() finds the range of likely values by evaluating multiple sets of random predictions and summarizing them with a set of useful descriptors. Additionally, it evaluates a set of all 0 predictions and a set of all 1 predictions.
Usage
baseline_binomial(
test_data,
dependent_col,
n = 100,
metrics = list(),
positive = 2,
cutoff = 0.5,
parallel = FALSE
)
Arguments
test_data |
data.frame. |
dependent_col |
Name of dependent variable in the supplied test and training sets. |
n |
The number of sets of random predictions to evaluate. (Default is 100) |
metrics |
list for enabling/disabling metrics. E.g. list("Accuracy" = TRUE) would add the regular Accuracy metric to the results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "Accuracy" = TRUE) would return only the Accuracy metric. The list can be created with binomial_metrics(). Also accepts the string "all". |
positive |
Level from dependent variable to predict. Either as character (preferable) or level index (1 or 2 - alphabetically). E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat". Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently. Used when calculating confusion matrix metrics and creating ROC curves. N.B. Only affects evaluation metrics, not the returned predictions. |
cutoff |
Threshold for predicted classes. (Numeric) |
parallel |
Whether to run the `n` evaluations in parallel. (Logical) Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
Details
Packages used:
ROC and AUC: pROC::roc
Value
list containing:
a tibble with summarized results (called summarized_metrics)
a tibble with random evaluations (random_evaluations)
....................................................................
Based on the generated test set predictions, a confusion matrix and ROC curve are used to get the following:
ROC: AUC, Lower CI, and Upper CI
Note that the ROC curve is only computed when AUC is enabled.
Confusion Matrix: Balanced Accuracy, Accuracy, F1, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Kappa, Detection Rate, Detection Prevalence, Prevalence, and MCC (Matthews correlation coefficient).
....................................................................
The Summarized Results tibble contains:
The Measure column indicates the statistical descriptor used on the evaluations. The row where Measure == All_0 is the evaluation when all predictions are 0. The row where Measure == All_1 is the evaluation when all predictions are 1.
The aggregated metrics.
....................................................................
The Random Evaluations tibble contains:
The non-aggregated metrics.
A nested tibble with the predictions and targets.
A list of ROC curve objects (if computed).
A nested tibble with the confusion matrix. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. I.e. the level you wish to predict.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other baseline functions:
baseline()
,
baseline_gaussian()
,
baseline_multinomial()
Examples
# Attach packages
library(cvms)
library(groupdata2) # partition()
library(dplyr) # %>% arrange()
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(1)
# Partition data
partitions <- partition(data, p = 0.7, list_out = TRUE)
train_set <- partitions[[1]]
test_set <- partitions[[2]]
# Create baseline evaluations
# Note: usually n=100 is a good setting
baseline_binomial(
test_data = test_set,
dependent_col = "diagnosis",
n = 2
)
# Parallelize evaluations
# Attach doParallel and register four cores
# Uncomment:
# library(doParallel)
# registerDoParallel(4)
# Make sure to uncomment the parallel argument
baseline_binomial(
test_data = test_set,
dependent_col = "diagnosis",
n = 4
#, parallel = TRUE # Uncomment
)
Create baseline evaluations for regression models
Description
Create a baseline evaluation of a test set.
In modelling, a baseline is a result that is meaningful to compare the results from our models to. In regression, we want our model to be better than a model without any predictors. If our model does not perform better than such a simple model, it's unlikely to be useful.
baseline_gaussian() fits the intercept-only model (y ~ 1) on `n` random subsets of `train_data` and evaluates each model on `test_data`. Additionally, it evaluates a model fitted on all rows in `train_data`.
Usage
baseline_gaussian(
test_data,
train_data,
dependent_col,
n = 100,
metrics = list(),
random_effects = NULL,
min_training_rows = 5,
min_training_rows_left_out = 3,
REML = FALSE,
parallel = FALSE
)
Arguments
test_data |
data.frame. |
train_data |
data.frame. |
dependent_col |
Name of dependent variable in the supplied test and training sets. |
n |
The number of random subsets of `train_data` to fit baseline models on. (Default is 100) |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "RMSE" = TRUE) would return only the RMSE metric. The list can be created with gaussian_metrics(). Also accepts the string "all". |
random_effects |
Random effects structure for the baseline model. (Character) E.g. with "(1|session)", the baseline model becomes "y ~ 1 + (1|session)". |
min_training_rows |
Minimum number of rows in the random subsets of `train_data`. (Integer) |
min_training_rows_left_out |
Minimum number of rows left out of the random subsets of `train_data`. I.e. a subset will maximally have the size: nrow(`train_data`) - `min_training_rows_left_out`. (Integer) |
REML |
Whether to use Restricted Maximum Likelihood. (Logical) |
parallel |
Whether to run the `n` evaluations in parallel. (Logical) Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
Details
Packages used:
Models: stats::lm, lme4::lmer
Results
r2m : MuMIn::r.squaredGLMM
r2c : MuMIn::r.squaredGLMM
AIC : stats::AIC
AICc : MuMIn::AICc
BIC : stats::BIC
Value
list containing:
a tibble with summarized results (called summarized_metrics)
a tibble with random evaluations (random_evaluations)
....................................................................
The Summarized Results tibble contains:
Average RMSE, MAE, NRMSE(IQR), RRSE, RAE, RMSLE.
See the additional metrics (disabled by default) at ?gaussian_metrics.
The Measure column indicates the statistical descriptor used on the evaluations. The row where Measure == All_rows is the evaluation when the baseline model is trained on all rows in `train_data`.
The Training Rows column contains the aggregated number of rows used from `train_data` when fitting the baseline models.
....................................................................
The Random Evaluations tibble contains:
The non-aggregated metrics.
A nested tibble with the predictions and targets.
A nested tibble with the coefficients of the baseline models.
Number of training rows used when fitting the baseline model on the training set.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Name of fixed effect (bias term only).
Random effects structure (if specified).
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other baseline functions:
baseline()
,
baseline_binomial()
,
baseline_multinomial()
Examples
# Attach packages
library(cvms)
library(groupdata2) # partition()
library(dplyr) # %>% arrange()
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(1)
# Partition data
partitions <- partition(data, p = 0.7, list_out = TRUE)
train_set <- partitions[[1]]
test_set <- partitions[[2]]
# Create baseline evaluations
# Note: usually n=100 is a good setting
baseline_gaussian(
test_data = test_set,
train_data = train_set,
dependent_col = "score",
random_effects = "(1|session)",
n = 2
)
# Parallelize evaluations
# Attach doParallel and register four cores
# Uncomment:
# library(doParallel)
# registerDoParallel(4)
# Make sure to uncomment the parallel argument
baseline_gaussian(
test_data = test_set,
train_data = train_set,
dependent_col = "score",
random_effects = "(1|session)",
n = 4
#, parallel = TRUE # Uncomment
)
Create baseline evaluations for multiclass classification
Description
Create a baseline evaluation of a test set.
In modelling, a baseline is a result that
is meaningful to compare the results from our models to. For instance, in
classification, we usually want our results to be better than random guessing.
E.g. if we have three classes, we can expect an accuracy of 33.33%, as for every observation we have a 1/3 chance of guessing the correct class. So our model should achieve a higher accuracy than 33.33% before it is more useful to us than guessing.
While this expected value is often fairly straightforward to find analytically, it only represents what we can expect on average. In reality, it's possible to get far better results than that by guessing.
baseline_multinomial() finds the range of likely values by evaluating multiple sets of random predictions and summarizing them with a set of useful descriptors.
Technically, it creates one-vs-all (binomial) baseline evaluations for the `n` sets of random predictions and summarizes them. Additionally, sets of "all class x,y,z,..." predictions are evaluated.
Usage
baseline_multinomial(
test_data,
dependent_col,
n = 100,
metrics = list(),
random_generator_fn = runif,
parallel = FALSE
)
Arguments
test_data |
data.frame. |
dependent_col |
Name of dependent variable in the supplied test and training sets. |
n |
The number of sets of random predictions to evaluate. (Default is 100) |
metrics |
list for enabling/disabling metrics. E.g. list("Accuracy" = TRUE) would add the regular Accuracy metric to the results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "Accuracy" = TRUE) would return only the Accuracy metric. The list can be created with multinomial_metrics(). Also accepts the string "all". |
random_generator_fn |
Function for generating random numbers. The softmax function is applied to the generated numbers to transform them into probabilities. The first argument must be the number of random numbers to generate, as no other arguments are supplied. To test the effect of using different functions, see multiclass_probability_tibble(). |
parallel |
Whether to run the `n` evaluations in parallel. (Logical) Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
Details
Packages used:
Multiclass ROC curve and AUC: pROC::multiclass.roc
Value
list containing:
a tibble with summarized results (called summarized_metrics)
a tibble with random evaluations (random_evaluations)
a tibble with the summarized class level results (summarized_class_level_results)
....................................................................
Macro metrics
Based on the generated predictions, one-vs-all (binomial) evaluations are performed and aggregated to get the following macro metrics:
Balanced Accuracy, F1, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Kappa, Detection Rate, Detection Prevalence, and Prevalence.
In general, the metrics mentioned in binomial_metrics() can be enabled as macro metrics (excluding MCC, AUC, Lower CI, Upper CI, and the AIC/AICc/BIC metrics). These metrics also have a weighted average version.
N.B. we also refer to the one-vs-all evaluations as the class level results.
Multiclass metrics
In addition, the Overall Accuracy and multiclass MCC metrics are computed. Multiclass AUC can be enabled but is slow to calculate with many classes.
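A minimal sketch (base R, with hypothetical per-class values) of how the macro average and the weighted average differ:
# Hypothetical one-vs-all F1 scores and class Supports (test set counts)
f1 <- c(class_1 = 0.80, class_2 = 0.60, class_3 = 0.70)
support <- c(class_1 = 10, class_2 = 5, class_3 = 20)
mean(f1)                       # macro-averaged F1
weighted.mean(f1, w = support) # weighted average F1 (weighted by Support)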
....................................................................
The Summarized Results tibble contains:
Summary of the random evaluations.
How: The one-vs-all binomial evaluations are aggregated by repetition and summarized. Besides the metrics from the binomial evaluations, it also includes Overall Accuracy and multiclass MCC.
The Measure column indicates the statistical descriptor used on the evaluations. The Mean, Median, SD, IQR, Max, Min, NAs, and INFs measures describe the Random Evaluations tibble, while the CL_Max, CL_Min, CL_NAs, and CL_INFs measures describe the Class Level results.
The rows where Measure == All_<<class name>> are the evaluations when all the observations are predicted to be in that class.
....................................................................
The Summarized Class Level Results tibble contains:
The (nested) summarized results for each class, with the same metrics and descriptors as the Summarized Results tibble. Use tidyr::unnest on the tibble to inspect the results.
How: The one-vs-all evaluations are summarized by class.
The rows where Measure == All_0 are the evaluations when none of the observations are predicted to be in that class, while the rows where Measure == All_1 are the evaluations when all of the observations are predicted to be in that class.
....................................................................
The Random Evaluations tibble contains:
The repetition results with the same metrics as the Summarized Results tibble.
How: The one-vs-all evaluations are aggregated by repetition. If a metric contains one or more NAs in the one-vs-all evaluations, it will lead to an NA result for that repetition.
Also includes:
A nested tibble with the one-vs-all binomial evaluations (Class Level Results), including nested Confusion Matrices and the Support column, which is a count of how many observations from the class are in the test set.
A nested tibble with the predictions and targets.
A list of ROC curve objects.
A nested tibble with the multiclass confusion matrix.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other baseline functions:
baseline()
,
baseline_binomial()
,
baseline_gaussian()
Examples
# Attach packages
library(cvms)
library(groupdata2) # partition()
library(dplyr) # %>% arrange()
library(tibble)
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(1)
# Partition data
partitions <- partition(data, p = 0.7, list_out = TRUE)
train_set <- partitions[[1]]
test_set <- partitions[[2]]
# Create baseline evaluations
# Note: usually n=100 is a good setting
# Create some data with multiple classes
multiclass_data <- tibble(
"target" = rep(paste0("class_", 1:5), each = 10)
) %>%
dplyr::sample_n(35)
baseline_multinomial(
test_data = multiclass_data,
dependent_col = "target",
n = 4
)
# Parallelize evaluations
# Attach doParallel and register four cores
# Uncomment:
# library(doParallel)
# registerDoParallel(4)
# Make sure to uncomment the parallel argument
(mb <- baseline_multinomial(
test_data = multiclass_data,
dependent_col = "target",
n = 6
#, parallel = TRUE # Uncomment
))
# Inspect the summarized class level results
# for class_2
mb$summarized_class_level_results %>%
dplyr::filter(Class == "class_2") %>%
tidyr::unnest(Results)
# Multinomial with custom random generator function
# that creates very "certain" predictions
# (once softmax is applied)
rcertain <- function(n) {
(runif(n, min = 1, max = 100)^1.4) / 100
}
# Make sure to uncomment the parallel argument
baseline_multinomial(
test_data = multiclass_data,
dependent_col = "target",
n = 6,
random_generator_fn = rcertain
#, parallel = TRUE # Uncomment
)
Select metrics for binomial evaluation
Description
Enable/disable metrics for binomial evaluation. Can be supplied to the `metrics` argument in many of the cvms functions.
Note: Some functions may have slightly different defaults than the ones supplied here.
Usage
binomial_metrics(
all = NULL,
balanced_accuracy = NULL,
accuracy = NULL,
f1 = NULL,
sensitivity = NULL,
specificity = NULL,
pos_pred_value = NULL,
neg_pred_value = NULL,
auc = NULL,
lower_ci = NULL,
upper_ci = NULL,
kappa = NULL,
mcc = NULL,
detection_rate = NULL,
detection_prevalence = NULL,
prevalence = NULL,
false_neg_rate = NULL,
false_pos_rate = NULL,
false_discovery_rate = NULL,
false_omission_rate = NULL,
threat_score = NULL,
aic = NULL,
aicc = NULL,
bic = NULL
)
Arguments
all |
Enable/disable all arguments at once. (Logical) Specifying other metrics will overwrite this, which is why you can use e.g. (all = FALSE, balanced_accuracy = TRUE) to get only the Balanced Accuracy metric. |
balanced_accuracy |
Balanced Accuracy. |
accuracy |
Accuracy. |
f1 |
F1. |
sensitivity |
Sensitivity. |
specificity |
Specificity. |
pos_pred_value |
Positive Predictive Value. |
neg_pred_value |
Negative Predictive Value. |
auc |
AUC (Area Under the ROC Curve). |
lower_ci |
Lower CI (lower bound of the AUC confidence interval). |
upper_ci |
Upper CI (upper bound of the AUC confidence interval). |
kappa |
Kappa. |
mcc |
MCC (Matthews correlation coefficient). |
detection_rate |
Detection Rate. |
detection_prevalence |
Detection Prevalence. |
prevalence |
Prevalence. |
false_neg_rate |
False Negative Rate. |
false_pos_rate |
False Positive Rate. |
false_discovery_rate |
False Discovery Rate. |
false_omission_rate |
False Omission Rate. |
threat_score |
Threat Score. |
aic |
AIC. (Default: FALSE) |
aicc |
AICc. (Default: FALSE) |
bic |
BIC. (Default: FALSE) |
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions:
confusion_matrix()
,
evaluate()
,
evaluate_residuals()
,
gaussian_metrics()
,
multinomial_metrics()
Examples
# Attach packages
library(cvms)
# Enable only Balanced Accuracy
binomial_metrics(all = FALSE, balanced_accuracy = TRUE)
# Enable all but Balanced Accuracy
binomial_metrics(all = TRUE, balanced_accuracy = FALSE)
# Disable Balanced Accuracy
binomial_metrics(balanced_accuracy = FALSE)
Generate model formulas by combining predictors
Description
Create model formulas with every combination of your fixed effects, along with the dependent variable and random effects. 259,358 formulas have been precomputed with two- and three-way interactions for up to 8 fixed effects, with up to 5 included effects per formula.
Uses the + and * operators, so lower order interactions are automatically included.
Usage
combine_predictors(
dependent,
fixed_effects,
random_effects = NULL,
max_fixed_effects = 5,
max_interaction_size = 3,
max_effect_frequency = NULL
)
Arguments
dependent |
Name of dependent variable. (Character) |
fixed_effects |
Names of fixed effects. (Character) Max. limit of 8 effects when interactions are enabled. A fixed effect name cannot contain: white spaces, "*" or "+". Effects in sublists will be interchanged. This can be useful when we have multiple versions of a predictor (e.g. "b" and "log_b"). Example of interchangeable effects: list("a", list("b", "log_b"), "c"). |
random_effects |
The random effects structure. (Character) Is appended to the model formulas. |
max_fixed_effects |
Maximum number of fixed effects in a model formula. (Integer) Max. limit of 5 when interactions are enabled. |
max_interaction_size |
Maximum number of effects in an interaction. (Integer) Max. limit of 3. Use this to limit the size of the interactions. A model formula can contain multiple interactions. |
max_effect_frequency |
Maximum number of times an effect is included in a formula string. |
Value
list of model formulas.
E.g.: c("y ~ x1 + (1|z)", "y ~ x2 + (1|z)", "y ~ x1 + x2 + (1|z)", "y ~ x1 * x2 + (1|z)").
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Examples
# Attach packages
library(cvms)
# Create effect names
dependent <- "y"
fixed_effects <- c("a", "b", "c")
random_effects <- "(1|e)"
# Create model formulas
combine_predictors(
dependent, fixed_effects,
random_effects
)
# Create effect names with interchangeable effects in sublists
fixed_effects <- list("a", list("b", "log_b"), "c")
# Create model formulas
combine_predictors(
dependent, fixed_effects,
random_effects
)
Compatible formula terms
Description
162,660 pairs of compatible terms for building model formulas with up to 15 fixed effects.
Format
A data.frame with 162,660 rows and 5 variables:
- left: term, fixed effect or interaction, with fixed effects separated by "*"
- right: term, fixed effect or interaction, with fixed effects separated by "*"
- max_interaction_size: maximum interaction size in the two terms, up to 3
- num_effects: number of unique fixed effects in the two terms, up to 5
- min_num_fixed_effects: minimum number of fixed effects required to use a formula with the two terms, i.e. the index in the alphabet of the last of the alphabetically ordered effects (letters) in the two terms, so 4 if left == "A" and right == "D"
Details
A term is either a fixed effect or an interaction between fixed effects (up to three-way), where the effects are separated by the "*" operator. Two terms are compatible if they are not redundant, meaning that both add a fixed effect to the formula. E.g. as the interaction "x1 * x2 * x3" expands to "x1 + x2 + x3 + x1 * x2 + x1 * x3 + x2 * x3 + x1 * x2 * x3", the higher order interaction makes these "sub terms" redundant. Note: All terms are compatible with NA.
Effects are represented by the first fifteen capital letters.
Used to generate the model formulas for combine_predictors().
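The expansion can be verified with base R's terms() (not part of cvms), which lists the lower order "sub terms" produced by the * operator:
# Base R check of how `*` expands into lower order terms
# (terms() writes interactions with ":" rather than "*")
attr(terms(y ~ x1 * x2 * x3), "term.labels")
#> [1] "x1" "x2" "x3" "x1:x2" "x1:x3" "x2:x3" "x1:x2:x3"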
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Create a confusion matrix
Description
Creates a confusion matrix from targets and predictions. Calculates associated metrics.
Multiclass results are based on one-vs-all evaluations.
Both regular averaging and weighted averaging are available. Also calculates the Overall Accuracy.
Note: In most cases you should use evaluate() instead. It has additional metrics and works in magrittr pipes (e.g. %>%) and with dplyr::group_by(). confusion_matrix() is more lightweight and may be preferred in programming when you don't need the extra stuff in evaluate().
Usage
confusion_matrix(
targets,
predictions,
metrics = list(),
positive = 2,
c_levels = NULL,
do_one_vs_all = TRUE,
parallel = FALSE
)
Arguments
targets |
Vector with true classes. |
predictions |
Vector with predicted classes. |
metrics |
list for enabling/disabling metrics. E.g. list("Accuracy" = TRUE) would add the regular Accuracy metric to the results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "Accuracy" = TRUE) would return only the Accuracy metric. The list can be created with binomial_metrics() or multinomial_metrics(). Also accepts the string "all". |
positive |
Level from `c_levels` to predict. Either as character (preferable) or level index (1 or 2 - alphabetically). E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat". Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently. |
c_levels |
Vector with the levels of the classes. N.B. the levels are sorted alphabetically. When NULL, the levels of `targets` are used. |
do_one_vs_all |
Whether to perform one-vs-all evaluations when working with more than 2 classes (multiclass). If you are only interested in the confusion matrix, this allows you to skip most of the metric calculations. |
parallel |
Whether to perform the one-vs-all evaluations in parallel. (Logical) N.B. This only makes sense when you have a lot of classes or a very large dataset. Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
Details
The following formulas are used for calculating the metrics:
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
Pos Pred Value = TP / (TP + FP)
Neg Pred Value = TN / (TN + FN)
Balanced Accuracy = (Sensitivity + Specificity) / 2
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Overall Accuracy = Correct / (Correct + Incorrect)
F1 = 2 * Pos Pred Value * Sensitivity / (Pos Pred Value + Sensitivity)
MCC = ((TP * TN) - (FP * FN)) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
Note for MCC: the formula shown is for the binary case. When the denominator is 0, we set it to 1 to avoid NaN. See the metrics vignette for the multiclass version.
Detection Rate = TP / (TP + FN + TN + FP)
Detection Prevalence = (TP + FP) / (TP + FN + TN + FP)
Threat Score = TP / (TP + FN + FP)
False Neg Rate = 1 - Sensitivity
False Pos Rate = 1 - Specificity
False Discovery Rate = 1 - Pos Pred Value
False Omission Rate = 1 - Neg Pred Value
For Kappa the counts (TP, TN, FP, FN) are normalized to percentages (summing to 1). Then the following is calculated:
p_observed = TP + TN
p_expected = (TN + FP) * (TN + FN) + (FN + TP) * (FP + TP)
Kappa = (p_observed - p_expected) / (1 - p_expected)
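As a minimal sketch (base R, with hypothetical counts), the formulas above can be applied to a single 2x2 confusion matrix like so:
# Hypothetical 2x2 confusion matrix counts
TP <- 20; TN <- 15; FP <- 5; FN <- 10
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
balanced_accuracy <- (sensitivity + specificity) / 2
mcc <- ((TP * TN) - (FP * FN)) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
# Kappa: normalize the counts to percentages (summing to 1) first
total <- TP + TN + FP + FN
tp <- TP / total; tn <- TN / total; fp <- FP / total; fn <- FN / total
p_observed <- tp + tn
p_expected <- (tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)
kappa <- (p_observed - p_expected) / (1 - p_expected)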
Value
tibble with:
Nested confusion matrix (tidied version)
Nested confusion matrix (table)
The Positive Class.
Multiclass only: Nested Class Level Results with the two-class metrics, the nested confusion matrices, and the Support metric, which is a count of the class in the target column and is used for the weighted average metrics.
The following metrics are available (see `metrics`):
Two classes or more
Metric | Name | Default |
Balanced Accuracy | "Balanced Accuracy" | Enabled |
Accuracy | "Accuracy" | Disabled |
F1 | "F1" | Enabled |
Sensitivity | "Sensitivity" | Enabled |
Specificity | "Specificity" | Enabled |
Positive Predictive Value | "Pos Pred Value" | Enabled |
Negative Predictive Value | "Neg Pred Value" | Enabled |
Kappa | "Kappa" | Enabled |
Matthews Correlation Coefficient | "MCC" | Enabled |
Detection Rate | "Detection Rate" | Enabled |
Detection Prevalence | "Detection Prevalence" | Enabled |
Prevalence | "Prevalence" | Enabled |
False Negative Rate | "False Neg Rate" | Disabled |
False Positive Rate | "False Pos Rate" | Disabled |
False Discovery Rate | "False Discovery Rate" | Disabled |
False Omission Rate | "False Omission Rate" | Disabled |
Threat Score | "Threat Score" | Disabled |
The Name column refers to the name used in the package. This is the name in the output and when enabling/disabling in `metrics`.
Three classes or more
The metrics mentioned above (excluding MCC) have a weighted average version (disabled by default; weighted by the Support). In order to enable a weighted metric, prefix the metric name with "Weighted " when specifying `metrics`. E.g. metrics = list("Weighted Accuracy" = TRUE).
Metric | Name | Default |
Overall Accuracy | "Overall Accuracy" | Enabled |
Weighted * | "Weighted *" | Disabled |
Multiclass MCC | "MCC" | Enabled |
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions:
binomial_metrics()
,
evaluate()
,
evaluate_residuals()
,
gaussian_metrics()
,
multinomial_metrics()
Examples
# Attach cvms
library(cvms)
# Two classes
# Create targets and predictions
targets <- c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
predictions <- c(1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0)
# Create confusion matrix with default metrics
cm <- confusion_matrix(targets, predictions)
cm
cm[["Confusion Matrix"]]
cm[["Table"]]
# Three classes
# Create targets and predictions
targets <- c(0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0)
predictions <- c(2, 1, 0, 2, 0, 1, 1, 2, 0, 1, 2, 0, 2)
# Create confusion matrix with default metrics
cm <- confusion_matrix(targets, predictions)
cm
cm[["Confusion Matrix"]]
cm[["Table"]]
# Enabling weighted accuracy
# Create confusion matrix with Weighted Accuracy enabled
cm <- confusion_matrix(targets, predictions,
metrics = list("Weighted Accuracy" = TRUE)
)
cm
Cross-validate regression models for model selection
Description
Cross-validate one or multiple linear or logistic regression models at once. Perform repeated cross-validation. Returns results in a tibble for easy comparison, reporting and further analysis.
See cross_validate_fn() for use with custom model functions.
Usage
cross_validate(
data,
formulas,
family,
fold_cols = ".folds",
control = NULL,
REML = FALSE,
cutoff = 0.5,
positive = 2,
metrics = list(),
preprocessing = NULL,
rm_nc = FALSE,
parallel = FALSE,
verbose = FALSE,
link = deprecated(),
models = deprecated(),
model_verbose = deprecated()
)
Arguments
data |
data.frame. Must include one or more grouping factors for identifying folds - as made with groupdata2::fold(). |
formulas |
Model formulas as strings. (Character) E.g. c("y ~ x", "y ~ z"). Can contain random effects. E.g. c("y ~ x + (1|r)"). |
family |
Name of the family. (Character) Currently supports "gaussian" and "binomial". See cross_validate_fn() for use with other model types. |
fold_cols |
Name(s) of grouping factor(s) for identifying folds. (Character) Include names of multiple grouping factors for repeated cross-validation. |
control |
Construct control structures for mixed model fitting (with lme4::lmer and lme4::glmer). See lme4::lmerControl and lme4::glmerControl. |
REML |
Restricted Maximum Likelihood. (Logical) |
cutoff |
Threshold for predicted classes. (Numeric) N.B. Binomial models only |
positive |
Level from dependent variable to predict. Either as character (preferable) or level index (1 or 2 - alphabetically). E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat". Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently. Used when calculating confusion matrix metrics and creating ROC curves. The Process column in the output can be used to verify this setting. N.B. Only affects evaluation metrics, not the model training or returned predictions. N.B. Binomial models only. |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the regression results, and list("Accuracy" = TRUE) would add the regular Accuracy metric to the classification results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "RMSE" = TRUE) would return only the RMSE metric. The list can be created with gaussian_metrics() or binomial_metrics(). Also accepts the string "all". |
preprocessing |
Name of preprocessing to apply. Available preprocessings are: "standardize", "range", "scale" and "center". The preprocessing parameters (e.g. mean, SD) are extracted from the training folds and applied to both the training and test folds, and are returned in a nested tibble for inspection. N.B. The preprocessings should not affect the results to a noticeable degree, although "range" can be more sensitive to extreme values. |
rm_nc |
Remove non-converged models from output. (Logical) |
parallel |
Whether to cross-validate the list of models in parallel. (Logical) Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
verbose |
Whether to message process information like the number of model instances to fit and which model function was applied. (Logical) |
link, models, model_verbose |
Deprecated. |
Details
Packages used:
Models
Gaussian: stats::lm, lme4::lmer
Binomial: stats::glm, lme4::glmer
Results
Shared
AIC : stats::AIC
AICc : MuMIn::AICc
BIC : stats::BIC
Gaussian
r2m : MuMIn::r.squaredGLMM
r2c : MuMIn::r.squaredGLMM
Binomial
ROC and AUC : pROC::roc
Value
tibble with results for each model.
Shared across families
A nested tibble with coefficients of the models from all iterations.
Number of total folds.
Number of fold columns.
Count of convergence warnings. Consider discarding models that did not converge on all iterations. Note: you might still see results, but these should be taken with a grain of salt!
Count of other warnings. These are warnings without keywords such as "convergence".
Count of Singular Fit messages. See lme4::isSingular for more information.
Nested tibble with the warnings and messages caught for each model.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Names of fixed effects.
Names of random effects, if any.
Nested tibble with preprocessing parameters, if any.
—————————————————————-
Gaussian Results
—————————————————————-
Average RMSE, MAE, NRMSE(IQR), RRSE, RAE, RMSLE, AIC, AICc, and BIC of all the iterations*, omitting potential NAs from non-converged iterations. Note that the Information Criterion metrics (AIC, AICc, and BIC) are also averages.
See the additional metrics (disabled by default) at ?gaussian_metrics.
A nested tibble with the predictions and targets.
A nested tibble with the non-averaged results from all iterations.
* In repeated cross-validation, the metrics are first averaged for each fold column (repetition) and then averaged again.
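A minimal sketch of that two-stage averaging (dplyr; `fold_results` is a hypothetical tibble of per-fold RMSE scores, not an object returned by cvms):
# Two-stage averaging: within repetitions, then across repetitions
library(dplyr)
fold_results <- tibble::tibble(
  fold_col = rep(c(".folds_1", ".folds_2"), each = 4),
  fold = rep(1:4, times = 2),
  RMSE = c(10.2, 9.8, 11.1, 10.5, 10.0, 10.9, 9.7, 10.4)
)
fold_results %>%
  group_by(fold_col) %>%
  summarise(RMSE = mean(RMSE)) %>% # average within each repetition
  summarise(RMSE = mean(RMSE))     # then average across repetitions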
—————————————————————-
Binomial Results
—————————————————————-
Based on the collected predictions from the test folds*, a confusion matrix and a ROC curve are created to get the following:
ROC: AUC, Lower CI, and Upper CI
Confusion Matrix: Balanced Accuracy, F1, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Kappa, Detection Rate, Detection Prevalence, Prevalence, and MCC (Matthews correlation coefficient).
See the additional metrics (disabled by default) at ?binomial_metrics.
Also includes:
A nested tibble with predictions, predicted classes (depends on cutoff), and the targets. Note that the predictions are not necessarily of the specified positive class, but of the model's positive class (second level of dependent variable, alphabetically).
The pROC::roc ROC curve object(s).
A nested tibble with the confusion matrix/matrices. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. I.e. the level you wish to predict.
A nested tibble with the results from all fold columns.
The name of the Positive Class.
* In repeated cross-validation, an evaluation is made per fold column (repetition) and averaged.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Benjamin Hugh Zachariae
See Also
Other validation functions:
cross_validate_fn()
,
validate()
,
validate_fn()
Examples
# Attach packages
library(cvms)
library(groupdata2) # fold()
library(dplyr) # %>% arrange()
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(7)
# Fold data
data <- fold(
data,
k = 4,
cat_col = "diagnosis",
id_col = "participant"
) %>%
arrange(.folds)
#
# Cross-validate a single model
#
# Gaussian
cross_validate(
data,
formulas = "score~diagnosis",
family = "gaussian",
REML = FALSE
)
# Binomial
cross_validate(
data,
formulas = "diagnosis~score",
family = "binomial"
)
#
# Cross-validate multiple models
#
formulas <- c(
"score~diagnosis+(1|session)",
"score~age+(1|session)"
)
cross_validate(
data,
formulas = formulas,
family = "gaussian",
REML = FALSE
)
#
# Use parallelization
#
# Attach doParallel and register four cores
# Uncomment:
# library(doParallel)
# registerDoParallel(4)
# Cross-validate a list of model formulas in parallel
# Make sure to uncomment the parallel argument
cross_validate(
data,
formulas = formulas,
family = "gaussian"
#, parallel = TRUE # Uncomment
)
Cross-validate custom model functions for model selection
Description
Cross-validate your model function with one or multiple model formulas at once. Perform repeated cross-validation. Preprocess the train/test split within the cross-validation. Perform hyperparameter tuning with grid search. Returns results in a tibble for easy comparison, reporting and further analysis.
Compared to cross_validate(), this function allows you to supply a custom model function, a predict function, a preprocess function and the hyperparameter values to cross-validate.
Supports regression and classification (binary and multiclass). See `type`.
Note that some metrics may not be computable for some types of model objects.
Usage
cross_validate_fn(
data,
formulas,
type,
model_fn,
predict_fn,
preprocess_fn = NULL,
preprocess_once = FALSE,
hyperparameters = NULL,
fold_cols = ".folds",
cutoff = 0.5,
positive = 2,
metrics = list(),
rm_nc = FALSE,
parallel = FALSE,
verbose = TRUE
)
Arguments
data |
data.frame. Must include one or more grouping factors for identifying folds - as made with groupdata2::fold(). |
formulas |
Model formulas as strings. (Character) Will be converted to formula objects before being passed to `model_fn`. E.g. c("y ~ x", "y ~ z"). Can contain random effects. E.g. c("y ~ x + (1|r)"). |
type |
Type of evaluation to perform: "gaussian" for regression (like linear regression), "binomial" for binary classification, or "multinomial" for multiclass classification. |
model_fn |
Model function that returns a fitted model object. Will usually wrap an existing model function like e1071::svm. Must have the following function arguments: function(train_data, formula, hyperparameters). |
predict_fn |
Function for predicting the targets in the test folds/sets using the fitted model object. Will usually wrap stats::predict(). Must have the following function arguments: function(test_data, model, formula, hyperparameters, train_data). Must return predictions in the following formats, depending on `type`:
Binomial: vector or one-column matrix/data.frame with probabilities (0-1) of the second class, alphabetically.
N.B. When unsure whether a model type produces probabilities based off the alphabetic order of your classes, using 0 and 1 as classes in the dependent variable instead of the class names should increase the chance of getting probabilities of the right class.
Gaussian: vector or one-column matrix/data.frame with the predicted values.
Multinomial: data.frame with one column per class containing probabilities of the class. Column names should match how the class names are written in the target column. |
preprocess_fn |
Function for preprocessing the training and test sets. Can, for instance, be used to standardize both the training and test sets with the scaling and centering parameters from the training set. Must have the following function arguments: function(train_data, test_data, formula, hyperparameters). Must return a named list with the preprocessed `train_data` and `test_data`. It may also contain a tibble with the parameters used in preprocessing, which is then included in the output for inspection. Additional elements in the returned list are currently ignored. N.B. When `preprocess_once` is enabled, the `formula` and `hyperparameters` arguments passed to the function may be NULL. |
preprocess_once |
Whether to apply the preprocessing once (ignoring the formula and hyperparameters arguments in `preprocess_fn`) or for every model separately. (Logical) When preprocessing does not depend on the current formula or hyperparameters, we can do the preprocessing of each train/test split once, to save time. This may require holding a lot more data in memory though, why it is not the default setting. |
hyperparameters |
Either a named list with hyperparameter values to combine in a grid or a data.frame with one row per hyperparameter combination (a small grid-construction sketch follows this argument list).
Named list for grid search
Add the special ".n" element to sample a given number of the combinations (as in the svm example below). E.g. list(".n" = 4, "lrn_rate" = c(0.1, 0.01), "h_layers" = c(10, 100, 1000), "drop_out" = c(0.65, 0.63))
Data.frame with specific combinations
E.g.
lrn_rate | h_layers | drop_out |
0.1 | 10 | 0.65 |
0.1 | 1000 | 0.65 |
0.01 | 1000 | 0.63 |
... | ... | ... |
fold_cols |
Name(s) of grouping factor(s) for identifying folds. (Character) Include names of multiple grouping factors for repeated cross-validation. |
cutoff |
Threshold for predicted classes. (Numeric) N.B. Binomial models only |
positive |
Level from dependent variable to predict. Either as character (preferable) or level index (1 or 2 - alphabetically). E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat". Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently. Used when calculating confusion matrix metrics and creating ROC curves. The Process column in the output can be used to verify this setting. N.B. Only affects evaluation metrics, not the model training or returned predictions. N.B. Binomial models only. |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the regression results, and list("Accuracy" = TRUE) would add the regular Accuracy metric to the classification results. Default values (TRUE/FALSE) will be used for the remaining available metrics. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, so, for instance, list("all" = FALSE, "RMSE" = TRUE) would return only the RMSE metric. The list can be created with gaussian_metrics(), binomial_metrics(), or multinomial_metrics(). Also accepts the string "all". |
rm_nc |
Remove non-converged models from output. (Logical) |
parallel |
Whether to cross-validate the list of models in parallel. (Logical) Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel. |
verbose |
Whether to message process information like the number of model instances to fit. (Logical) |
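As a sketch of the two `hyperparameters` formats, base R's expand.grid() can build the data.frame form, while the named-list form mirrors the svm example below (values borrowed from that example):
# data.frame form: one row per hyperparameter combination
hparams_grid <- expand.grid(
  "kernel" = c("linear", "radial"),
  "cost" = c(1, 5, 10),
  stringsAsFactors = FALSE
)
# Named-list form: full grid, with ".n" sampling 4 of the 6 combinations
hparams_list <- list(
  ".n" = 4,
  "kernel" = c("linear", "radial"),
  "cost" = c(1, 5, 10)
)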
Details
Packages used:
Results
Shared
AIC : stats::AIC
AICc : MuMIn::AICc
BIC : stats::BIC
Gaussian
r2m : MuMIn::r.squaredGLMM
r2c : MuMIn::r.squaredGLMM
Binomial and Multinomial
ROC and related metrics:
Binomial: pROC::roc
Multinomial: pROC::multiclass.roc
Value
tibble with results for each model.
N.B. The Fold column in the nested tibbles contains the test fold in that train/test split.
Shared across families
A nested tibble with coefficients of the models from all iterations. The coefficients are extracted from the model object with parameters::model_parameters() or coef() (with some restrictions on the output). If these attempts fail, a default coefficients tibble filled with NAs is returned.
Nested tibble with the used preprocessing parameters, if a passed preprocess_fn returns the parameters in a tibble.
Number of total folds.
Number of fold columns.
Count of convergence warnings, using a limited set of keywords (e.g. "convergence"). If a convergence warning does not contain one of these keywords, it will be counted with other warnings. Consider discarding models that did not converge on all iterations. Note: you might still see results, but these should be taken with a grain of salt!
Nested tibble with the warnings and messages caught for each model.
A nested Process information object with information about the evaluation.
Name of dependent variable.
Names of fixed effects.
Names of random effects, if any.
—————————————————————-
Gaussian Results
—————————————————————-
Average RMSE, MAE, NRMSE(IQR), RRSE, RAE, RMSLE of all the iterations*, omitting potential NAs from non-converged iterations.
See the additional metrics (disabled by default) at ?gaussian_metrics.
A nested tibble with the predictions and targets.
A nested tibble with the non-averaged results from all iterations.
* In repeated cross-validation, the metrics are first averaged for each fold column (repetition) and then averaged again.
—————————————————————-
Binomial Results
—————————————————————-
Based on the collected predictions from the test folds*, a confusion matrix and a ROC curve are created to get the following:
ROC: AUC, Lower CI, and Upper CI
Confusion Matrix: Balanced Accuracy, F1, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Kappa, Detection Rate, Detection Prevalence, Prevalence, and MCC (Matthews correlation coefficient).
See the additional metrics (disabled by default) at ?binomial_metrics.
Also includes:
A nested tibble with predictions, predicted classes (depends on cutoff), and the targets. Note that the predictions are not necessarily of the specified positive class, but of the model's positive class (second level of dependent variable, alphabetically).
The pROC::roc ROC curve object(s).
A nested tibble with the confusion matrix/matrices. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. I.e. the level you wish to predict.
A nested tibble with the results from all fold columns.
The name of the Positive Class.
* In repeated cross-validation, an evaluation is made per fold column (repetition) and averaged.
—————————————————————-
Multinomial Results
—————————————————————-
For each class, a one-vs-all binomial evaluation is performed. This creates a Class Level Results tibble containing the same metrics as the binomial results described above (excluding MCC, AUC, Lower CI and Upper CI), along with a count of the class in the target column (Support). These metrics are used to calculate the macro-averaged metrics. The nested class level results tibble is also included in the output tibble, and could be reported along with the macro and overall metrics.
The output tibble contains the macro and overall metrics. The metrics that share their name with the metrics in the nested class level results tibble are averages of those metrics (note: does not remove NAs before averaging). In addition to these, it also includes the Overall Accuracy and the multiclass MCC.
Note: Balanced Accuracy is the macro-averaged metric, not the macro sensitivity as sometimes used!
Other available metrics (disabled by default, see `metrics`): Accuracy, multiclass AUC, Weighted Balanced Accuracy, Weighted Accuracy, Weighted F1, Weighted Sensitivity, Weighted Specificity, Weighted Pos Pred Value, Weighted Neg Pred Value, Weighted Kappa, Weighted Detection Rate, Weighted Detection Prevalence, and Weighted Prevalence. Note that the "Weighted" average metrics are weighted by the Support.
Also includes:
A nested tibble with the predictions, predicted classes, and targets.
A list of ROC curve objects when AUC is enabled.
A nested tibble with the multiclass Confusion Matrix.
Class Level Results
Besides the binomial evaluation metrics and the Support, the nested class level results tibble also contains a nested tibble with the Confusion Matrix from the one-vs-all evaluation. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. In our case, 1 is the current class and 0 represents all the other classes together.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other validation functions:
cross_validate()
,
validate()
,
validate_fn()
Examples
# Attach packages
library(cvms)
library(groupdata2) # fold()
library(dplyr) # %>% arrange() mutate()
# Note: More examples of custom functions can be found at:
# model_fn: model_functions()
# predict_fn: predict_functions()
# preprocess_fn: preprocess_functions()
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(7)
# Fold data
data <- fold(
data,
k = 4,
cat_col = "diagnosis",
id_col = "participant"
) %>%
mutate(diagnosis = as.factor(diagnosis)) %>%
arrange(.folds)
# Cross-validate multiple formulas
formulas_gaussian <- c(
"score ~ diagnosis",
"score ~ age"
)
formulas_binomial <- c(
"diagnosis ~ score",
"diagnosis ~ age"
)
#
# Gaussian
#
# Create model function that returns a fitted model object
lm_model_fn <- function(train_data, formula, hyperparameters) {
lm(formula = formula, data = train_data)
}
# Create predict function that returns the predictions
lm_predict_fn <- function(test_data, model, formula,
hyperparameters, train_data) {
stats::predict(
object = model,
newdata = test_data,
type = "response",
allow.new.levels = TRUE
)
}
# Cross-validate the model function
cross_validate_fn(
data,
formulas = formulas_gaussian,
type = "gaussian",
model_fn = lm_model_fn,
predict_fn = lm_predict_fn,
fold_cols = ".folds"
)
#
# Binomial
#
# Create model function that returns a fitted model object
glm_model_fn <- function(train_data, formula, hyperparameters) {
glm(formula = formula, data = train_data, family = "binomial")
}
# Create predict function that returns the predictions
glm_predict_fn <- function(test_data, model, formula,
hyperparameters, train_data) {
stats::predict(
object = model,
newdata = test_data,
type = "response",
allow.new.levels = TRUE
)
}
# Cross-validate the model function
cross_validate_fn(
data,
formulas = formulas_binomial,
type = "binomial",
model_fn = glm_model_fn,
predict_fn = glm_predict_fn,
fold_cols = ".folds"
)
#
# Support Vector Machine (svm)
# with hyperparameter tuning
#
# Only run if the `e1071` package is installed
if (requireNamespace("e1071", quietly = TRUE)){
# Create model function that returns a fitted model object
# We use the hyperparameters arg to pass in the kernel and cost values
svm_model_fn <- function(train_data, formula, hyperparameters) {
# Expected hyperparameters:
# - kernel
# - cost
if (!"kernel" %in% names(hyperparameters))
stop("'hyperparameters' must include 'kernel'")
if (!"cost" %in% names(hyperparameters))
stop("'hyperparameters' must include 'cost'")
e1071::svm(
formula = formula,
data = train_data,
kernel = hyperparameters[["kernel"]],
cost = hyperparameters[["cost"]],
scale = FALSE,
type = "C-classification",
probability = TRUE
)
}
# Create predict function that returns the predictions
svm_predict_fn <- function(test_data, model, formula,
hyperparameters, train_data) {
predictions <- stats::predict(
object = model,
newdata = test_data,
allow.new.levels = TRUE,
probability = TRUE
)
# Extract probabilities
probabilities <- dplyr::as_tibble(
attr(predictions, "probabilities")
)
# Return second column
probabilities[[2]]
}
# Specify hyperparameters to try
# The optional ".n" samples 4 combinations
svm_hparams <- list(
".n" = 4,
"kernel" = c("linear", "radial"),
"cost" = c(1, 5, 10)
)
# Cross-validate the model function
cv <- cross_validate_fn(
data,
formulas = formulas_binomial,
type = "binomial",
model_fn = svm_model_fn,
predict_fn = svm_predict_fn,
hyperparameters = svm_hparams,
fold_cols = ".folds"
)
cv
# The `HParams` column has the nested hyperparameter values
cv %>%
select(Dependent, Fixed, HParams, `Balanced Accuracy`, F1, AUC, MCC) %>%
tidyr::unnest(cols = "HParams") %>%
arrange(desc(`Balanced Accuracy`), desc(F1))
#
# Use parallelization
# The below examples show the speed gains when running in parallel
#
# Attach doParallel and register four cores
# Uncomment:
# library(doParallel)
# registerDoParallel(4)
# Specify hyperparameters such that we will
# cross-validate 20 models
hparams <- list(
"kernel" = c("linear", "radial"),
"cost" = 1:5
)
# Cross-validate a list of 20 models in parallel
# Make sure to uncomment the parallel argument
system.time({
cross_validate_fn(
data,
formulas = formulas_binomial,
type = "binomial",
model_fn = svm_model_fn,
predict_fn = svm_predict_fn,
hyperparameters = hparams,
fold_cols = ".folds"
#, parallel = TRUE # Uncomment
)
})
# Cross-validate a list of 20 models sequentially
system.time({
cross_validate_fn(
data,
formulas = formulas_binomial,
type = "binomial",
model_fn = svm_model_fn,
predict_fn = svm_predict_fn,
hyperparameters = hparams,
fold_cols = ".folds"
#, parallel = TRUE # Uncomment
)
})
} # closes `e1071` package check
Create a list of dynamic font color settings for plots
Description
Creates a list of dynamic font color settings for plotting with cvms
plotting functions.
Specify separate colors below and above a given value threshold.
NOTE: This is experimental and will likely change.
Usage
dynamic_font_color_settings(
threshold = NULL,
by = "counts",
all = NULL,
counts = NULL,
normalized = NULL,
row_percentages = NULL,
col_percentages = NULL,
invert_arrows = NULL
)
Arguments
threshold |
The threshold at which the color changes. |
by |
The value to check against the threshold. Either "counts" or "normalized". |
all |
Set the same color settings for all fonts at once. Takes a character vector with two hex code strings (low, high). Example: c('#000', '#fff'). |
counts , normalized , row_percentages , col_percentages |
Set color settings for the individual font. Takes a character vector with two hex code strings (low, high). Example: c('#000', '#fff'). Specifying colors for specific fonts overrides the settings specified in `all`.
|
invert_arrows |
String specifying when to invert the color of the arrow icons based on the threshold (e.g. "at_and_above"). |
Value
List of settings.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
font()
,
plot_confusion_matrix()
,
plot_metric_density()
,
plot_probabilities()
,
plot_probabilities_ecdf()
,
sum_tile_settings()
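Examples
The following is a minimal sketch of using these settings with plot_confusion_matrix() (via its `dynamic_font_colors` argument); the threshold value is arbitrary and chosen for illustration only.
# Attach cvms
library(cvms)
# Create targets and predictions data frame
data <- data.frame(
"target" = c("A", "B", "A", "B"),
"prediction" = c("B", "B", "A", "A"),
stringsAsFactors = FALSE
)
# Evaluate predictions and create a confusion matrix
evaluation <- evaluate(
data = data,
target_col = "target",
prediction_cols = "prediction",
type = "binomial"
)
# Plot with dynamic font colors:
# black font below the threshold, white font at and above it
plot_confusion_matrix(
evaluation,
dynamic_font_colors = dynamic_font_color_settings(
threshold = 25,
by = "normalized",
all = c("#000000", "#ffffff")
)
)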
Evaluate your model's performance
Description
Evaluate your model's predictions on a set of evaluation metrics.
Create ID-aggregated evaluations by multiple methods.
Currently supports regression and classification
(binary and multiclass). See `type`
.
Usage
evaluate(
data,
target_col,
prediction_cols,
type,
id_col = NULL,
id_method = "mean",
apply_softmax = FALSE,
cutoff = 0.5,
positive = 2,
metrics = list(),
include_predictions = TRUE,
parallel = FALSE,
models = deprecated()
)
Arguments
data |
data.frame with the predictions and targets, in one of the following formats:
Multinomial: When `type` is "multinomial", pass either probabilities (preferable) or classes. Probabilities: one column per class with the probability of that class, named as the classes are named in the target column. Classes: a single column of type character with the predicted classes.
Binomial: When `type` is "binomial", pass either probabilities (preferable) or classes. Probabilities: one column with the probability of the second class alphabetically (1 if classes are 0 and 1). Note: The class labels are ordered alphabetically as type character, so e.g. "100" would come before "7". Classes: a single column of type character with the predicted classes. Note: The prediction column will be converted to the probability (0.0 or 1.0) of the second class alphabetically.
Gaussian: When `type` is "gaussian", pass a single column with the predicted values.
|
target_col |
Name of the column with the true classes/values in `data`. When `type` is "multinomial", this column should contain the class names, not their indices. |
prediction_cols |
Name(s) of column(s) with the predictions. Columns can be either numeric or character depending on which format is chosen.
See `data` for the possible formats. |
type |
Type of evaluation to perform: "gaussian", "binomial" or "multinomial".
|
id_col |
Name of ID column to aggregate predictions by. N.B. Current methods assume that the target class/value is constant within the IDs. N.B. When aggregating by ID, some metrics may be disabled. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
id_method |
Method to use when aggregating predictions by ID.
Either "mean" or "majority". When `type` is "gaussian", only the "mean" method is available.
mean: The average prediction (value or probability) is calculated per ID and evaluated. This method assumes that the target class/value is constant within the IDs.
majority: The most predicted class per ID is found and evaluated. In case of a tie,
the winning classes share the probability (e.g. 0.5 each when two classes tie). This method assumes that the target class/value is constant within the IDs. |
apply_softmax |
Whether to apply the softmax function to the
prediction columns when `type` is "multinomial". (Logical)
N.B. Multinomial models only. |
cutoff |
Threshold for predicted classes. (Numeric) N.B. Binomial models only. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
positive |
Level from dependent variable to predict.
Either as character (preferable) or level index (1 or 2 - alphabetically).
E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat".
Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently.
Used when calculating confusion matrix metrics and creating ROC curves. The name of the Positive Class is included in the output.
N.B. Only affects the evaluation metrics. Does NOT affect what the probabilities are of (always the second class alphabetically).
N.B. Binomial models only. |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the results, and list("Accuracy" = TRUE) would add Accuracy.
You can enable/disable all metrics at once by including
"all" = TRUE/FALSE in the list.
The list can be created with gaussian_metrics(), binomial_metrics(), or multinomial_metrics(). Also accepts the string "all". |
include_predictions |
Whether to include the predictions
in the output as a nested tibble. (Logical) |
parallel |
Whether to run evaluations in parallel,
when `data` is grouped with dplyr::group_by(). |
models |
Deprecated. |
Details
Packages used:
Binomial and Multinomial:
ROC
and AUC
:
Binomial: pROC::roc
Multinomial: pROC::multiclass.roc
Value
—————————————————————-
Gaussian Results
—————————————————————-
tibble containing the following metrics by default:
Average RMSE, MAE, NRMSE(IQR), RRSE, RAE, and RMSLE.
See the additional metrics (disabled by default) at ?gaussian_metrics.
Also includes:
A nested tibble with the Predictions and targets.
A nested Process information object with information about the evaluation.
—————————————————————-
Binomial Results
—————————————————————-
tibble with the following evaluation metrics, based on a confusion matrix and a ROC curve fitted to the predictions:
Confusion Matrix: Balanced Accuracy, Accuracy, F1, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, Kappa, Detection Rate, Detection Prevalence, Prevalence, and MCC (Matthews correlation coefficient).
ROC: AUC, Lower CI, and Upper CI. Note that the ROC curve is only computed if AUC is enabled. See metrics.
Also includes:
A nested tibble with the predictions and targets.
A list of ROC curve objects (if computed).
A nested tibble with the confusion matrix. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class, i.e. the level you wish to predict.
A nested Process information object with information about the evaluation.
—————————————————————-
Multinomial Results
—————————————————————-
For each class, a one-vs-all binomial evaluation is performed. This creates a Class Level Results tibble containing the same metrics as the binomial results described above (excluding Accuracy, MCC, AUC, Lower CI and Upper CI), along with a count of the class in the target column (Support).
These metrics are used to calculate the macro-averaged metrics. The nested class level results tibble is also included in the output tibble, and could be reported along with the macro and overall metrics.
The output tibble contains the macro and overall metrics. The metrics that share their name with the metrics in the nested class level results tibble are averages of those metrics (note: NAs are not removed before averaging). In addition to these, it also includes the Overall Accuracy and the multiclass MCC.
Note: Balanced Accuracy is the macro-averaged metric, not the macro sensitivity as sometimes used!
Other available metrics (disabled by default, see metrics): Accuracy, multiclass AUC, Weighted Balanced Accuracy, Weighted Accuracy, Weighted F1, Weighted Sensitivity, Weighted Specificity, Weighted Pos Pred Value, Weighted Neg Pred Value, Weighted Kappa, Weighted Detection Rate, Weighted Detection Prevalence, and Weighted Prevalence.
Note that the "Weighted" average metrics are weighted by the Support.
When you have a large set of classes, consider keeping AUC disabled.
Also includes:
A nested tibble with the Predictions and targets.
A list of ROC curve objects when AUC is enabled.
A nested tibble with the multiclass Confusion Matrix.
A nested Process information object with information about the evaluation.
Class Level Results
Besides the binomial evaluation metrics and the Support, the nested class level results tibble also contains a nested tibble with the Confusion Matrix from the one-vs-all evaluation. The Pos_ columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. In our case, 1 is the current class and 0 represents all the other classes together.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions:
binomial_metrics()
,
confusion_matrix()
,
evaluate_residuals()
,
gaussian_metrics()
,
multinomial_metrics()
Examples
# Attach packages
library(cvms)
library(dplyr)
# Load data
data <- participant.scores
# Fit models
gaussian_model <- lm(age ~ diagnosis, data = data)
binomial_model <- glm(diagnosis ~ score, data = data, family = "binomial")
# Add predictions
data[["gaussian_predictions"]] <- predict(gaussian_model, data,
type = "response",
allow.new.levels = TRUE
)
data[["binomial_predictions"]] <- predict(binomial_model, data,
allow.new.levels = TRUE
)
# Gaussian evaluation
evaluate(
data = data, target_col = "age",
prediction_cols = "gaussian_predictions",
type = "gaussian"
)
# Binomial evaluation
evaluate(
data = data, target_col = "diagnosis",
prediction_cols = "binomial_predictions",
type = "binomial"
)
#
# Multinomial
#
# Create a tibble with predicted probabilities and targets
data_mc <- multiclass_probability_tibble(
num_classes = 3, num_observations = 45,
apply_softmax = TRUE, FUN = runif,
class_name = "class_",
add_targets = TRUE
)
class_names <- paste0("class_", 1:3)
# Multinomial evaluation
evaluate(
data = data_mc, target_col = "Target",
prediction_cols = class_names,
type = "multinomial"
)
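# Inspect the nested class level results
# (a minimal sketch; the `Class Level Results` column is
# described in the `Value` section above)
mn_eval <- evaluate(
data = data_mc, target_col = "Target",
prediction_cols = class_names,
type = "multinomial"
)
mn_eval[["Class Level Results"]][[1]]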
#
# ID evaluation
#
# Gaussian ID evaluation
# Note that 'age' is the same for all observations
# of a participant
evaluate(
data = data, target_col = "age",
prediction_cols = "gaussian_predictions",
id_col = "participant",
type = "gaussian"
)
# Binomial ID evaluation
evaluate(
data = data, target_col = "diagnosis",
prediction_cols = "binomial_predictions",
id_col = "participant",
id_method = "mean", # alternatively: "majority"
type = "binomial"
)
# Multinomial ID evaluation
# Add IDs and new targets (must be constant within IDs)
data_mc[["Target"]] <- NULL
data_mc[["ID"]] <- rep(1:9, each = 5)
id_classes <- tibble::tibble(
"ID" = 1:9,
"Target" = sample(x = class_names, size = 9, replace = TRUE)
)
data_mc <- data_mc %>%
dplyr::left_join(id_classes, by = "ID")
# Perform ID evaluation
evaluate(
data = data_mc, target_col = "Target",
prediction_cols = class_names,
id_col = "ID",
id_method = "mean", # alternatively: "majority"
type = "multinomial"
)
#
# Training and evaluating a multinomial model with nnet
#
# Only run if `nnet` is installed
if (requireNamespace("nnet", quietly = TRUE)){
# Create a data frame with some predictors and a target column
class_names <- paste0("class_", 1:4)
data_for_nnet <- multiclass_probability_tibble(
num_classes = 3, # Here, number of predictors
num_observations = 30,
apply_softmax = FALSE,
FUN = rnorm,
class_name = "predictor_"
) %>%
dplyr::mutate(Target = sample(
class_names,
size = 30,
replace = TRUE
))
# Train multinomial model using the nnet package
mn_model <- nnet::multinom(
"Target ~ predictor_1 + predictor_2 + predictor_3",
data = data_for_nnet
)
# Predict the targets in the dataset
# (we would usually use a test set instead)
predictions <- predict(
mn_model,
data_for_nnet,
type = "probs"
) %>%
dplyr::as_tibble()
# Add the targets
predictions[["Target"]] <- data_for_nnet[["Target"]]
# Evaluate predictions
evaluate(
data = predictions,
target_col = "Target",
prediction_cols = class_names,
type = "multinomial"
)
}
Evaluate residuals from a regression task
Description
Calculates a large set of error metrics from regression residuals.
Note: In most cases you should use evaluate()
instead.
It works in magrittr
pipes (e.g. %>%
) and with
dplyr::group_by()
.
evaluate_residuals()
is more lightweight and may be preferred in
programming when you don't need the extra stuff
in evaluate()
.
Usage
evaluate_residuals(data, target_col, prediction_col, metrics = list())
Arguments
data |
data.frame with the targets and predictions. Can be grouped with dplyr::group_by(). |
target_col |
Name of the column with the true values in `data`. |
prediction_col |
Name of column with the predicted values in `data`. |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the results, and list("TAE" = TRUE) would add Total Absolute Error.
You can enable/disable all metrics at once by including
"all" = TRUE/FALSE in the list.
The list can be created with gaussian_metrics(). Also accepts the string "all". |
Details
The metric formulas are listed in 'The Available Metrics' vignette.
Value
A tibble (data.frame) with the calculated metrics.
The following metrics are available (see `metrics`
):
Metric | Name | Default |
Mean Absolute Error | "MAE" | Enabled |
Root Mean Square Error | "RMSE" | Enabled |
Normalized RMSE (by target range) | "NRMSE(RNG)" | Disabled |
Normalized RMSE (by target IQR) | "NRMSE(IQR)" | Enabled |
Normalized RMSE (by target STD) | "NRMSE(STD)" | Disabled |
Normalized RMSE (by target mean) | "NRMSE(AVG)" | Disabled |
Relative Squared Error | "RSE" | Disabled |
Root Relative Squared Error | "RRSE" | Enabled |
Relative Absolute Error | "RAE" | Enabled |
Root Mean Squared Log Error | "RMSLE" | Enabled |
Mean Absolute Log Error | "MALE" | Disabled |
Mean Absolute Percentage Error | "MAPE" | Disabled |
Mean Squared Error | "MSE" | Disabled |
Total Absolute Error | "TAE" | Disabled |
Total Squared Error | "TSE" | Disabled |
The Name column refers to the name used in the package.
This is the name in the output and when enabling/disabling in `metrics`
.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions:
binomial_metrics()
,
confusion_matrix()
,
evaluate()
,
gaussian_metrics()
,
multinomial_metrics()
Examples
# Attach packages
library(cvms)
data <- data.frame(
"targets" = rnorm(100, 14.7, 3.6),
"predictions" = rnorm(100, 13.2, 4.6)
)
evaluate_residuals(
data = data,
target_col = "targets",
prediction_col = "predictions"
)
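# evaluate_residuals() also works in pipes and with dplyr::group_by()
# (a minimal sketch; the `group` column is made up for illustration)
library(dplyr)
data[["group"]] <- rep(c("a", "b"), each = 50)
data %>%
dplyr::group_by(group) %>%
evaluate_residuals(
target_col = "targets",
prediction_col = "predictions"
)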
Create a list of font settings for plots
Description
Creates a list of font settings for plotting with cvms plotting functions.
Some arguments can take either the value to use directly OR a function that takes one argument (vector with the values to set a font for; e.g., the counts, percentages, etc.) and returns the value(s) to use for each element. Such a function could for instance specify different font colors for different background intensities.
NOTE: This is experimental and could change.
Usage
font(
size = NULL,
color = NULL,
alpha = NULL,
nudge_x = NULL,
nudge_y = NULL,
angle = NULL,
family = NULL,
fontface = NULL,
hjust = NULL,
vjust = NULL,
lineheight = NULL,
digits = NULL,
prefix = NULL,
suffix = NULL
)
Arguments
size , color , alpha , nudge_x , nudge_y , angle , family , fontface , hjust , vjust , lineheight |
Either the value to pass on directly, OR a function that takes one argument (the values to set a font for) and returns the value(s) to use for each element (see Description).
|
digits |
Number of digits to round to. If negative, no rounding will take place. |
prefix |
A string prefix. |
suffix |
A string suffix. |
Value
List of settings.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
dynamic_font_color_settings()
,
plot_confusion_matrix()
,
plot_metric_density()
,
plot_probabilities()
,
plot_probabilities_ecdf()
,
sum_tile_settings()
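Examples
The following is a minimal sketch of creating font settings and passing them to a plotting function; the specific values are arbitrary and chosen for illustration only.
# Attach cvms
library(cvms)
# Create font settings for the counts
# Only the specified arguments are changed
counts_font <- font(
size = 5,
color = "darkblue",
suffix = " obs"
)
# Pass the settings to a plotting function, e.g.:
# plot_confusion_matrix(..., font_counts = counts_font)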
Select metrics for Gaussian evaluation
Description
Enable/disable metrics for Gaussian evaluation. Can be supplied to the
`metrics`
argument in many of the cvms
functions.
Note: Some functions may have slightly different defaults than the ones supplied here.
Usage
gaussian_metrics(
all = NULL,
rmse = NULL,
mae = NULL,
nrmse_rng = NULL,
nrmse_iqr = NULL,
nrmse_std = NULL,
nrmse_avg = NULL,
rae = NULL,
rse = NULL,
rrse = NULL,
rmsle = NULL,
male = NULL,
mape = NULL,
mse = NULL,
tae = NULL,
tse = NULL,
r2m = NULL,
r2c = NULL,
aic = NULL,
aicc = NULL,
bic = NULL
)
Arguments
all |
Enable/disable all arguments at once. (Logical)
Specifying other metrics will overwrite this, so you can e.g.
use (all = FALSE, rmse = TRUE) to enable only RMSE. |
rmse |
Root Mean Square Error. |
mae |
Mean Absolute Error. |
nrmse_rng |
Normalized Root Mean Square Error (by target range). |
nrmse_iqr |
Normalized Root Mean Square Error (by target interquartile range). |
nrmse_std |
Normalized Root Mean Square Error (by target standard deviation). |
nrmse_avg |
Normalized Root Mean Square Error (by target mean). |
rae |
Relative Absolute Error. |
rse |
Relative Squared Error. |
rrse |
Root Relative Squared Error. |
rmsle |
Root Mean Square Log Error. |
male |
Mean Absolute Log Error. |
mape |
Mean Absolute Percentage Error. |
mse |
Mean Square Error. |
tae |
Total Absolute Error |
tse |
Total Squared Error. |
r2m |
Marginal R-squared. |
r2c |
Conditional R-squared. |
aic |
Akaike Information Criterion. |
aicc |
Corrected Akaike Information Criterion. |
bic |
Bayesian Information Criterion. |
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions:
binomial_metrics()
,
confusion_matrix()
,
evaluate()
,
evaluate_residuals()
,
multinomial_metrics()
Examples
# Attach packages
library(cvms)
# Enable only RMSE
gaussian_metrics(all = FALSE, rmse = TRUE)
# Enable all but RMSE
gaussian_metrics(all = TRUE, rmse = FALSE)
# Disable RMSE
gaussian_metrics(rmse = FALSE)
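# Pass the list to the `metrics` argument of an evaluation function
# (a minimal sketch with made-up targets and predictions)
data <- data.frame(
"targets" = rnorm(20, 25, 5),
"predictions" = rnorm(20, 27, 7)
)
evaluate(
data = data,
target_col = "targets",
prediction_cols = "predictions",
type = "gaussian",
metrics = gaussian_metrics(all = FALSE, rmse = TRUE)
)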
Examples of model_fn functions
Description
Examples of model functions that can be used in
cross_validate_fn()
.
They can either be used directly or be starting points.
The update_hyperparameters()
function
updates the list of hyperparameters with default values for missing hyperparameters.
You can also specify required hyperparameters.
Usage
model_functions(name)
Arguments
name |
Name of model to get model function for, as it appears in the following list:
|
Value
A function with the following form:
function(train_data, formula, hyperparameters) {
# Return fitted model object
}
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other example functions:
predict_functions()
,
preprocess_functions()
,
update_hyperparameters()
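Examples
The following is a minimal sketch of retrieving one of the included model functions; it assumes "lm" is one of the available names.
# Attach cvms
library(cvms)
# Get an example model function
# (assuming "lm" is an available name)
lm_model_fn <- model_functions("lm")
# The returned function has the form shown in `Value` and can be
# passed to cross_validate_fn() as `model_fn`, e.g.:
# cross_validate_fn(
#   data,
#   formulas = "score ~ diagnosis",
#   type = "gaussian",
#   model_fn = lm_model_fn,
#   predict_fn = predict_functions("lm"),
#   fold_cols = ".folds"
# )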
Find the data points that were hardest to predict
Description
Finds the data points that, overall, were the most challenging to predict,
based on a prediction metric.
Usage
most_challenging(
data,
type,
obs_id_col = "Observation",
target_col = "Target",
prediction_cols = ifelse(type == "gaussian", "Prediction", "Predicted Class"),
threshold = 0.15,
threshold_is = "percentage",
metric = NULL,
cutoff = 0.5
)
Arguments
data |
Predictions can be passed as values, predicted classes or predicted probabilities:
Multinomial: When `type` is "multinomial", pass either probabilities (preferable) or classes. Probabilities: one column per class with the probability of that class, named as the classes are named in the target column. Classes: a single column of type character with the predicted classes.
Binomial: When `type` is "binomial", pass either probabilities (preferable) or classes. Probabilities: one column with the probability of the second class alphabetically ("dog" if classes are "cat" and "dog"). Note: The class labels are ordered alphabetically as type character, so e.g. "100" would come before "7". Classes: a single column of type character with the predicted classes.
Gaussian: When `type` is "gaussian", pass a single column with the predicted values.
|
type |
Type of task used to get the predictions: "gaussian", "binomial" or "multinomial".
|
obs_id_col |
Name of column with observation IDs. This will be used to aggregate the performance of each observation. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
target_col |
Name of column with the true classes/values in `data`. |
prediction_cols |
Name(s) of column(s) with the predictions. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
threshold |
Threshold to filter observations by. Depends on The Gaussianthreshold_is "percentage"(Approximate) percentage of the observations with the largest root mean square errors to return. threshold_is "score"Observations with a root mean square error larger than or equal to the Binomial, Multinomialthreshold_is "percentage"(Approximate) percentage of the observations to return with:
threshold_is "score"
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
threshold_is |
Either | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
metric |
The metric to use. If `NULL`, the default depends on the format of the prediction columns:
Binomial, Multinomial:
When one prediction column with predicted classes is passed,
the default is "Accuracy". When one or more prediction columns with predicted probabilities are passed,
the default is "MAE".
Gaussian: Ignored. Always uses "RMSE". |
cutoff |
Threshold for predicted classes. (Numeric) N.B. Binomial only. |
Value
data.frame
with the most challenging observations and their metrics.
`>=` / `<=`
denotes the threshold as score.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Examples
# Attach packages
library(cvms)
library(dplyr)
##
## Multinomial
##
# Find the most challenging data points (per classifier)
# in the predicted.musicians dataset
# which resembles the "Predictions" tibble from the evaluation results
# Passing predicted probabilities
# Observations with 30% highest MAE scores
most_challenging(
predicted.musicians,
obs_id_col = "ID",
prediction_cols = c("A", "B", "C", "D"),
type = "multinomial",
threshold = 0.30
)
# Observations with 25% highest Cross Entropy scores
most_challenging(
predicted.musicians,
obs_id_col = "ID",
prediction_cols = c("A", "B", "C", "D"),
type = "multinomial",
threshold = 0.25,
metric = "Cross Entropy"
)
# Passing predicted classes
# Observations with 30% lowest Accuracy scores
most_challenging(
predicted.musicians,
obs_id_col = "ID",
prediction_cols = "Predicted Class",
type = "multinomial",
threshold = 0.30
)
# The 40% lowest-scoring on accuracy per classifier
predicted.musicians %>%
dplyr::group_by(Classifier) %>%
most_challenging(
obs_id_col = "ID",
prediction_cols = "Predicted Class",
type = "multinomial",
threshold = 0.40
)
# Accuracy scores below 0.05
most_challenging(
predicted.musicians,
obs_id_col = "ID",
type = "multinomial",
threshold = 0.05,
threshold_is = "score"
)
##
## Binomial
##
# Subset the predicted.musicians
binom_data <- predicted.musicians %>%
dplyr::filter(Target %in% c("A","B")) %>%
dplyr::rename(Prediction = B)
# Passing probabilities
# Observations with 30% highest MAE
most_challenging(
binom_data,
obs_id_col = "ID",
type = "binomial",
prediction_cols = "Prediction",
threshold = 0.30
)
# Observations with 30% highest Cross Entropy
most_challenging(
binom_data,
obs_id_col = "ID",
type = "binomial",
prediction_cols = "Prediction",
threshold = 0.30,
metric = "Cross Entropy"
)
# Passing predicted classes
# Observations with 30% lowest Accuracy scores
most_challenging(
binom_data,
obs_id_col = "ID",
type = "binomial",
prediction_cols = "Predicted Class",
threshold = 0.30
)
##
## Gaussian
##
set.seed(1)
df <- data.frame(
"Observation" = rep(1:10, n = 3),
"Target" = rnorm(n = 30, mean = 25, sd = 5),
"Prediction" = rnorm(n = 30, mean = 27, sd = 7)
)
# The 20% highest RMSE scores
most_challenging(
df,
type = "gaussian",
threshold = 0.2
)
# RMSE scores above 9
most_challenging(
df,
type = "gaussian",
threshold = 9,
threshold_is = "score"
)
Generate a multiclass probability tibble
Description
Generate a tibble
with random numbers containing one column per specified class.
When the softmax function is applied, the numbers become probabilities that sum to 1
row-wise.
Optionally, add columns with targets and predicted classes.
Usage
multiclass_probability_tibble(
num_classes,
num_observations,
apply_softmax = TRUE,
FUN = runif,
class_name = "class_",
add_predicted_classes = FALSE,
add_targets = FALSE
)
Arguments
num_classes |
The number of classes. Also the number of columns in the |
num_observations |
The number of observations. Also the number of rows in the |
apply_softmax |
Whether to apply the |
FUN |
Function for generating random numbers. The first argument must be the number of random numbers to generate, as no other arguments are supplied. |
class_name |
The prefix for the column names. The column index is appended. |
add_predicted_classes |
Whether to add a column with the predicted classes. (Logical) The class with the highest value is the predicted class. |
add_targets |
Whether to add a column with randomly selected target classes. (Logical) |
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Examples
# Attach cvms
library(cvms)
# Create a tibble with 5 classes and 10 observations
# Apply softmax to make sure the probabilities sum to 1
multiclass_probability_tibble(
num_classes = 5,
num_observations = 10,
apply_softmax = TRUE
)
# Using the rnorm function to generate the random numbers
multiclass_probability_tibble(
num_classes = 5,
num_observations = 10,
apply_softmax = TRUE,
FUN = rnorm
)
# Add targets and predicted classes
multiclass_probability_tibble(
num_classes = 5,
num_observations = 10,
apply_softmax = TRUE,
FUN = rnorm,
add_predicted_classes = TRUE,
add_targets = TRUE
)
# Creating a custom generator function that
# exponentiates the numbers to create more "certain" predictions
rcertain <- function(n) {
(runif(n, min = 1, max = 100)^1.4) / 100
}
multiclass_probability_tibble(
num_classes = 5,
num_observations = 10,
apply_softmax = TRUE,
FUN = rcertain
)
Select metrics for multinomial evaluation
Description
Enable/disable metrics for multinomial evaluation. Can be supplied to the
`metrics`
argument in many of the cvms
functions.
Note: Some functions may have slightly different defaults than the ones supplied here.
Usage
multinomial_metrics(
all = NULL,
overall_accuracy = NULL,
balanced_accuracy = NULL,
w_balanced_accuracy = NULL,
accuracy = NULL,
w_accuracy = NULL,
f1 = NULL,
w_f1 = NULL,
sensitivity = NULL,
w_sensitivity = NULL,
specificity = NULL,
w_specificity = NULL,
pos_pred_value = NULL,
w_pos_pred_value = NULL,
neg_pred_value = NULL,
w_neg_pred_value = NULL,
auc = NULL,
kappa = NULL,
w_kappa = NULL,
mcc = NULL,
detection_rate = NULL,
w_detection_rate = NULL,
detection_prevalence = NULL,
w_detection_prevalence = NULL,
prevalence = NULL,
w_prevalence = NULL,
false_neg_rate = NULL,
w_false_neg_rate = NULL,
false_pos_rate = NULL,
w_false_pos_rate = NULL,
false_discovery_rate = NULL,
w_false_discovery_rate = NULL,
false_omission_rate = NULL,
w_false_omission_rate = NULL,
threat_score = NULL,
w_threat_score = NULL,
aic = NULL,
aicc = NULL,
bic = NULL
)
Arguments
all |
Enable/disable all arguments at once. (Logical)
Specifying other metrics will overwrite this, so you can e.g.
use (all = FALSE, balanced_accuracy = TRUE) to enable only Balanced Accuracy. |
overall_accuracy |
|
balanced_accuracy |
|
w_balanced_accuracy |
|
accuracy |
|
w_accuracy |
|
f1 |
|
w_f1 |
|
sensitivity |
|
w_sensitivity |
|
specificity |
|
w_specificity |
|
pos_pred_value |
|
w_pos_pred_value |
|
neg_pred_value |
|
w_neg_pred_value |
|
auc |
|
kappa |
|
w_kappa |
|
mcc |
Multiclass Matthews Correlation Coefficient. |
detection_rate |
|
w_detection_rate |
|
detection_prevalence |
|
w_detection_prevalence |
|
prevalence |
|
w_prevalence |
|
false_neg_rate |
|
w_false_neg_rate |
|
false_pos_rate |
|
w_false_pos_rate |
|
false_discovery_rate |
|
w_false_discovery_rate |
|
false_omission_rate |
|
w_false_omission_rate |
|
threat_score |
|
w_threat_score |
|
aic |
AIC. (Default: FALSE) |
aicc |
AICc. (Default: FALSE) |
bic |
BIC. (Default: FALSE) |
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions:
binomial_metrics()
,
confusion_matrix()
,
evaluate()
,
evaluate_residuals()
,
gaussian_metrics()
Examples
# Attach packages
library(cvms)
# Enable only Balanced Accuracy
multinomial_metrics(all = FALSE, balanced_accuracy = TRUE)
# Enable all but Balanced Accuracy
multinomial_metrics(all = TRUE, balanced_accuracy = FALSE)
# Disable Balanced Accuracy
multinomial_metrics(balanced_accuracy = FALSE)
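# Pass the list to the `metrics` argument of an evaluation function
# (a minimal sketch with random predictions)
data_mc <- multiclass_probability_tibble(
num_classes = 3, num_observations = 30,
apply_softmax = TRUE, add_targets = TRUE
)
evaluate(
data = data_mc,
target_col = "Target",
prediction_cols = paste0("class_", 1:3),
type = "multinomial",
metrics = multinomial_metrics(accuracy = TRUE)
)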
Musician groups
Description
Made-up data on 60 musicians in 4 groups for multiclass classification.
Format
A data.frame
with 60
rows and 9
variables:
- ID
Musician identifier, 60 levels
- Age
Age of the musician. Between 17 and 66 years.
- Class
The class of the musician. One of
"A"
,"B"
,"C"
, and"D"
.- Height
Height of the musician. Between
146
and196
centimeters.- Drums
Whether the musician plays drums.
0
= No,1
= Yes.- Bass
Whether the musician plays bass.
0
= No,1
= Yes.- Guitar
Whether the musician plays guitar.
0
= No,1
= Yes.- Keys
Whether the musician plays keys.
0
= No,1
= Yes.- Vocals
Whether the musician sings.
0
= No,1
= Yes.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
predicted.musicians
Participant scores
Description
Made-up experiment data with 10 participants and two diagnoses. Test scores for 3 sessions per participant, where participants improve their scores each session.
Format
A data.frame
with 30
rows and 5
variables:
- participant
participant identifier, 10 levels
- age
age of the participant, in years
- diagnosis
diagnosis of the participant, either 1 or 0
- score
test score of the participant, on a 0-100 scale
- session
testing session identifier, 1 to 3
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Plot a confusion matrix
Description
Creates a ggplot2
object representing a confusion matrix with counts,
overall percentages, row percentages and column percentages. An extra row and column with sum tiles and the
total count can be added.
The confusion matrix can be created with evaluate()
. See `Examples`
.
While this function is intended to be very flexible (hence the large number of arguments),
the defaults should work in most cases for most users. See the Examples
.
NEW: Our Plot Confusion Matrix web application allows using this function without code. Select from multiple design templates or make your own.
Usage
plot_confusion_matrix(
conf_matrix,
target_col = "Target",
prediction_col = "Prediction",
counts_col = "N",
sub_col = NULL,
class_order = NULL,
add_sums = FALSE,
add_counts = TRUE,
add_normalized = TRUE,
add_row_percentages = TRUE,
add_col_percentages = TRUE,
diag_percentages_only = FALSE,
rm_zero_percentages = TRUE,
rm_zero_text = TRUE,
add_zero_shading = TRUE,
amount_3d_effect = 1,
add_arrows = TRUE,
counts_on_top = FALSE,
palette = "Blues",
intensity_by = "counts",
intensity_lims = NULL,
intensity_beyond_lims = "truncate",
theme_fn = ggplot2::theme_minimal,
place_x_axis_above = TRUE,
rotate_y_text = TRUE,
digits = 1,
font_counts = font(),
font_normalized = font(),
font_row_percentages = font(),
font_col_percentages = font(),
dynamic_font_colors = dynamic_font_color_settings(),
arrow_size = 0.048,
arrow_color = "black",
arrow_nudge_from_text = 0.065,
tile_border_color = NA,
tile_border_size = 0.1,
tile_border_linetype = "solid",
sums_settings = sum_tile_settings(),
darkness = 0.8
)
Arguments
conf_matrix |
Confusion matrix tibble with the combinations of target and prediction levels and their counts, as created with the various evaluation functions in cvms, like evaluate().
An additional column with text for the bottom of the tiles can be passed as well (see `sub_col`).
Note: If you supply the results from evaluate() or cross_validate() directly, the confusion matrix tibble is extracted automatically, if possible.
|
target_col |
Name of column with target levels. | ||||||||||||||||
prediction_col |
Name of column with prediction levels. | ||||||||||||||||
counts_col |
Name of column with a count for each combination of the target and prediction levels. | ||||||||||||||||
sub_col |
Name of column with text to replace the bottom text
('counts' by default, or 'normalized' when `counts_on_top` is enabled).
It simply replaces the text, so all settings will still be called
e.g. `font_counts`. |
class_order |
Names of the classes in `conf_matrix` in the desired order. When `NULL`, the classes are ordered alphabetically. |
add_sums |
Add tiles with the row/column sums. Also adds a total count tile. (Logical) The appearance of these tiles can be specified in Note: Adding the sum tiles with a palette requires the | ||||||||||||||||
add_counts |
Add the counts to the middle of the tiles. (Logical) | ||||||||||||||||
add_normalized |
Normalize the counts to percentages and add to the middle of the tiles. (Logical) | ||||||||||||||||
add_row_percentages |
Add the row percentages, i.e. how big a part of its row the tile makes up. (Logical) By default, the row percentage is placed to the right of the tile, rotated 90 degrees. | ||||||||||||||||
add_col_percentages |
Add the column percentages, i.e. how big a part of its column the tile makes up. (Logical) By default, the row percentage is placed at the bottom of the tile. | ||||||||||||||||
diag_percentages_only |
Whether to only have row and column percentages in the diagonal tiles. (Logical) | ||||||||||||||||
rm_zero_percentages |
Whether to remove row and column percentages when the count is | ||||||||||||||||
rm_zero_text |
Whether to remove counts and normalized percentages when the count is | ||||||||||||||||
add_zero_shading |
Add image of skewed lines to zero-tiles. (Logical) Note: Adding the zero-shading requires the Note: For large confusion matrices, this can be very slow. Consider turning off until the final plotting. | ||||||||||||||||
amount_3d_effect |
Amount of 3D effect (tile overlay) to add.
Passed as whole number from Note: The overlay may not fit the tiles in many-class cases that haven't been tested. If the boxes do not overlap properly, simply turn it off. Note: For large confusion matrices, this can be very slow. Consider turning off until the final plotting. | ||||||||||||||||
add_arrows |
Add the arrows to the row and col percentages. (Logical) Note: Adding the arrows requires the | ||||||||||||||||
counts_on_top |
Switch the counts and normalized counts, such that the counts are on top. (Logical) | ||||||||||||||||
palette |
Color scheme. Passed directly to `palette` in ggplot2::scale_fill_distiller().
Try these palettes: "Greens", "Oranges", "Greys", "Purples", "Reds", and "Blues".
Alternatively, pass a named list with the limits of a custom gradient, e.g. list("low" = "#B1F9E8", "high" = "#239895"). |
intensity_by |
The measure that should control the color intensity of the tiles.
Either "counts", "normalized", "row_percentages", "col_percentages", or one of the transformed-counts versions ("log counts", "log2 counts", "log10 counts", "arcsinh counts").
For 'normalized', 'row_percentages', and 'col_percentages', the color limits become 0-100. For the 'log*' and 'arcsinh' versions, the log/arcsinh transformed counts are used.
Note: In 'log*' transformed counts, 0-counts are set to '0', so they won't be distinguishable from 1-counts. |
intensity_lims |
A specific range of values for the color intensity of
the tiles. Given as a numeric vector with This allows having the same intensity scale across plots for better comparison of prediction sets. | ||||||||||||||||
intensity_beyond_lims |
What to do with values beyond the `intensity_lims`, e.g. "truncate" them to the limits (the default).
|
theme_fn |
The ggplot2 theme function to apply to the plot. |
place_x_axis_above |
Move the x-axis text to the top and reverse the levels such that the "correct" diagonal goes from top left to bottom right. (Logical) | ||||||||||||||||
rotate_y_text |
Whether to rotate the y-axis text to be vertical instead of horizontal. (Logical) | ||||||||||||||||
digits |
Number of digits to round to (percentages only). Set to a negative number for no rounding. Can be set for each font individually via the | ||||||||||||||||
font_counts |
list of font settings for the counts. Can be provided with font(). |
font_normalized |
list of font settings for the normalized counts. Can be provided with font(). |
font_row_percentages |
list of font settings for the row percentages. Can be provided with font(). |
font_col_percentages |
list of font settings for the column percentages. Can be provided with font(). |
dynamic_font_colors |
A list of settings for using dynamic font colors
based on the value of the counts/normalized. Allows changing the font colors
when the background tiles are too dark, etc.
Can be provided with Individual thresholds can be set for the different fonts/values via the
Specifying colors for specific fonts overrides the "all" values for those fonts. | ||||||||||||||||
arrow_size |
Size of arrow icons. (Numeric) Is divided by | ||||||||||||||||
arrow_color |
Color of arrow icons. One of "black" and "white". |
arrow_nudge_from_text |
Distance from the percentage text to the arrow. (Numeric) | ||||||||||||||||
tile_border_color |
Color of the tile borders. Passed as `colour` to ggplot2::geom_tile(). |
tile_border_size |
Size of the tile borders. Passed as `size` to ggplot2::geom_tile(). |
tile_border_linetype |
Linetype for the tile borders. Passed as `linetype` to ggplot2::geom_tile(). |
sums_settings |
A list of settings for the appearance of the sum tiles.
Can be provided with sum_tile_settings(). |
darkness |
How dark the darkest colors should be, between 0 and 1. Technically, a lower value increases the upper limit in ggplot2::scale_fill_distiller().
|
Details
Inspired by Antoine Sachet's answer at https://stackoverflow.com/a/53612391/11832955
Value
A ggplot2
object representing a confusion matrix.
Color intensity depends on either the counts (default) or the overall percentages.
By default, each tile has the normalized count (overall percentage) and count in the middle, the column percentage at the bottom, and the row percentage to the right and rotated 90 degrees.
In the "correct" diagonal (upper left to bottom right, by default), the column percentages are the class-level sensitivity scores, while the row percentages are the class-level positive predictive values.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
dynamic_font_color_settings()
,
font()
,
plot_metric_density()
,
plot_probabilities()
,
plot_probabilities_ecdf()
,
sum_tile_settings()
Examples
# Attach cvms
library(cvms)
library(ggplot2)
# Two classes
# Create targets and predictions data frame
data <- data.frame(
"target" = c("A", "B", "A", "B", "A", "B", "A", "B",
"A", "B", "A", "B", "A", "B", "A", "A"),
"prediction" = c("B", "B", "A", "A", "A", "B", "B", "B",
"B", "B", "A", "B", "A", "A", "A", "A"),
stringsAsFactors = FALSE
)
# Evaluate predictions and create confusion matrix
evaluation <- evaluate(
data = data,
target_col = "target",
prediction_cols = "prediction",
type = "binomial"
)
# Inspect confusion matrix tibble
evaluation[["Confusion Matrix"]][[1]]
# Plot confusion matrix
# Supply confusion matrix tibble directly
plot_confusion_matrix(evaluation[["Confusion Matrix"]][[1]])
# Plot first confusion matrix in evaluate() output
plot_confusion_matrix(evaluation)
## Not run:
# Add sum tiles
plot_confusion_matrix(evaluation, add_sums = TRUE)
## End(Not run)
# Add labels to diagonal row and column percentages
# This example assumes "B" is the positive class
# but you could write anything as prefix to the percentages
plot_confusion_matrix(
evaluation,
font_row_percentages = font(prefix=c("NPV = ", "", "", "PPV = ")),
font_col_percentages = font(prefix=c("Spec = ", "", "", "Sens = "))
)
# Dynamic font colors when background becomes too dark
# Also inverts the arrow colors
plot_confusion_matrix(
evaluation[["Confusion Matrix"]][[1]],
# Black and white theme
palette = list("low"="#ffffff", "high"="#000000"),
# Increase contrast
darkness = 1.0,
# Specify colors below and above threshold
dynamic_font_colors = dynamic_font_color_settings(
threshold = 30,
by = "normalized",
# Black at low values, white at high values
all = c('#000', '#fff'),
# White arrows above threshold
invert_arrows = "at_and_above"
)
)
# Three (or more) classes
# Create targets and predictions data frame
data <- data.frame(
"target" = c("A", "B", "C", "B", "A", "B", "C",
"B", "A", "B", "C", "B", "A"),
"prediction" = c("C", "B", "A", "C", "A", "B", "B",
"C", "A", "B", "C", "A", "C"),
stringsAsFactors = FALSE
)
# Evaluate predictions and create confusion matrix
evaluation <- evaluate(
data = data,
target_col = "target",
prediction_cols = "prediction",
type = "multinomial"
)
# Inspect confusion matrix tibble
evaluation[["Confusion Matrix"]][[1]]
# Plot confusion matrix
# Supply confusion matrix tibble directly
plot_confusion_matrix(evaluation[["Confusion Matrix"]][[1]])
# Plot first confusion matrix in evaluate() output
plot_confusion_matrix(evaluation)
## Not run:
# Add sum tiles
plot_confusion_matrix(evaluation, add_sums = TRUE)
## End(Not run)
# Counts only
plot_confusion_matrix(
evaluation[["Confusion Matrix"]][[1]],
add_normalized = FALSE,
add_row_percentages = FALSE,
add_col_percentages = FALSE
)
# Change color palette to green
# Change theme to `theme_light`.
plot_confusion_matrix(
evaluation[["Confusion Matrix"]][[1]],
palette = "Greens",
theme_fn = ggplot2::theme_light
)
## Not run:
# Change colors palette to custom gradient
# with a different gradient for sum tiles
plot_confusion_matrix(
evaluation[["Confusion Matrix"]][[1]],
palette = list("low" = "#B1F9E8", "high" = "#239895"),
sums_settings = sum_tile_settings(
palette = list("low" = "#e9e1fc", "high" = "#BE94E6")
),
add_sums = TRUE
)
## End(Not run)
# The output is a ggplot2 object
# that you can add layers to
# Here we change the axis labels
plot_confusion_matrix(evaluation[["Confusion Matrix"]][[1]]) +
ggplot2::labs(x = "True", y = "Guess")
# Replace the bottom tile text
# with some information
# First extract confusion matrix
# Then add new column with text
cm <- evaluation[["Confusion Matrix"]][[1]]
cm[["Trials"]] <- c(
"(8/9)", "(3/9)", "(1/9)",
"(3/9)", "(7/9)", "(4/9)",
"(1/9)", "(2/9)", "(8/9)"
)
# Now plot with the `sub_col` argument specified
plot_confusion_matrix(cm, sub_col="Trials")
Density plot for a metric
Description
Creates a ggplot2
object with a density plot
for one of the columns in the passed data.frame
(s).
Note: In its current form, it is mainly intended as a quick way to visualize the results from cross-validations and baselines (random evaluations). It may change significantly in future versions.
Usage
plot_metric_density(
results = NULL,
baseline = NULL,
metric = "",
fill = c("darkblue", "lightblue"),
alpha = 0.6,
theme_fn = ggplot2::theme_minimal,
xlim = NULL
)
Arguments
results |
To only plot the baseline, set to NULL. |
baseline |
To only plot the results, set to NULL. |
metric |
Name of the metric column in `results` and `baseline` to plot. |
fill |
Colors of the plotted distributions.
The first color is for the baseline distribution, the second for the results distribution. |
alpha |
Transparency of the distributions (0 - 1). |
theme_fn |
The ggplot2 theme function to apply to the plot. |
xlim |
Limits for the x-axis. Can be set to NULL for no limits. E.g. c(0, 1). |
Value
A ggplot2
object with the density of a metric, possibly split
in 'Results' and 'Baseline'.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
dynamic_font_color_settings()
,
font()
,
plot_confusion_matrix()
,
plot_probabilities()
,
plot_probabilities_ecdf()
,
sum_tile_settings()
Examples
# Attach packages
library(cvms)
library(dplyr)
# We will use the musicians and predicted.musicians datasets
musicians
predicted.musicians
# Set seed
set.seed(42)
# Create baseline for targets
bsl <- baseline_multinomial(
test_data = musicians,
dependent_col = "Class",
n = 20 # Normally 100
)
# Evaluate predictions grouped by classifier and fold column
eval <- predicted.musicians %>%
dplyr::group_by(Classifier, `Fold Column`) %>%
evaluate(
target_col = "Target",
prediction_cols = c("A", "B", "C", "D"),
type = "multinomial"
)
# Plot density of the Overall Accuracy metric
plot_metric_density(
results = eval,
baseline = bsl$random_evaluations,
metric = "Overall Accuracy",
xlim = c(0,1)
)
# The bulk of classifier results are much better than
# the baseline results
Plot predicted probabilities
Description
Creates a ggplot2
line plot object with the probabilities
of either the target classes or the predicted classes.
The observations are ordered by the highest probability.
TODO line geom: average probability per observation
TODO points geom: actual probabilities per observation
The meaning of the horizontal lines depend on the settings.
These are either recall scores, precision scores,
or accuracy scores, depending on the `probability_of`
and `apply_facet`
arguments.
Usage
plot_probabilities(
data,
target_col,
probability_cols,
predicted_class_col = NULL,
obs_id_col = NULL,
group_col = NULL,
probability_of = "target",
positive = 2,
order = "centered",
theme_fn = ggplot2::theme_minimal,
color_scale = ggplot2::scale_colour_brewer(palette = "Dark2"),
apply_facet = length(probability_cols) > 1,
smoothe = FALSE,
add_points = !is.null(obs_id_col),
add_hlines = TRUE,
add_caption = TRUE,
show_x_scale = FALSE,
line_settings = list(),
smoothe_settings = list(),
point_settings = list(),
hline_settings = list(),
facet_settings = list(),
ylim = c(0, 1)
)
Arguments
data |
data.frame with target classes and predicted probabilities. For binary classification, one probability column (of the second class alphabetically); for multiclass classification, one probability column per class.
You can have multiple rows per observation ID per group. If, for instance, we have run repeated cross-validation of 3 classifiers, we would have one predicted probability per fold column per classifier.
As created with the various validation functions in cvms. |
target_col |
Name of column with target levels. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
probability_cols |
Name of columns with predicted probabilities. For binary classification, this should be one column with the probability of the second class (alphabetically). For multiclass classification, this should be one column per class.
These probabilities must sum to | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
predicted_class_col |
Name of column with predicted classes. This is required when | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
obs_id_col |
Name of column with observation identifiers for grouping the x-axis.
Use case: when you have multiple predicted probabilities per observation by a classifier (e.g. from repeated cross-validation). Can also be a grouping variable that you wish to aggregate. |
group_col |
Name of column with groups. The plot elements are split by these groups and can be identified by their color. E.g. the classifier responsible for the prediction. N.B. With more than | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
probability_of |
Whether to plot the probabilities of the
target classes ( For each row, we extract the probability of either the target class or the predicted class. Both are useful to plot, as they show the behavior of the classifier in a way a confusion matrix doesn't. One classifier might be very certain in its predictions (whether wrong or right), whereas another might be less certain. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
positive |
TODO | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
order |
How to order the probabilities. (Character) E.g. "centered" (the default). |
theme_fn |
The | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
color_scale |
E.g. the output of ggplot2::scale_colour_brewer() (the default).
N.B. The number of colors in the object's palette should be at least the same as
the number of groups in the `group_col` column. |
apply_facet |
Whether to use ggplot2::facet_wrap() faceting. (Logical)
By default, faceting is applied when there is more than one probability column (multiclass). |
smoothe |
Whether to use a smoothed line (e.g. ggplot2::geom_smooth()) instead of a regular line. (Logical) Settings can be passed via the `smoothe_settings` argument. |
add_points |
Add a point for each predicted probability.
These are grouped on the x-axis by the | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
add_hlines |
Add horizontal lines. (Logical) The meaning of these lines depends on the
`probability_of` and `apply_facet` arguments. |
add_caption |
Whether to add a caption explaining the plot. This is dynamically generated and intended as a starting point. (Logical) You can overwrite the text with | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
show_x_scale |
TODO | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
line_settings |
Named list of arguments for the line geom (ggplot2::geom_line()). Any argument not in the list will use its default value.
N.B. Ignored when `smoothe` is enabled. |
smoothe_settings |
Named list of arguments for the smoothing geom (ggplot2::geom_smooth()). Any argument not in the list will use its default value.
N.B. Only used when `smoothe` is enabled. |
point_settings |
Named list of arguments for the point geom (ggplot2::geom_point()). Any argument not in the list will use its default value. |
hline_settings |
Named list of arguments for the horizontal line geom (ggplot2::geom_hline()). Any argument not in the list will use its default value. |
facet_settings |
Named list of arguments for ggplot2::facet_wrap(). Any argument not in the list will use its default value. Commonly set arguments are `ncol` and `scales`. |
ylim |
Limits for the y-scale. |
Details
TODO
Value
A ggplot2
object with a faceted line plot. TODO
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
dynamic_font_color_settings()
,
font()
,
plot_confusion_matrix()
,
plot_metric_density()
,
plot_probabilities_ecdf()
,
sum_tile_settings()
Examples
# Attach cvms
library(cvms)
library(ggplot2)
library(dplyr)
#
# Multiclass
#
# Plot probabilities of target classes
# From repeated cross-validation of three classifiers
# plot_probabilities(
# data = predicted.musicians,
# target_col = "Target",
# probability_cols = c("A", "B", "C", "D"),
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# obs_id_col = "ID",
# probability_of = "target"
# )
# Plot probabilities of predicted classes
# From repeated cross-validation of three classifiers
# plot_probabilities(
# data = predicted.musicians,
# target_col = "Target",
# probability_cols = c("A", "B", "C", "D"),
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# obs_id_col = "ID",
# probability_of = "prediction"
# )
# Center probabilities
# plot_probabilities(
# data = predicted.musicians,
# target_col = "Target",
# probability_cols = c("A", "B", "C", "D"),
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# obs_id_col = "ID",
# probability_of = "prediction",
# order = "centered"
# )
#
# Binary
#
# Filter the predicted.musicians dataset
# binom_data <- predicted.musicians %>%
# dplyr::filter(
# Target %in% c("A", "B")
# ) %>%
# # "B" is the second class alphabetically
# dplyr::rename(Probability = B) %>%
# dplyr::mutate(`Predicted Class` = ifelse(
# Probability > 0.5, "B", "A")) %>%
# dplyr::select(-dplyr::all_of(c("A","C","D")))
# Plot probabilities of predicted classes
# From repeated cross-validation of three classifiers
# plot_probabilities(
# data = binom_data,
# target_col = "Target",
# probability_cols = "Probability",
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# obs_id_col = "ID",
# probability_of = "target"
# )
# plot_probabilities(
# data = binom_data,
# target_col = "Target",
# probability_cols = "Probability",
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# obs_id_col = "ID",
# probability_of = "prediction",
# ylim = c(0.5, 1)
# )
Plot ECDF for the predicted probabilities
Description
Plots the empirical cumulative distribution function (ECDF) for the probabilities of either the target classes or the predicted classes.
Creates a ggplot2
with the stat_ecdf()
geom.
Usage
plot_probabilities_ecdf(
data,
target_col,
probability_cols,
predicted_class_col = NULL,
obs_id_col = NULL,
group_col = NULL,
probability_of = "target",
positive = 2,
theme_fn = ggplot2::theme_minimal,
color_scale = ggplot2::scale_colour_brewer(palette = "Dark2"),
apply_facet = length(probability_cols) > 1,
add_caption = TRUE,
ecdf_settings = list(),
facet_settings = list(),
xlim = c(0, 1)
)
Arguments
data |
data.frame with target classes and predicted probabilities. For binary classification, one probability column (of the second class alphabetically); for multiclass classification, one probability column per class.
As created with the various validation functions in cvms. |
target_col |
Name of column with target levels. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
probability_cols |
Name of columns with predicted probabilities. For binary classification, this should be one column with the probability of the second class (alphabetically). For multiclass classification, this should be one column per class.
These probabilities must sum to | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
predicted_class_col |
Name of column with predicted classes. This is required when | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
obs_id_col |
Name of column with observation identifiers for averaging the
predicted probabilities per observation before computing the ECDF (when deemed meaningful).
|
group_col |
Name of column with groups. The plot elements are split by these groups and can be identified by their color. E.g. the classifier responsible for the prediction. N.B. With more than | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
probability_of |
Whether to plot the ECDF for the probabilities of the
target classes ("target") or the predicted classes ("prediction"). For each row, we extract the probability of either the target class or the predicted class. Both are useful to plot, as they show the behavior of the classifier in a way a confusion matrix doesn't. One classifier might be very certain in its predictions (whether wrong or right), whereas another might be less certain. |
positive |
Level from the dependent variable to consider the positive class. Either as character (preferable) or level index (1 or 2 - alphabetically). Binary classification only. |
theme_fn |
The ggplot2 theme function to apply to the plot. |
color_scale |
E.g. the output of ggplot2::scale_colour_brewer() or ggplot2::scale_colour_viridis_d().
N.B. The number of colors in the object's palette should be at least the same as
the number of groups in the group_col column. |
apply_facet |
Whether to use ggplot2::facet_wrap() to create a facet per probability column. (Logical)
By default, faceting is applied when there is more than one probability column (multiclass). |
add_caption |
Whether to add a caption explaining the plot. This is dynamically generated and intended as a starting point. (Logical) You can overwrite the text via ggplot2::labs(caption = "..."). |
ecdf_settings |
Named list of arguments for ggplot2::stat_ecdf(). Any argument not in the list will use the default value set by plot_probabilities_ecdf(). Common changes are to set the geom and/or pad arguments. |
facet_settings |
Named list of arguments for ggplot2::facet_wrap(). The faceting variable is set separately. Any argument not in the list will use its default value. Commonly set arguments are nrow and ncol. |
xlim |
Limits for the x-scale. |
Value
A ggplot2 object with the (possibly faceted) ECDF line plot.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
dynamic_font_color_settings()
,
font()
,
plot_confusion_matrix()
,
plot_metric_density()
,
plot_probabilities()
,
sum_tile_settings()
Examples
# Attach cvms
library(cvms)
library(ggplot2)
library(dplyr)
#
# Multiclass
#
# Plot probabilities of target classes
# From repeated cross-validation of three classifiers
# plot_probabilities_ecdf(
# data = predicted.musicians,
# target_col = "Target",
# probability_cols = c("A", "B", "C", "D"),
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# probability_of = "target"
# )
# Plot probabilities of predicted classes
# From repeated cross-validation of three classifiers
# plot_probabilities_ecdf(
# data = predicted.musicians,
# target_col = "Target",
# probability_cols = c("A", "B", "C", "D"),
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# probability_of = "prediction"
# )
#
# Binary
#
# Filter the predicted.musicians dataset
# binom_data <- predicted.musicians %>%
# dplyr::filter(
# Target %in% c("A", "B")
# ) %>%
# # "B" is the second class alphabetically
# dplyr::rename(Probability = B) %>%
# dplyr::mutate(`Predicted Class` = ifelse(
# Probability > 0.5, "B", "A")) %>%
# dplyr::select(-dplyr::all_of(c("A","C","D")))
# Plot probabilities of predicted classes
# From repeated cross-validation of three classifiers
# plot_probabilities_ecdf(
# data = binom_data,
# target_col = "Target",
# probability_cols = "Probability",
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# probability_of = "target"
# )
# plot_probabilities_ecdf(
# data = binom_data,
# target_col = "Target",
# probability_cols = "Probability",
# predicted_class_col = "Predicted Class",
# group_col = "Classifier",
# probability_of = "prediction",
# xlim = c(0.5, 1)
# )
Precomputed formulas
Description
Fixed effect combinations for model formulas with/without two- and three-way interactions. Up to eight fixed effects in total with up to five fixed effects per formula.
Format
A data.frame
with 259,358
rows and 5
variables:
- formula_: combination of fixed effects, separated by "+" and "*"
- max_interaction_size: maximum interaction size in the formula, up to 3
- max_effect_frequency: maximum count of an effect in the formula, e.g. the 3 A's in "A * B + A * C + A * D"
- num_effects: number of unique effects included in the formula
- min_num_fixed_effects: minimum number of fixed effects required to use the formula, i.e. the index in the alphabet of the last of the alphabetically ordered effects (letters) in the formula, so 4 for the formula "A + B + D"
Details
Effects are represented by the first eight capital letters.
Used by combine_predictors().
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Examples of predict_fn functions
Description
Examples of predict functions that can be used in
cross_validate_fn()
.
They can either be used directly or be starting points.
Usage
predict_functions(name)
Arguments
name |
Name of model to get predict function for, as it appears in the package's table of example model functions; its Model HParams column lists the hyperparameters used in the respective model function. |
Value
A function with the following form:
function(test_data, model, formula, hyperparameters, train_data) {
# Use model to predict test_data
# Return predictions
}
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other example functions:
model_functions()
,
preprocess_functions()
,
update_hyperparameters()
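As an illustration, a predict function in this form can simply wrap stats::predict(). The following minimal sketch mirrors the examples in validate_fn(); lm_predict_fn is a hypothetical name, not an object exported by the package:
# Predict function for a gaussian lm() model
lm_predict_fn <- function(test_data, model, formula,
                          hyperparameters, train_data) {
  stats::predict(
    object = model,
    newdata = test_data,
    type = "response",
    allow.new.levels = TRUE
  )
}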
Predicted musician groups
Description
Predictions by 3 classifiers of the 4 classes in the
musicians
dataset.
Obtained with 5-fold stratified cross-validation (3 repetitions).
The three classifiers were fit using nnet::multinom
,
randomForest::randomForest
, and e1071::svm
.
Format
A data.frame
with 540
rows and 10
variables:
- Classifier: The applied classifier. One of "nnet_multinom", "randomForest", and "e1071_svm".
- Fold Column: The fold column name. Each is a unique 5-fold split. One of ".folds_1", ".folds_2", and ".folds_3".
- Fold: The fold. 1 to 5.
- ID: Musician identifier, 60 levels.
- Target: The actual class of the musician. One of "A", "B", "C", and "D".
- A: The probability of class "A".
- B: The probability of class "B".
- C: The probability of class "C".
- D: The probability of class "D".
- Predicted Class: The predicted class; the argmax of the four probability columns.
Details
Used formula: "Class ~ Height + Age + Drums + Bass + Guitar + Keys + Vocals"
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
musicians
Examples
# Attach packages
library(cvms)
library(dplyr)
# Evaluate each fold column
predicted.musicians %>%
dplyr::group_by(Classifier, `Fold Column`) %>%
evaluate(target_col = "Target",
prediction_cols = c("A", "B", "C", "D"),
type = "multinomial")
# Overall ID evaluation
# I.e. if we average all 9 sets of predictions,
# how well did we predict the targets?
overall_id_eval <- predicted.musicians %>%
evaluate(target_col = "Target",
prediction_cols = c("A", "B", "C", "D"),
type = "multinomial",
id_col = "ID")
overall_id_eval
# Plot the confusion matrix
plot_confusion_matrix(overall_id_eval$`Confusion Matrix`[[1]])
Examples of preprocess_fn functions
Description
Examples of preprocess functions that can be used in
cross_validate_fn()
and
validate_fn()
.
They can either be used directly or be starting points.
The examples use recipes
,
but you can also use caret::preProcess()
or
similar functions.
In these examples, the preprocessing will only affect the numeric predictors.
You may prefer to hardcode a formula like "y ~ ."
(where
y
is your dependent variable) as that will allow you to set
'preprocess_once' to TRUE
in cross_validate_fn()
and validate_fn()
and save time.
Usage
preprocess_functions(name)
Arguments
name |
Name of preprocessing function as it appears in the following list: "standardize", "range", "scale", "center".
|
Value
A function with the following form:
function(train_data, test_data, formula, hyperparameters) {
# Preprocess train_data and test_data
# Return a list with the preprocessed datasets
# and optionally a data frame with preprocessing parameters
list(
"train" = train_data,
"test" = test_data,
"parameters" = tidy_parameters
)
}
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other example functions:
model_functions()
,
predict_functions()
,
update_hyperparameters()
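As an illustration, a standardizing preprocess function in this form could be sketched with recipes as follows. This is a minimal sketch, assuming the passed formula contains no inline functions or random effects (see simplify_formula()); standardize_preprocess_fn is a hypothetical name:
# Center and scale the numeric predictors,
# using parameters estimated on the training set only
standardize_preprocess_fn <- function(train_data, test_data,
                                      formula, hyperparameters) {
  rec <- recipes::recipe(formula = formula, data = train_data)
  rec <- recipes::step_center(rec, recipes::all_numeric(), -recipes::all_outcomes())
  rec <- recipes::step_scale(rec, recipes::all_numeric(), -recipes::all_outcomes())
  rec <- recipes::prep(rec, training = train_data)
  # Apply the same transformations to both sets
  list(
    "train" = recipes::bake(rec, new_data = train_data),
    "test" = recipes::bake(rec, new_data = test_data)
  )
}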
A set of process information object constructors
Description
Classes for storing process information from prediction evaluations.
Used internally.
Usage
process_info_binomial(
data,
target_col,
prediction_cols,
id_col,
cat_levels,
positive,
cutoff,
locale = NULL
)
## S3 method for class 'process_info_binomial'
print(x, ...)
## S3 method for class 'process_info_binomial'
as.character(x, ...)
process_info_multinomial(
data,
target_col,
prediction_cols,
pred_class_col,
id_col,
cat_levels,
apply_softmax,
locale = NULL
)
## S3 method for class 'process_info_multinomial'
print(x, ...)
## S3 method for class 'process_info_multinomial'
as.character(x, ...)
process_info_gaussian(data, target_col, prediction_cols, id_col, locale = NULL)
## S3 method for class 'process_info_gaussian'
print(x, ...)
## S3 method for class 'process_info_gaussian'
as.character(x, ...)
Arguments
data |
Data frame. |
target_col |
Name of target column. |
prediction_cols |
Names of prediction columns. |
id_col |
Name of ID column. |
cat_levels |
Categorical levels (classes). |
positive |
Name of the positive class. |
cutoff |
The cutoff used to get class predictions from probabilities. |
locale |
The locale used when performing the evaluation. Relevant when any sorting has been performed. |
x |
a process info object used to select a method. |
... |
further arguments passed to or from other methods. |
pred_class_col |
Name of predicted classes column. |
apply_softmax |
Whether softmax has been applied. |
Value
List with relevant information.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Reconstruct model formulas from results tibbles
Description
In the (cross-)validation results from functions like
cross_validate()
,
the model formulas have been split into the columns
Dependent
, Fixed
and Random
.
Quickly reconstruct the model formulas from these columns.
Usage
reconstruct_formulas(results, topn = NULL)
Arguments
results |
Must contain at least the columns Dependent and Fixed. For random effects, the Random column should also be included. |
topn |
Number of top rows to return. Simply applies head() to the results tibble. |
Value
list
of model formulas.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
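As an illustration, assuming results holds the output of cross_validate() with the best models ordered first:
# Reconstruct the formulas of the three best models
reconstruct_formulas(results, topn = 3)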
Render Table of Contents
Description
From: https://gist.github.com/gadenbuie/c83e078bf8c81b035e32c3fc0cf04ee8
Usage
render_toc(
filename,
toc_header_name = "Table of Contents",
base_level = NULL,
toc_depth = 3
)
Arguments
filename |
Name of RMarkdown or Markdown document |
toc_header_name |
The table of contents header name. If specified, any header with this format will not be included in the TOC. Set to 'NULL' to include the TOC itself in the TOC (but why?). |
base_level |
Starting level of the lowest header level. Any headers prior to the first header at the base_level are dropped silently. |
toc_depth |
Maximum depth for TOC, relative to base_level. Default is 'toc_depth = 3', which results in a TOC of at most 3 levels. |
Details
A simple function to extract headers from an RMarkdown or Markdown document and build a table of contents. Returns a markdown list with links to the headers using [pandoc header identifiers](http://pandoc.org/MANUAL.html#header-identifiers).
WARNING: This function only works with hash-tag headers.
Because this function returns only the markdown list, the header for the Table of Contents itself must be manually included in the text. Use 'toc_header_name' to exclude the table of contents header from the TOC, or set to 'NULL' for it to be included.
Usage
Just drop in a chunk where you want the toc to appear (set 'echo=FALSE'):
render_toc("/path/to/the/file.Rmd")
Select model definition columns
Description
Select the columns that define the models, such as the formula terms and hyperparameters.
If an expected column is not in the `results`
tibble
, it is simply ignored.
Usage
select_definitions(results, unnest_hparams = TRUE, additional_includes = NULL)
Arguments
results |
Results tibble, e.g. from cross_validate() or validate(). |
unnest_hparams |
Whether to unnest the HParams column. (Logical) |
additional_includes |
Names of additional columns to select. (Character) |
Value
The model definition columns from the results tibble
.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
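As an illustration, assuming results holds the output of cross_validate() and contains a Family column:
# Select the model definition columns
select_definitions(results)
# Keep the hyperparameters nested and also select the Family column
select_definitions(results, unnest_hparams = FALSE, additional_includes = "Family")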
Select columns with evaluation metrics and model definitions
Description
When reporting results, we might not want all
the nested tibble
s and process information columns.
This function selects the evaluation metrics and model formulas only.
If an expected column is not in the `results`
tibble
, it is simply ignored.
Usage
select_metrics(results, include_definitions = TRUE, additional_includes = NULL)
Arguments
results |
Results tibble, e.g. from cross_validate() or validate(). |
include_definitions |
Whether to include the model definition columns (Dependent, Fixed, and, when present, Random and HParams). (Logical) |
additional_includes |
Names of additional columns to select. (Character) |
Value
The results tibble
with only the metric and model definition columns.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
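As an illustration, assuming results holds the output of cross_validate():
# Select the metrics and model definitions
select_metrics(results)
# Select the metrics only
select_metrics(results, include_definitions = FALSE)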
Simplify formula with inline functions
Description
Extracts all variables from a formula object and creates a new formula with all predictor variables added together without the inline functions.
E.g.:
y ~ x*z + log(a) + (1|b)
becomes
y ~ x + z + a + b
.
This is useful when passing a formula to recipes::recipe()
for preprocessing a dataset, as used in the
preprocess_functions()
.
Usage
simplify_formula(formula, data = NULL, string_out = FALSE)
Arguments
formula |
Formula object. If a string is passed, it will be converted with stats::as.formula(). When a side of the formula only contains a NULL, it is kept as-is. An intercept (1) will only be kept if there are no other terms on that side of the formula. |
data |
data.frame. Used to extract the variables when the formula contains a ".". |
string_out |
Whether to return as a string. (Logical) |
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Examples
# Attach cvms
library(cvms)
# Create formula
f1 <- "y ~ x*z + log(a) + (1|b)"
# Simplify formula (as string)
simplify_formula(f1)
# Simplify formula (as formula)
simplify_formula(as.formula(f1))
Create a list of settings for the sum tiles in plot_confusion_matrix()
Description
Creates a list of settings for plotting the column/row sums
in plot_confusion_matrix()
.
The `tc_`
in the arguments refers to the total count tile.
NOTE: This is very experimental and will likely change.
Usage
sum_tile_settings(
palette = NULL,
label = NULL,
tile_fill = NULL,
font_counts_color = NULL,
font_normalized_color = NULL,
dynamic_font_colors = dynamic_font_color_settings(),
tile_border_color = NULL,
tile_border_size = NULL,
tile_border_linetype = NULL,
tc_tile_fill = NULL,
tc_font_color = NULL,
tc_tile_border_color = NULL,
tc_tile_border_size = NULL,
tc_tile_border_linetype = NULL,
intensity_by = NULL,
intensity_lims = NULL,
intensity_beyond_lims = NULL,
font_color = deprecated()
)
Arguments
palette |
Color scheme to use for sum tiles.
Should be different from the palette used for the regular tiles.
Passed directly to palette in ggplot2::scale_fill_distiller().
Try these palettes: "Greens", "Oranges", "Greys", "Purples", "Reds", and "Blues".
Alternatively, pass a named list with limits of a custom gradient, e.g. list("low" = "#e9e1fc", "high" = "#BE94E6").
Note: When tile_fill is specified, the palette is ignored. |
label |
The label to use for the sum column and the sum row. |
font_counts_color , font_normalized_color |
Color of the text in the tiles with the column and row sums.
Either a color value used directly or NULL to use the default font settings. |
dynamic_font_colors |
A list of settings for using dynamic font colors
based on the value of the counts/normalized. Allows changing the font colors
when the background tiles are too dark, etc.
Can be provided with dynamic_font_color_settings(). Individual thresholds can be set for the different fonts/values via its arguments.
Specifying colors for specific fonts overrides the "all" values for those fonts. |
tc_tile_fill , tile_fill |
Specific background color for the tiles. Passed as fill to ggplot2::geom_tile(). If specified, the palette is ignored. |
tc_font_color |
Color of the text in the total count tile. |
tc_tile_border_color , tile_border_color |
Color of the tile borders. Passed as colour to ggplot2::geom_tile(). |
tc_tile_border_size , tile_border_size |
Size of the tile borders. Passed as size to ggplot2::geom_tile(). |
tc_tile_border_linetype , tile_border_linetype |
Linetype for the tile borders. Passed as linetype to ggplot2::geom_tile(). |
intensity_by |
The measure that should control the color intensity of the tiles.
Either "counts", "normalized", "row_percentages", "col_percentages", or one of the transformed count versions: "log counts", "log2 counts", "log10 counts", and "arcsinh counts". For 'normalized', 'row_percentages', and 'col_percentages', the color limits
become 0-100, making the intensities comparable across plots. Note: When intensity_lims is specified, the limits are taken from there instead. For the 'log*' and 'arcsinh' versions, the log/arcsinh transformed counts are used. Note: In 'log*' transformed counts, 0-counts are set to '0', which is why they won't be distinguishable from 1-counts. |
intensity_lims |
A specific range of values for the color intensity of
the tiles. Given as a numeric vector with the minimum and maximum value. This allows having the same intensity scale across plots for better comparison of prediction sets. |
intensity_beyond_lims |
What to do with values beyond the
intensity_lims. One of "truncate" and "grey". |
font_color |
Deprecated. |
Value
List of settings.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other plotting functions:
dynamic_font_color_settings()
,
font()
,
plot_confusion_matrix()
,
plot_metric_density()
,
plot_probabilities()
,
plot_probabilities_ecdf()
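As an illustration, the settings can be passed to plot_confusion_matrix(). This is a minimal sketch that assumes the add_sums and sums_settings arguments of that function:
# Create a confusion matrix and plot it with column/row sums,
# using a separate palette for the sum tiles
conf_mat <- confusion_matrix(
  targets = c(0, 1, 0, 1, 0, 1, 1, 0),
  predictions = c(0, 1, 1, 1, 0, 0, 1, 0)
)
plot_confusion_matrix(
  conf_mat$`Confusion Matrix`[[1]],
  add_sums = TRUE,
  sums_settings = sum_tile_settings(palette = "Oranges")
)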
Summarize metrics with common descriptors
Description
Summarizes all numeric columns. Counts the NA
s and Inf
s in the columns.
Usage
summarize_metrics(data, cols = NULL, na.rm = TRUE, inf.rm = TRUE)
Arguments
data |
data.frame with columns to summarize. |
cols |
Names of columns to summarize. Non-numeric columns are ignored. (Character) |
na.rm |
Whether to remove NAs before summarizing. (Logical) |
inf.rm |
Whether to remove Infs before summarizing. (Logical) |
Value
tibble
where each row is a descriptor of the column.
The Measure column contains the name of the descriptor.
The NAs row is a count of the NA
s in the column.
The INFs row is a count of the Inf
s in the column.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Examples
# Attach packages
library(cvms)
library(dplyr)
df <- data.frame("a" = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
"b" = c(0.8, 0.6, 0.3, 0.2, 0.4, 0.5, 0.8, 0.1, 0.5),
"c" = c(0.2, 0.3, 0.4, 0.6, 0.5, 0.8, 0.1, 0.8, 0.3))
# Summarize all numeric columns
summarize_metrics(df)
# Summarize column "b"
summarize_metrics(df, cols = "b")
Check and update hyperparameters
Description
Checks that the required hyperparameters are present and throws an error when they are missing.
Inserts the missing hyperparameters with the supplied default values.
For managing hyperparameters in custom model functions for
cross_validate_fn()
or
validate_fn()
.
Usage
update_hyperparameters(..., hyperparameters, .required = NULL)
Arguments
... |
Default values for missing hyperparameters. E.g. kernel = "radial", cost = 1.
|
hyperparameters |
list of hyperparameters, as passed to the model function by cross_validate_fn() or validate_fn(). |
.required |
Names of required hyperparameters. If any of these
are not present in the hyperparameters, an error is thrown. |
Value
A named list
with the updated hyperparameters.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other example functions:
model_functions()
,
predict_functions()
,
preprocess_functions()
Examples
# Attach packages
library(cvms)
# Create a list of hyperparameters
hparams <- list(
"kernel" = "radial",
"scale" = TRUE
)
# Update hyperparameters with defaults
# Only 'cost' is changed as it's missing
update_hyperparameters(
cost = 10,
kernel = "linear",
"scale" = FALSE,
hyperparameters = hparams
)
# 'cost' is required
# throws error
if (requireNamespace("xpectr", quietly = TRUE)){
xpectr::capture_side_effects(
update_hyperparameters(
kernel = "linear",
"scale" = FALSE,
hyperparameters = hparams,
.required = "cost"
)
)
}
Validate regression models on a test set
Description
Train linear or logistic regression models on a training set and validate them by
predicting a test/validation set.
Returns results in a tibble
for easy reporting, along with the trained models.
See validate_fn()
for use
with custom model functions.
Usage
validate(
train_data,
formulas,
family,
test_data = NULL,
partitions_col = ".partitions",
control = NULL,
REML = FALSE,
cutoff = 0.5,
positive = 2,
metrics = list(),
preprocessing = NULL,
err_nc = FALSE,
rm_nc = FALSE,
parallel = FALSE,
verbose = FALSE,
link = deprecated(),
models = deprecated(),
model_verbose = deprecated()
)
Arguments
train_data |
Can contain a grouping factor for identifying partitions - as made with groupdata2::partition(). |
formulas |
Model formulas as strings. (Character) E.g. c("y ~ x", "y ~ x + z"). Can contain random effects, e.g. "y ~ x + (1|r)". |
family |
Name of the family. (Character) Currently supports "gaussian" and "binomial". See validate_fn() for other model types. |
test_data |
data.frame with the test set. Can be NULL, in which case partitions_col in train_data is used to split into train and test sets. |
partitions_col |
Name of grouping factor for identifying partitions. (Character) Rows with the value 1 in partitions_col are used as training set and rows with the value 2 are used as test set. N.B. Only used if test_data is NULL. |
control |
Construct control structures for mixed model fitting
(with lme4::lmer() and lme4::glmer()). See lme4::lmerControl and lme4::glmerControl. |
REML |
Restricted Maximum Likelihood. (Logical) | |||||||||||
cutoff |
Threshold for predicted classes. (Numeric) N.B. Binomial models only | |||||||||||
positive |
Level from dependent variable to predict.
Either as character (preferable) or level index (1 or 2 - alphabetically).
E.g. if we have the levels "cat" and "dog" and we want "dog" to be the positive class, we can either provide "dog" or 2, as alphabetically, "dog" comes after "cat".
Note: For reproducibility, it's preferable to specify the name directly, as
different locales may sort the levels differently.
Used when calculating confusion matrix metrics and creating ROC curves.
The Process column in the output can be used to verify this setting.
N.B. Only affects evaluation metrics, not the model training or returned predictions.
N.B. Binomial models only. |
metrics |
list for enabling/disabling metrics. E.g. list("RMSE" = FALSE) would remove RMSE from the results, and list("Accuracy" = TRUE) would add the regular Accuracy metric to the classification results. Default values (TRUE/FALSE) will be used for the remaining available metrics.
You can enable/disable all metrics at once by including
"all" = TRUE/FALSE in the list. This is done prior to enabling/disabling individual metrics, which is why, for instance, list("all" = FALSE, "RMSE" = TRUE) would return only the RMSE metric.
The list can be created with gaussian_metrics() or binomial_metrics().
Also accepts the string "all". |
preprocessing |
Name of preprocessing to apply. Available preprocessings are: "standardize", "range", "scale", and "center".
The preprocessing parameters (mean, SD, etc.) are extracted from the training set, applied to both the training and test sets, and returned in a nested tibble.
N.B. The preprocessings should not affect the results
to a noticeable degree, although "range" might due to its sensitivity to outliers. |
err_nc |
Whether to raise an error if a model does not converge. (Logical) |
rm_nc |
Remove non-converged models from output. (Logical) | |||||||||||
parallel |
Whether to validate the list of models in parallel. (Logical) Remember to register a parallel backend first.
E.g. with doParallel::registerDoParallel. |
verbose |
Whether to message process information like the number of model instances to fit and which model function was applied. (Logical) | |||||||||||
link , models , model_verbose |
Deprecated. |
Details
Packages used:
Models
Gaussian: stats::lm
, lme4::lmer
Binomial: stats::glm
, lme4::glmer
Results
Shared
AIC
: stats::AIC
AICc
: MuMIn::AICc
BIC
: stats::BIC
Gaussian
r2m
: MuMIn::r.squaredGLMM
r2c
: MuMIn::r.squaredGLMM
Binomial
ROC and AUC
: pROC::roc
Value
tibble
with the results and model objects.
Shared across families
A nested tibble
with coefficients of the models from all iterations.
Count of convergence warnings. Consider discarding models that did not converge.
Count of other warnings. These are warnings without keywords such as "convergence".
Count of Singular Fit messages. See
lme4::isSingular
for more information.
Nested tibble
with the warnings and messages caught for each model.
Specified family.
Nested model objects.
Name of dependent variable.
Names of fixed effects.
Names of random effects, if any.
Nested tibble
with preprocessing parameters, if any.
—————————————————————-
Gaussian Results
—————————————————————-
RMSE
, MAE
, NRMSE(IQR)
,
RRSE
, RAE
, RMSLE
,
AIC
, AICc
, and BIC
.
See the additional metrics (disabled by default) at ?gaussian_metrics
.
A nested tibble
with the predictions and targets.
—————————————————————-
Binomial Results
—————————————————————-
Based on predictions of the test set,
a confusion matrix and ROC
curve are used to get the following:
ROC
:
AUC
, Lower CI
, and Upper CI
.
Confusion Matrix
:
Balanced Accuracy
,
F1
,
Sensitivity
,
Specificity
,
Positive Predictive Value
,
Negative Predictive Value
,
Kappa
,
Detection Rate
,
Detection Prevalence
,
Prevalence
, and
MCC
(Matthews correlation coefficient).
See the additional metrics (disabled by default) at
?binomial_metrics
.
Also includes:
A nested tibble
with predictions, predicted classes (depends on cutoff
), and the targets.
Note that the predictions are not necessarily of the specified positive
class, but of
the model's positive class (second level of dependent variable, alphabetically).
The pROC::roc
ROC
curve object(s).
A nested tibble
with the confusion matrix/matrices.
The Pos_
columns tell you whether a row is a
True Positive (TP
), True Negative (TN
),
False Positive (FP
), or False Negative (FN
),
depending on which level is the "positive" class. I.e. the level you wish to predict.
The name of the Positive Class.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other validation functions:
cross_validate()
,
cross_validate_fn()
,
validate_fn()
Examples
# Attach packages
library(cvms)
library(groupdata2) # partition()
library(dplyr) # %>% arrange()
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(7)
# Partition data
# Keep as single data frame
# We could also have fed validate() separate train and test sets.
data_partitioned <- partition(
data,
p = 0.7,
cat_col = "diagnosis",
id_col = "participant",
list_out = FALSE
) %>%
arrange(.partitions)
# Validate a model
# Gaussian
validate(
data_partitioned,
formulas = "score~diagnosis",
partitions_col = ".partitions",
family = "gaussian",
REML = FALSE
)
# Binomial
validate(data_partitioned,
formulas = "diagnosis~score",
partitions_col = ".partitions",
family = "binomial"
)
## Feed separate train and test sets
# Partition data to list of data frames
# The first data frame will be train (70% of the data)
# The second will be test (30% of the data)
data_partitioned <- partition(
data,
p = 0.7,
cat_col = "diagnosis",
id_col = "participant",
list_out = TRUE
)
train_data <- data_partitioned[[1]]
test_data <- data_partitioned[[2]]
# Validate a model
# Gaussian
validate(
train_data,
test_data = test_data,
formulas = "score~diagnosis",
family = "gaussian",
REML = FALSE
)
Validate a custom model function on a test set
Description
Fit your model function on a training set and validate it by
predicting a test/validation set.
Validate different hyperparameter combinations and formulas at once.
Preprocess the train/test split.
Returns results and fitted models in a tibble
for easy reporting and further analysis.
Compared to validate()
,
this function allows you to supply a custom model function, a predict function,
a preprocess function and the hyperparameter values to validate.
Supports regression and classification (binary and multiclass).
See `type`
.
Note that some metrics may not be computable for some types of model objects.
Usage
validate_fn(
train_data,
formulas,
type,
model_fn,
predict_fn,
test_data = NULL,
preprocess_fn = NULL,
preprocess_once = FALSE,
hyperparameters = NULL,
partitions_col = ".partitions",
cutoff = 0.5,
positive = 2,
metrics = list(),
rm_nc = FALSE,
parallel = FALSE,
verbose = TRUE
)
Arguments
train_data |
Can contain a grouping factor for identifying partitions - as made with groupdata2::partition(). |
formulas |
Model formulas as strings. (Character) Will be converted to formula objects. E.g. c("y ~ x", "y ~ x + z"). Can contain random effects, e.g. "y ~ x + (1|r)", if supported by the model function. |
type |
Type of evaluation to perform:
"gaussian" for regression (like linear regression), "binomial" for binary classification, and "multinomial" for multiclass classification. |
model_fn |
Model function that returns a fitted model object.
Will usually wrap an existing model function like e1071::svm or nnet::multinom.
Must have the following function arguments:
function(train_data, formula, hyperparameters) |
predict_fn |
Function for predicting the targets in the test folds/sets using the fitted model object.
Will usually wrap stats::predict(), but doesn't have to.
Must have the following function arguments:
function(test_data, model, formula, hyperparameters, train_data)
Must return predictions in the following formats, depending on type:
Binomial: a vector or one-column matrix/data.frame with the probabilities (0-1) of the second class, alphabetically.
N.B. When unsure whether a model type produces probabilities based off the alphabetic order of your classes, using 0 and 1 as classes in the dependent variable instead of the class names should increase the chance of getting probabilities of the right class.
Gaussian: a vector or one-column matrix/data.frame with the predicted values.
Multinomial: a data.frame with one column per class containing the probabilities of that class. Column names should match the class names, and each row should sum to 1. |
test_data |
data.frame with the test set. Can be NULL, in which case partitions_col in train_data is used to split into train and test sets. |
preprocess_fn |
Function for preprocessing the training and test sets.
Can, for instance, be used to standardize both the training and test sets with the scaling and centering parameters from the training set.
Must have the following function arguments:
function(train_data, test_data, formula, hyperparameters)
Must return a named list with the preprocessed "train" and "test" datasets and, optionally, a "parameters" element with a data.frame/tibble of the applied preprocessing parameters.
Additional elements in the returned list are currently ignored.
The optional parameters tibble is included in the output, making it easy to inspect the applied preprocessing.
N.B. When preprocess_once is TRUE, the formula and hyperparameters arguments are ignored (see preprocess_once). |
preprocess_once |
Whether to apply the preprocessing once
(ignoring the formula and hyperparameters arguments in preprocess_fn) or for each model separately. (Logical)
When preprocessing does not depend on the current formula or hyperparameters, we can do the preprocessing of each train/test split once, to save time. This may require holding a lot more data in memory though, which is why it is not the default setting. |
hyperparameters |
Either a named list with hyperparameter values to combine in a grid search, or a data.frame with one row per hyperparameter combination to validate.
Named list for grid search: Add ".n" to sample the combinations. Can be the number of combinations to use, or a percentage between 0 and 1.
E.g. list("lrn_rate" = c(0.1, 0.01), "h_layers" = c(10, 100, 1000), "drop_out" = c(0.63, 0.65), ".n" = 10) could give a set of sampled combinations like:
lrn_rate | h_layers | drop_out |
0.1 | 10 | 0.65 |
0.1 | 1000 | 0.65 |
0.01 | 1000 | 0.63 |
... | ... | ... |
partitions_col
Name of grouping factor for identifying partitions. (Character)
Rows with the value 1
in `partitions_col`
are used as training set and
rows with the value 2
are used as test set.
N.B. Only used if `test_data`
is NULL
.
cutoff
Threshold for predicted classes. (Numeric)
N.B. Binomial models only
positive
Level from dependent variable to predict.
Either as character (preferable) or level index (1
or 2
- alphabetically).
E.g. if we have the levels "cat"
and "dog"
and we want "dog"
to be the positive class,
we can either provide "dog"
or 2
, as alphabetically, "dog"
comes after "cat"
.
Note: For reproducibility, it's preferable to specify the name directly, as
different locales
may sort the levels differently.
Used when calculating confusion matrix metrics and creating ROC
curves.
The Process
column in the output can be used to verify this setting.
N.B. Only affects evaluation metrics, not the model training or returned predictions.
N.B. Binomial models only.
metrics
list
for enabling/disabling metrics.
E.g. list("RMSE" = FALSE)
would remove RMSE
from the regression results,
and list("Accuracy" = TRUE)
would add the regular Accuracy
metric
to the classification results.
Default values (TRUE
/FALSE
) will be used for the remaining available metrics.
You can enable/disable all metrics at once by including
"all" = TRUE/FALSE
in the list
. This is done prior to enabling/disabling
individual metrics, which is why, for instance, list("all" = FALSE, "RMSE" = TRUE)
would return only the RMSE
metric.
The list
can be created with
gaussian_metrics()
,
binomial_metrics()
, or
multinomial_metrics()
.
Also accepts the string "all"
.
rm_nc
Remove non-converged models from output. (Logical)
parallel
Whether to cross-validate the list
of models in parallel. (Logical)
Remember to register a parallel backend first.
E.g. with doParallel::registerDoParallel
.
verbose
Whether to message process information like the number of model instances to fit. (Logical)
Details
Packages used:
Results
Shared
AIC : stats::AIC
AICc : MuMIn::AICc
BIC : stats::BIC
Gaussian
r2m : MuMIn::r.squaredGLMM
r2c : MuMIn::r.squaredGLMM
Binomial and Multinomial
ROC and related metrics:
Binomial: pROC::roc
Multinomial: pROC::multiclass.roc
Value
tibble
with the results and model objects.
Shared across families
A nested tibble
with coefficients of the models. The coefficients
are extracted from the model object with parameters::model_parameters()
or
coef()
(with some restrictions on the output).
If these attempts fail, a default coefficients tibble
filled with NA
s is returned.
Nested tibble
with the used preprocessing parameters,
if a passed `preprocess_fn`
returns the parameters in a tibble
.
Count of convergence warnings, using a limited set of keywords (e.g. "convergence"). If a convergence warning does not contain one of these keywords, it will be counted with other warnings. Consider discarding models that did not converge on all iterations. Note: you might still see results, but these should be taken with a grain of salt!
Nested tibble
with the warnings and messages caught for each model.
Specified family.
Nested model objects.
Name of dependent variable.
Names of fixed effects.
Names of random effects, if any.
—————————————————————-
Gaussian Results
—————————————————————-
RMSE
, MAE
, NRMSE(IQR)
,
RRSE
, RAE
, and RMSLE
.
See the additional metrics (disabled by default) at ?gaussian_metrics
.
A nested tibble
with the predictions and targets.
—————————————————————-
Binomial Results
—————————————————————-
Based on predictions of the test set,
a confusion matrix and a ROC
curve are created to get the following:
ROC
:
AUC
, Lower CI
, and Upper CI
Confusion Matrix
:
Balanced Accuracy
,
F1
,
Sensitivity
,
Specificity
,
Positive Predictive Value
,
Negative Predictive Value
,
Kappa
,
Detection Rate
,
Detection Prevalence
,
Prevalence
, and
MCC
(Matthews correlation coefficient).
See the additional metrics (disabled by default) at
?binomial_metrics
.
Also includes:
A nested tibble
with predictions, predicted classes (depends on cutoff
), and the targets.
Note that the predictions are not necessarily of the specified positive
class, but of
the model's positive class (second level of dependent variable, alphabetically).
The pROC::roc
ROC
curve object(s).
A nested tibble
with the confusion matrix/matrices.
The Pos_
columns tell you whether a row is a
True Positive (TP
), True Negative (TN
),
False Positive (FP
), or False Negative (FN
),
depending on which level is the "positive" class. I.e. the level you wish to predict.
The name of the Positive Class.
—————————————————————-
Multinomial Results
—————————————————————-
For each class, a one-vs-all binomial evaluation is performed. This creates
a Class Level Results tibble
containing the same metrics as the binomial results
described above (excluding MCC
, AUC
, Lower CI
and Upper CI
),
along with a count of the class in the target column (Support
).
These metrics are used to calculate the macro-averaged metrics.
The nested class level results tibble
is also included in the output tibble
,
and could be reported along with the macro and overall metrics.
The output tibble
contains the macro and overall metrics.
The metrics that share their name with the metrics in the nested
class level results tibble
are averages of those metrics
(note: does not remove NA
s before averaging).
In addition to these, it also includes the Overall Accuracy
and
the multiclass MCC
.
Note: Balanced Accuracy
is the macro-averaged metric,
not the macro sensitivity as sometimes used!
Other available metrics (disabled by default, see metrics
):
Accuracy
,
multiclass AUC
,
Weighted Balanced Accuracy
,
Weighted Accuracy
,
Weighted F1
,
Weighted Sensitivity
,
Weighted Specificity
,
Weighted Pos Pred Value
,
Weighted Neg Pred Value
,
Weighted Kappa
,
Weighted Detection Rate
,
Weighted Detection Prevalence
, and
Weighted Prevalence
.
Note that the "Weighted" average metrics are weighted by the Support
.
Also includes:
A nested tibble
with the predictions, predicted classes, and targets.
A list of ROC curve objects when AUC
is enabled.
A nested tibble
with the multiclass Confusion Matrix.
Class Level Results
Besides the binomial evaluation metrics and the Support
,
the nested class level results tibble
also contains a
nested tibble
with the Confusion Matrix from the one-vs-all evaluation.
The Pos_
columns tell you whether a row is a
True Positive (TP
), True Negative (TN
),
False Positive (FP
), or False Negative (FN
),
depending on which level is the "positive" class. In our case, 1
is the current class
and 0
represents all the other classes together.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other validation functions:
cross_validate()
,
cross_validate_fn()
,
validate()
Examples
# Attach packages
library(cvms)
library(groupdata2) # partition()
library(dplyr) # %>% arrange() mutate()
# Note: More examples of custom functions can be found at:
# model_fn: model_functions()
# predict_fn: predict_functions()
# preprocess_fn: preprocess_functions()
# Data is part of cvms
data <- participant.scores
# Set seed for reproducibility
set.seed(7)
# Partition data
data <- partition(
data,
p = 0.8,
cat_col = "diagnosis",
id_col = "participant",
list_out = FALSE
) %>%
mutate(diagnosis = as.factor(diagnosis)) %>%
arrange(.partitions)
# Formulas to validate
formula_gaussian <- "score ~ diagnosis"
formula_binomial <- "diagnosis ~ score"
#
# Gaussian
#
# Create model function that returns a fitted model object
lm_model_fn <- function(train_data, formula, hyperparameters) {
lm(formula = formula, data = train_data)
}
# Create predict function that returns the predictions
lm_predict_fn <- function(test_data, model, formula,
hyperparameters, train_data) {
stats::predict(
object = model,
newdata = test_data,
type = "response",
allow.new.levels = TRUE
)
}
# Validate the model function
v <- validate_fn(
data,
formulas = formula_gaussian,
type = "gaussian",
model_fn = lm_model_fn,
predict_fn = lm_predict_fn,
partitions_col = ".partitions"
)
v
# Extract model object
v$Model[[1]]
#
# Binomial
#
# Create model function that returns a fitted model object
glm_model_fn <- function(train_data, formula, hyperparameters) {
glm(formula = formula, data = train_data, family = "binomial")
}
# Create predict function that returns the predictions
glm_predict_fn <- function(test_data, model, formula,
hyperparameters, train_data) {
stats::predict(
object = model,
newdata = test_data,
type = "response",
allow.new.levels = TRUE
)
}
# Validate the model function
validate_fn(
data,
formulas = formula_binomial,
type = "binomial",
model_fn = glm_model_fn,
predict_fn = glm_predict_fn,
partitions_col = ".partitions"
)
#
# Support Vector Machine (svm)
# with known hyperparameters
#
# Only run if the `e1071` package is installed
if (requireNamespace("e1071", quietly = TRUE)){
# Create model function that returns a fitted model object
# We use the hyperparameters arg to pass in the kernel and cost values
# These will usually have been found with cross_validate_fn()
svm_model_fn <- function(train_data, formula, hyperparameters) {
# Expected hyperparameters:
# - kernel
# - cost
if (!"kernel" %in% names(hyperparameters))
stop("'hyperparameters' must include 'kernel'")
if (!"cost" %in% names(hyperparameters))
stop("'hyperparameters' must include 'cost'")
e1071::svm(
formula = formula,
data = train_data,
kernel = hyperparameters[["kernel"]],
cost = hyperparameters[["cost"]],
scale = FALSE,
type = "C-classification",
probability = TRUE
)
}
# Create predict function that returns the predictions
svm_predict_fn <- function(test_data, model, formula,
hyperparameters, train_data) {
predictions <- stats::predict(
object = model,
newdata = test_data,
allow.new.levels = TRUE,
probability = TRUE
)
# Extract probabilities
probabilities <- dplyr::as_tibble(
attr(predictions, "probabilities")
)
# Return second column
probabilities[[2]]
}
# Specify hyperparameters to use
# We found these in the examples in ?cross_validate_fn()
svm_hparams <- list(
"kernel" = "linear",
"cost" = 10
)
# Validate the model function
validate_fn(
data,
formulas = formula_binomial,
type = "binomial",
model_fn = svm_model_fn,
predict_fn = svm_predict_fn,
hyperparameters = svm_hparams,
partitions_col = ".partitions"
)
} # closes `e1071` package check
Wine varieties
Description
A list of wine varieties in an approximately Zipfian distribution, ordered by descending frequencies.
Format
A data.frame
with 368
rows and 1
variable:
- Variety
Wine variety, 10 levels
Details
Based on the wine-reviews (v4) kaggle dataset by Zack Thoutt: https://www.kaggle.com/zynicide/wine-reviews
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk