Measures
Helper functions
MLJBase.measures
— Method
measures()
List all measures as named-tuples keyed on measure traits.
measures(conditions...)
List all measures satisfying the specified conditions. A condition is any Bool-valued function on the named-tuples.
Example
Find all classification measures supporting sample weights:
measures(m -> m.target_scitype <: AbstractVector{<:Finite} &&
m.supports_weights)
MLJBase.metadata_measure
— Method
metadata_measure(T; kw...)
Helper function to write the metadata for a single measure.
Continuous loss functions
MLJBase.l1
— Constant
l1(ŷ, y)
l1(ŷ, y, w)
L1 per-observation loss.
For more information, run info(l1).
MLJBase.l2
— Constant
l2(ŷ, y)
l2(ŷ, y, w)
L2 per-observation loss.
For more information, run info(l2).
MLJBase.mae
— Constant
mae(ŷ, y)
mae(ŷ, y, w)
Mean absolute error.
$\text{MAE} = n^{-1}∑ᵢ|yᵢ-ŷᵢ|$ or $\text{MAE} = n^{-1}∑ᵢwᵢ|yᵢ-ŷᵢ|$
For more information, run info(mae).
MLJBase.mape
— Constant
MAPE(; tol=eps())
Mean Absolute Proportional Error:
$\text{MAPE} = m^{-1}∑ᵢ|{(yᵢ-ŷᵢ) \over yᵢ}|$ where the sum is over indices such that yᵢ > tol and m is the number of such indices.
For more information, run info(mape).
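The formula can be checked by hand in plain Julia. The following is an illustrative sketch only (the helper name mape_sketch is invented here; this is not the MLJBase implementation):

```julia
# Hand-rolled MAPE following the formula above (illustrative only).
function mape_sketch(ŷ, y; tol=eps())
    idx = findall(yᵢ -> yᵢ > tol, y)   # keep indices with yᵢ > tol
    m = length(idx)
    return sum(abs((y[i] - ŷ[i]) / y[i]) for i in idx) / m
end

ŷ = [1.0, 2.0, 4.0]
y = [1.0, 4.0, 5.0]
mape_sketch(ŷ, y)   # (0 + 0.5 + 0.2)/3 ≈ 0.2333
```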
MLJBase.rms
— Constant
rms(ŷ, y)
rms(ŷ, y, w)
Root mean squared error:
$\text{RMS} = \sqrt{n^{-1}∑ᵢ|yᵢ-ŷᵢ|^2}$ or $\text{RMS} = \sqrt{\frac{∑ᵢwᵢ|yᵢ-ŷᵢ|^2}{∑ᵢwᵢ}}$
For more information, run info(rms).
MLJBase.rmsl
— Constant
rmsl(ŷ, y)
Root mean squared logarithmic error:
$\text{RMSL} = \sqrt{n^{-1}∑ᵢ\log\left({yᵢ \over ŷᵢ}\right)^2}$
For more information, run info(rmsl).
See also rmslp1.
MLJBase.rmslp1
— Constant
rmslp1(ŷ, y)
Root mean squared logarithmic error with an offset of 1:
$\text{RMSLP1} = \sqrt{n^{-1}∑ᵢ\log\left({yᵢ + 1 \over ŷᵢ + 1}\right)^2}$
For more information, run info(rmslp1).
See also rmsl.
MLJBase.rmsp
— Constant
rmsp(ŷ, y)
Root mean squared proportional loss:
$\text{RMSP} = \sqrt{m^{-1}∑ᵢ \left({yᵢ-ŷᵢ \over yᵢ}\right)^2}$ where the sum is over indices such that yᵢ ≠ 0 and m is the number of such indices.
For more information, run info(rmsp).
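The root-mean-squared variants above can be sketched in a few lines of plain Julia. Names ending in _sketch are invented for illustration; these are not the MLJBase implementations:

```julia
# Illustrative definitions matching the formulas above.
rms_sketch(ŷ, y)    = sqrt(sum(abs2, y .- ŷ) / length(y))
rmsl_sketch(ŷ, y)   = sqrt(sum(abs2, log.(y ./ ŷ)) / length(y))
rmslp1_sketch(ŷ, y) = sqrt(sum(abs2, log.((y .+ 1) ./ (ŷ .+ 1))) / length(y))

ŷ = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 5.0]
rms_sketch(ŷ, y)    # sqrt(4/3) ≈ 1.1547
rmsl_sketch(ŷ, y)   # sqrt(log(5/3)^2 / 3) ≈ 0.2949
```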
Confusion matrix
MLJBase.confusion_matrix
— Method
confusion_matrix(ŷ, y; rev=false)
Computes the confusion matrix given predictions ŷ with categorical elements and actual observations y. Rows are the predicted class, columns the ground truth. The ordering follows that of levels(y).
Keywords
rev=false: in the binary case, allows the ordering of the two classes to be swapped.
perm=[]: in the general case, specifies a permutation re-ordering the classes.
warn=true: whether to show a warning in case y does not have scientific type OrderedFactor{2} (see note below).
Note
To decrease the risk of unexpected errors, if y does not have scientific type OrderedFactor{2} (and so does not have a "natural" negative-positive ordering), a warning is shown indicating the current order, unless the user explicitly specifies either rev or perm, in which case it is assumed the user is aware of the class ordering.
The confusion_matrix is a measure (although neither a score nor a loss) and so may be specified as such in calls to evaluate and evaluate!, although not in TunedModels. In this case, however, there is no way to specify an ordering different from levels(y), where y is the target.
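The row/column convention can be illustrated with a plain-Julia sketch of the counting (confusion_sketch is a made-up name; the real measure is confusion_matrix above):

```julia
# Rows are the predicted class, columns the ground truth,
# both ordered as in `levels` (illustrative sketch only).
function confusion_sketch(ŷ, y, levels)
    C = length(levels)
    cm = zeros(Int, C, C)
    at(x) = findfirst(==(x), levels)
    for (p, t) in zip(ŷ, y)
        cm[at(p), at(t)] += 1
    end
    return cm
end

ŷ = ["pos", "neg", "pos", "pos"]
y = ["pos", "neg", "neg", "pos"]
confusion_sketch(ŷ, y, ["neg", "pos"])   # [1 0; 1 2]
```

The single off-diagonal count sits in row "pos", column "neg": a "neg" observation predicted as "pos", i.e. a false positive under this ordering.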
MLJBase.ConfusionMatrix
— Type
ConfusionMatrix{C}
Confusion matrix with C ≥ 2 classes. Rows correspond to predicted values and columns to the ground truth.
MLJBase.ConfusionMatrix
— Method
ConfusionMatrix(m, labels)
Instantiates a confusion matrix out of a square integer matrix m. Rows are the predicted class, columns the ground truth. See also the Wikipedia article.
Finite loss functions
MLJBase.accuracy
— Constant
accuracy
Classification accuracy; aliases: accuracy.
accuracy(ŷ, y)
accuracy(ŷ, y, w)
accuracy(conf_mat)
Returns the accuracy of the (point) predictions ŷ, given true observations y, optionally weighted by the weights w. All three arguments must be abstract vectors of the same length. This metric is invariant to class labelling and can be used for multiclass classification.
For more information, run info(accuracy).
MLJBase.area_under_curve
— Constant
area_under_curve
Area under the ROC curve; aliases: area_under_curve, auc.
area_under_curve(ŷ, y)
Return the area under the receiver operating characteristic (curve), for probabilistic predictions ŷ, given ground truth y. This metric is invariant to class labelling and can be used only for binary classification.
For more information, run info(area_under_curve).
MLJBase.balanced_accuracy
— Constant
balanced_accuracy
Balanced classification accuracy; aliases: balanced_accuracy, bacc, bac.
balanced_accuracy(ŷ, y [, w])
balanced_accuracy(conf_mat)
Return the balanced accuracy of the point prediction ŷ, given true observations y, optionally weighted by w. The balanced accuracy takes into consideration class imbalance. All three arguments must have the same length. This metric is invariant to class labelling and can be used for multiclass classification.
For more information, run info(balanced_accuracy).
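One common definition of balanced accuracy is the unweighted mean of the per-class recalls, which is easy to verify by hand; bacc_sketch is a hypothetical helper, not the MLJBase code, and is a sketch under that definition:

```julia
# Mean of per-class recalls (illustrative sketch, unweighted case).
function bacc_sketch(ŷ, y)
    classes = unique(y)
    recalls = [count((ŷ .== c) .& (y .== c)) / count(y .== c) for c in classes]
    return sum(recalls) / length(recalls)
end

bacc_sketch([1, 1, 1, 0], [1, 1, 0, 0])   # (2/2 + 1/2)/2 = 0.75
```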
MLJBase.cross_entropy
— Constant
cross_entropy
Cross entropy loss with probabilities clamped between eps() and 1-eps(); aliases: cross_entropy.
ce = CrossEntropy(; eps=eps())
ce(ŷ, y)
Given an abstract vector of distributions ŷ and an abstract vector of true observations y, return the corresponding cross-entropy loss (aka log loss) scores.
Since the score is undefined when the true observation has predicted probability zero, probabilities are clamped between eps and 1-eps, where eps can be specified.
If sᵢ is the predicted probability for the true class yᵢ, then the score for that example is given by
-log(clamp(sᵢ, eps, 1-eps))
For more information, run info(cross_entropy).
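The per-observation score above can be computed directly; ce_obs is an invented name for illustration:

```julia
# -log of the clamped predicted probability of the true class.
ce_obs(sᵢ; eps=eps()) = -log(clamp(sᵢ, eps, 1 - eps))

ce_obs(0.9)   # ≈ 0.105
ce_obs(0.0)   # clamping keeps this finite: -log(eps())
```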
MLJBase.false_discovery_rate
— Constant
false_discovery_rate
False discovery rate; aliases: false_discovery_rate, falsediscovery_rate, fdr.
false_discovery_rate(ŷ, y)
False discovery rate for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use FalseDiscoveryRate(rev=true) instead of false_discovery_rate.
For more information, run info(false_discovery_rate).
MLJBase.false_negative
— Constant
false_negative
Number of false negatives; aliases: false_negative, falsenegative.
false_negative(ŷ, y)
Number of false negatives for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use FalseNegative(rev=true) instead of false_negative.
For more information, run info(false_negative).
MLJBase.false_negative_rate
— Constant
false_negative_rate
False negative rate; aliases: false_negative_rate, falsenegative_rate, fnr, miss_rate.
false_negative_rate(ŷ, y)
False negative rate for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use FalseNegativeRate(rev=true) instead of false_negative_rate.
For more information, run info(false_negative_rate).
MLJBase.false_positive
— Constant
false_positive
Number of false positives; aliases: false_positive, falsepositive.
false_positive(ŷ, y)
Number of false positives for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use FalsePositive(rev=true) instead of false_positive.
For more information, run info(false_positive).
MLJBase.false_positive_rate
— Constant
false_positive_rate
False positive rate; aliases: false_positive_rate, falsepositive_rate, fpr, fallout.
false_positive_rate(ŷ, y)
False positive rate for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use FalsePositiveRate(rev=true) instead of false_positive_rate.
For more information, run info(false_positive_rate).
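In the binary case the rate definitions above reduce to simple ratios of confusion counts. The following sketch, with true as the positive class and made-up names, is illustrative only:

```julia
# fpr, fnr and fdr from Boolean predictions (illustrative sketch).
function rates_sketch(ŷ::AbstractVector{Bool}, y::AbstractVector{Bool})
    tp = count(ŷ .& y);   fp = count(ŷ .& .!y)
    fn = count(.!ŷ .& y); tn = count(.!ŷ .& .!y)
    return (fpr = fp/(fp + tn), fnr = fn/(fn + tp), fdr = fp/(fp + tp))
end

rates_sketch(Bool[1, 1, 0, 0], Bool[1, 0, 1, 0])
# (fpr = 0.5, fnr = 0.5, fdr = 0.5)
```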
MLJBase.matthews_correlation
— Constant
matthews_correlation
Matthews correlation; aliases: matthews_correlation, mcc.
matthews_correlation(ŷ, y)
matthews_correlation(conf_mat)
Return the Matthews correlation coefficient corresponding to the point prediction ŷ, given true observations y. This metric is invariant to class labelling and can be used for multiclass classification.
For more information, run info(matthews_correlation).
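In the binary case, the Matthews correlation coefficient has a closed form in the four confusion counts; a minimal sketch (mcc_sketch is a hypothetical name):

```julia
# Binary MCC from confusion counts (illustrative sketch).
mcc_sketch(tp, fp, fn, tn) =
    (tp*tn - fp*fn) / sqrt((tp + fp)*(tp + fn)*(tn + fp)*(tn + fn))

mcc_sketch(1, 1, 1, 1)   # 0.0: predictions uncorrelated with truth
mcc_sketch(2, 0, 0, 2)   # 1.0: perfect agreement
```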
MLJBase.misclassification_rate
— Constant
misclassification_rate
Misclassification rate; aliases: misclassification_rate, mcr.
misclassification_rate(ŷ, y)
misclassification_rate(ŷ, y, w)
misclassification_rate(conf_mat)
Returns the rate of misclassification of the (point) predictions ŷ, given true observations y, optionally weighted by the weights w. All three arguments must be abstract vectors of the same length. A confusion matrix can also be passed as argument. This metric is invariant to class labelling and can be used for multiclass classification.
For more information, run info(misclassification_rate).
MLJBase.negative_predictive_value
— Constant
negative_predictive_value
Negative predictive value; aliases: negative_predictive_value, negativepredictive_value, npv.
negative_predictive_value(ŷ, y)
Negative predictive value for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use NPV(rev=true) instead of negative_predictive_value.
For more information, run info(negative_predictive_value).
MLJBase.positive_predictive_value
— Constant
positive_predictive_value
Positive predictive value (aka precision); aliases: positive_predictive_value, ppv, Precision(), positivepredictive_value.
positive_predictive_value(ŷ, y)
Positive predictive value for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use Precision(rev=true) instead of positive_predictive_value.
For more information, run info(positive_predictive_value).
MLJBase.true_negative
— Constant
true_negative
Number of true negatives; aliases: true_negative, truenegative.
true_negative(ŷ, y)
Number of true negatives for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use TrueNegative(rev=true) instead of true_negative.
For more information, run info(true_negative).
MLJBase.true_negative_rate
— Constant
true_negative_rate
True negative rate; aliases: true_negative_rate, truenegative_rate, tnr, specificity, selectivity.
true_negative_rate(ŷ, y)
True negative rate for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use TrueNegativeRate(rev=true) instead of true_negative_rate.
For more information, run info(true_negative_rate).
MLJBase.true_positive
— Constant
true_positive
Number of true positives; aliases: true_positive, truepositive.
true_positive(ŷ, y)
Number of true positives for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use TruePositive(rev=true) instead of true_positive.
For more information, run info(true_positive).
MLJBase.true_positive_rate
— Constant
true_positive_rate
True positive rate; aliases: true_positive_rate, truepositive_rate, tpr, sensitivity, recall, hit_rate.
true_positive_rate(ŷ, y)
True positive rate for observations ŷ and ground truth y. Assigns false to the first element of levels(y). To reverse roles, use TruePositiveRate(rev=true) instead of true_positive_rate.
For more information, run info(true_positive_rate).
MLJBase.BrierScore
— Method
BrierScore(; distribution=UnivariateFinite)(ŷ, y [, w])
Given an abstract vector of distributions ŷ of type distribution, and an abstract vector of true observations y, return the corresponding Brier (aka quadratic) scores. Weight the scores using w if provided.
Currently only distribution=UnivariateFinite is supported, which is applicable to supervised models with Finite target scitype. In this case, if p(y) is the predicted probability for a single observation y, and C all possible classes, then the corresponding Brier score for that observation is given by
$2p(y) - \left(\sum_{η ∈ C} p(η)^2\right) - 1$
Note that BrierScore()=BrierScore{UnivariateFinite} has the alias brier_score.
Warning. Here BrierScore is a "score" in the sense that bigger is better (with 0 optimal, and all other values negative). In Brier's original 1950 paper, and many other places, it has the opposite sign, despite the name. Moreover, the present implementation does not treat the binary case as special, so the score may differ, in that case, by a factor of two from usage elsewhere.
For more information, run info(BrierScore).
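The formula above can be evaluated directly for a single observation, here with the probabilities as a plain vector rather than a UnivariateFinite distribution (brier_obs is a made-up helper, not part of MLJBase):

```julia
# 2p(y) - Σ p(η)² - 1, for a probability vector `p` and the index of
# the true class (illustrative sketch only).
brier_obs(p::AbstractVector, true_class::Integer) =
    2 * p[true_class] - sum(abs2, p) - 1

brier_obs([1.0, 0.0], 1)   # 0.0, the optimum
brier_obs([0.5, 0.5], 1)   # -0.5
```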
MLJBase.FScore
— Type
FScore{β}(rev=nothing)
One-parameter generalization, $F_β$, of the F-measure or balanced F-score.
FScore{β}(ŷ, y)
Evaluate the $F_β$ score on observations ŷ, given ground truth values y.
By default, the second element of levels(y) is designated as true. To reverse roles, use FScore{β}(rev=true) instead of FScore{β}.
For more information, run info(FScore).
MLJBase.roc_curve
— Method
tprs, fprs, ts = roc_curve(ŷ, y) = roc(ŷ, y)
Return the ROC curve for a two-class probabilistic prediction ŷ given the ground truth y. The true positive rates and false positive rates over a range of thresholds ts are returned. Note that if there are k unique scores, there are correspondingly k thresholds and k+1 "bins" over which the FPR and TPR are constant:
[0.0 - thresh[1]]
[thresh[1] - thresh[2]]
...
[thresh[k] - 1]
Consequently, tprs and fprs are of length k+1 if ts is of length k.
To draw the curve using your favorite plotting backend, do plot(fprs, tprs).
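A simplified sketch of the computation, using each of the k unique scores as a threshold. This yields k points rather than the k+1 bins described above, and roc_sketch is an invented name, not the MLJBase implementation:

```julia
# tpr/fpr at each threshold, with `true` the positive class.
function roc_sketch(scores, y::AbstractVector{Bool})
    ts = sort(unique(scores))                  # k thresholds
    P, N = count(y), count(!, y)               # positives, negatives
    tprs = [count((scores .>= t) .& y) / P for t in ts]
    fprs = [count((scores .>= t) .& .!y) / N for t in ts]
    return fprs, tprs, ts
end

fprs, tprs, ts = roc_sketch([0.1, 0.4, 0.8, 0.8], Bool[0, 0, 1, 1])
# fprs = [1.0, 0.5, 0.0], tprs = [1.0, 1.0, 1.0]
```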
MLJBase._idx_unique_sorted
— Method
_idx_unique_sorted(v)
Internal function returning the indices of the unique elements in v, under the assumption that v is sorted in decreasing order.