The Measures

Scientific type of observations

Measures can be classified according to the scientific type of the target observations they consume (given by the value of the trait, StatisticalMeasuresBase.observation_scitype(measure)):

| observation scitype | meaning |
|---|---|
| Finite | general classification |
| Finite{2} = Binary | binary classification |
| OrderedFactor | classification (class order matters) |
| OrderedFactor{2} | binary classification (order matters) |
| Continuous | regression |
| Infinite | regression, including integer targets for Count data |
| AbstractArray{T} | multitarget version of T, some tabular data okay |

Measures are not strict about data conforming to the declared observation scitype. For example, where OrderedFactor{2} is expected, Finite{2} will work, and in fact most eltypes will work, so long as there are only two classes. However, you may get warnings that mitigate possible misinterpretations of results (e.g., about which class is the "positive" one). Some warnings can be suppressed by explicitly specifying measure parameters, such as levels.

To be 100% safe and avoid warnings, use data with the recommended observation scitype.

On multi-target measures and tabular data

All multi-target measures below (the ones with AbstractArray observation scitypes) also handle some forms of tabular input, including DataFrames and Julia's native "row table" and "column table" formats. This is not reflected by the declared observation scitype. Instead, you can inspect the trait StatisticalMeasuresBase.can_consume_tables or consult the measure's docstring.

Classification measures (non-probabilistic)

| constructor / instance aliases | observation scitype |
|---|---|
| FScore | Union{Missing, OrderedFactor{2}} |
| FalseDiscoveryRate | Union{Missing, OrderedFactor{2}} |
| FalseNegative | Union{Missing, OrderedFactor{2}} |
| FalseNegativeRate | Union{Missing, OrderedFactor{2}} |
| FalsePositive | Union{Missing, OrderedFactor{2}} |
| FalsePositiveRate | Union{Missing, OrderedFactor{2}} |
| NegativePredictiveValue | Union{Missing, OrderedFactor{2}} |
| PositivePredictiveValue | Union{Missing, OrderedFactor{2}} |
| TrueNegative | Union{Missing, OrderedFactor{2}} |
| TrueNegativeRate | Union{Missing, OrderedFactor{2}} |
| TruePositive | Union{Missing, OrderedFactor{2}} |
| TruePositiveRate | Union{Missing, OrderedFactor{2}} |
| Accuracy | Union{Missing, Finite} |
| BalancedAccuracy | Union{Missing, Finite} |
| ConfusionMatrix | Union{Missing, Finite} |
| Kappa | Union{Missing, Finite} |
| MatthewsCorrelation | Union{Missing, Finite} |
| MisclassificationRate | Union{Missing, Finite} |
| MulticlassFScore | Union{Missing, Finite} |
| MulticlassFalseDiscoveryRate | Union{Missing, Finite} |
| MulticlassFalseNegative | Union{Missing, Finite} |
| MulticlassFalseNegativeRate | Union{Missing, Finite} |
| MulticlassFalsePositive | Union{Missing, Finite} |
| MulticlassFalsePositiveRate | Union{Missing, Finite} |
| MulticlassNegativePredictiveValue | Union{Missing, Finite} |
| MulticlassPositivePredictiveValue | Union{Missing, Finite} |
| MulticlassTrueNegative | Union{Missing, Finite} |
| MulticlassTrueNegativeRate | Union{Missing, Finite} |
| MulticlassTruePositive | Union{Missing, Finite} |
| MulticlassTruePositiveRate | Union{Missing, Finite} |
| MultitargetAccuracy | AbstractArray{<:Union{Missing, Finite}} |
| MultitargetMisclassificationRate | AbstractArray{<:Union{Missing, Finite}} |

Regression measures (non-probabilistic)

| constructor / instance aliases | observation scitype |
|---|---|
| LPLoss | Union{Missing, Infinite} |
| LPSumLoss | Union{Missing, Infinite} |
| LogCoshLoss | Union{Missing, Infinite} |
| MeanAbsoluteProportionalError | Union{Missing, Infinite} |
| RSquared | Union{Missing, Infinite} |
| RootMeanSquaredError | Union{Missing, Infinite} |
| RootMeanSquaredLogError | Union{Missing, Infinite} |
| RootMeanSquaredLogProportionalError | Union{Missing, Infinite} |
| RootMeanSquaredProportionalError | Union{Missing, Infinite} |
| MultitargetLPLoss | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetLPSumLoss | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetLogCoshLoss | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetMeanAbsoluteProportionalError | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetRootMeanSquaredError | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetRootMeanSquaredLogError | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetRootMeanSquaredLogProportionalError | AbstractArray{<:Union{Missing, Infinite}} |
| MultitargetRootMeanSquaredProportionalError | AbstractArray{<:Union{Missing, Infinite}} |

Probabilistic measures

These are measures where each prediction is a probability mass or density function, over the space of possible ground truth observations. Specifically, StatisticalMeasuresBase.kind_of_proxy(measure) == LearnAPI.Distribution().

| constructor / instance aliases | observation scitype |
|---|---|
| BrierLoss | Union{Missing, Infinite, Finite} |
| BrierScore | Union{Missing, Infinite, Finite} |
| LogLoss | Union{Missing, Infinite, Finite} |
| LogScore | Union{Missing, Infinite, Finite} |
| SphericalScore | Union{Missing, Infinite, Finite} |
| AreaUnderCurve | Binary |

List of aliases

Some of the measures constructed using specific parameter values have pre-defined names associated with them that are exported by StatisticalMeasures.jl. These are called aliases.
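
To illustrate the relationship between a constructor and its aliases, here is a minimal Base-only sketch. The type `LPLossSketch` and the names `l1_sketch` and `l2_sketch` are hypothetical stand-ins invented for this example; they are not the package's `LPLoss`, `l1`, or `l2`, and the unweighted mean is assumed only to keep the example short.

```julia
# Hypothetical stand-in for a measure constructor; not the package's type.
struct LPLossSketch{T<:Real}
    p::T
end

# Make instances callable: mean of |ŷ_i - y_i|^p over all pairs.
(m::LPLossSketch)(ŷ, y) = sum(abs(a - b)^m.p for (a, b) in zip(ŷ, y)) / length(y)

# An "alias" is simply an instance built with specific parameter values.
const l1_sketch = LPLossSketch(1)
const l2_sketch = LPLossSketch(2)

ŷ = [1.0, 2.0, 4.0]
y = [1.0, 3.0, 2.0]

l1_sketch(ŷ, y)  # same value as LPLossSketch(1)(ŷ, y)
l2_sketch(ŷ, y)  # same value as LPLossSketch(2)(ŷ, y)
```

Calling the alias and calling a freshly constructed instance with the same parameter give the same result, which is all an alias is meant to provide.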

| alias | constructed with |
|---|---|
| accuracy | Accuracy |
| area_under_curve | AreaUnderCurve |
| auc | AreaUnderCurve |
| bac | BalancedAccuracy |
| bacc | BalancedAccuracy |
| balanced_accuracy | BalancedAccuracy |
| brier_loss | BrierLoss |
| brier_score | BrierScore |
| confmat | ConfusionMatrix |
| confusion_matrix | ConfusionMatrix |
| cross_entropy | LogLoss |
| f1score | FScore |
| fallout | FalsePositiveRate |
| false_discovery_rate | FalseDiscoveryRate |
| false_negative_rate | FalseNegativeRate |
| false_negative | FalseNegative |
| false_positive_rate | FalsePositiveRate |
| false_positive | FalsePositive |
| falsediscovery_rate | FalseDiscoveryRate |
| falsenegative_rate | FalseNegativeRate |
| falsenegative | FalseNegative |
| falsepositive_rate | FalsePositiveRate |
| falsepositive | FalsePositive |
| fdr | FalseDiscoveryRate |
| fnr | FalseNegativeRate |
| fpr | FalsePositiveRate |
| hit_rate | TruePositiveRate |
| kappa | Kappa |
| l1_sum | LPSumLoss |
| l1 | LPLoss |
| l2_sum | LPSumLoss |
| l2 | LPLoss |
| log_cosh_loss | LogCoshLoss |
| log_cosh | LogCoshLoss |
| log_loss | LogLoss |
| log_score | LogScore |
| macro_f1score | MulticlassFScore |
| mae | LPLoss |
| mape | MeanAbsoluteProportionalError |
| matthews_correlation | MatthewsCorrelation |
| mav | LPLoss |
| mcc | MatthewsCorrelation |
| mcr | MisclassificationRate |
| mean_absolute_error | LPLoss |
| mean_absolute_value | LPLoss |
| micro_f1score | MulticlassFScore |
| misclassification_rate | MisclassificationRate |
| miss_rate | FalseNegativeRate |
| multiclass_f1score | MulticlassFScore |
| multiclass_fallout | MulticlassFalsePositiveRate |
| multiclass_false_discovery_rate | MulticlassFalseDiscoveryRate |
| multiclass_false_negative_rate | MulticlassFalseNegativeRate |
| multiclass_false_negative | MulticlassFalseNegative |
| multiclass_false_positive_rate | MulticlassFalsePositiveRate |
| multiclass_false_positive | MulticlassFalsePositive |
| multiclass_falsediscovery_rate | MulticlassFalseDiscoveryRate |
| multiclass_falsenegative_rate | MulticlassFalseNegativeRate |
| multiclass_falsenegative | MulticlassFalseNegative |
| multiclass_falsepositive_rate | MulticlassFalsePositiveRate |
| multiclass_falsepositive | MulticlassFalsePositive |
| multiclass_fdr | MulticlassFalseDiscoveryRate |
| multiclass_fnr | MulticlassFalseNegativeRate |
| multiclass_fpr | MulticlassFalsePositiveRate |
| multiclass_hit_rate | MulticlassTruePositiveRate |
| multiclass_miss_rate | MulticlassFalseNegativeRate |
| multiclass_negative_predictive_value | MulticlassNegativePredictiveValue |
| multiclass_negativepredictive_value | MulticlassNegativePredictiveValue |
| multiclass_npv | MulticlassNegativePredictiveValue |
| multiclass_positive_predictive_value | MulticlassPositivePredictiveValue |
| multiclass_positivepredictive_value | MulticlassPositivePredictiveValue |
| multiclass_ppv | MulticlassPositivePredictiveValue |
| multiclass_precision | MulticlassPositivePredictiveValue |
| multiclass_recall | MulticlassTruePositiveRate |
| multiclass_selectivity | MulticlassTrueNegativeRate |
| multiclass_sensitivity | MulticlassTruePositiveRate |
| multiclass_specificity | MulticlassTrueNegativeRate |
| multiclass_tnr | MulticlassTrueNegativeRate |
| multiclass_tpr | MulticlassTruePositiveRate |
| multiclass_true_negative_rate | MulticlassTrueNegativeRate |
| multiclass_true_negative | MulticlassTrueNegative |
| multiclass_true_positive_rate | MulticlassTruePositiveRate |
| multiclass_true_positive | MulticlassTruePositive |
| multiclass_truenegative_rate | MulticlassTrueNegativeRate |
| multiclass_truenegative | MulticlassTrueNegative |
| multiclass_truepositive_rate | MulticlassTruePositiveRate |
| multiclass_truepositive | MulticlassTruePositive |
| multitarget_accuracy | MultitargetAccuracy |
| multitarget_l1_sum | MultitargetLPSumLoss |
| multitarget_l1 | MultitargetLPLoss |
| multitarget_l2_sum | MultitargetLPSumLoss |
| multitarget_l2 | MultitargetLPLoss |
| multitarget_mae | MultitargetLPLoss |
| multitarget_mape | MultitargetMeanAbsoluteProportionalError |
| multitarget_mav | MultitargetLPLoss |
| multitarget_mcr | MultitargetMisclassificationRate |
| multitarget_mean_absolute_error | MultitargetLPLoss |
| multitarget_mean_absolute_value | MultitargetLPLoss |
| multitarget_misclassification_rate | MultitargetMisclassificationRate |
| multitarget_rms | MultitargetRootMeanSquaredError |
| multitarget_rmse | MultitargetRootMeanSquaredError |
| multitarget_rmsl | MultitargetRootMeanSquaredLogError |
| multitarget_rmsle | MultitargetRootMeanSquaredLogError |
| multitarget_rmslp1 | MultitargetRootMeanSquaredLogProportionalError |
| multitarget_rmsp | MultitargetRootMeanSquaredProportionalError |
| multitarget_root_mean_squared_error | MultitargetRootMeanSquaredError |
| multitarget_root_mean_squared_log_error | MultitargetRootMeanSquaredLogError |
| negative_predictive_value | NegativePredictiveValue |
| negativepredictive_value | NegativePredictiveValue |
| npv | NegativePredictiveValue |
| positive_predictive_value | PositivePredictiveValue |
| positivepredictive_value | PositivePredictiveValue |
| ppv | PositivePredictiveValue |
| precision | PositivePredictiveValue |
| probability_of_correct_classification | BalancedAccuracy |
| quadratic_loss | BrierLoss |
| quadratic_score | BrierScore |
| recall | TruePositiveRate |
| rms | RootMeanSquaredError |
| rmse | RootMeanSquaredError |
| rmsl | RootMeanSquaredLogError |
| rmsle | RootMeanSquaredLogError |
| rmslp1 | RootMeanSquaredLogProportionalError |
| rmsp | RootMeanSquaredProportionalError |
| root_mean_squared_error | RootMeanSquaredError |
| root_mean_squared_log_error | RootMeanSquaredLogError |
| rsq | RSquared |
| rsquared | RSquared |
| selectivity | TrueNegativeRate |
| sensitivity | TruePositiveRate |
| specificity | TrueNegativeRate |
| spherical_score | SphericalScore |
| tnr | TrueNegativeRate |
| tpr | TruePositiveRate |
| true_negative_rate | TrueNegativeRate |
| true_negative | TrueNegative |
| true_positive_rate | TruePositiveRate |
| true_positive | TruePositive |
| truenegative_rate | TrueNegativeRate |
| truenegative | TrueNegative |
| truepositive_rate | TruePositiveRate |
| truepositive | TruePositive |

Reference

StatisticalMeasures.LPLoss - Function
LPLoss(; p=2)

Return a callable measure for computing the $L^p$ loss. Aliases: l1, l2, mae, mav, mean_absolute_error, mean_absolute_value.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the LPLoss constructor (e.g., m = LPLoss()) on predictions ŷ, given ground truth observations y. Specifically, return the mean of $|ŷ_i - y_i|^p$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y), or more generally, the mean of weighted versions of those values. For the weighted sum use LPSumLoss instead.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.
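
The distinction between per-observation measurements and the aggregated value can be sketched in plain Julia as follows. This is a reading of the prose above, not the package's implementation: the helper names `lp_measurements` and `lp_loss` are hypothetical, and dividing by the number of observations (rather than by the sum of the weights) is an assumption based on the phrase "the mean of weighted versions of those values".

```julia
# Per-observation L^p values, optionally pre-multiplied by weights.
# Hypothetical helper; mirrors the prose, not the package's code.
function lp_measurements(ŷ, y; p=2, weights=nothing)
    vals = [abs(a - b)^p for (a, b) in zip(ŷ, y)]
    weights === nothing ? vals : [w * v for (w, v) in zip(weights, vals)]
end

# Aggregate: mean of the (weighted) per-observation values.
# Assumption: the divisor is the number of observations.
lp_loss(ŷ, y; kwargs...) = sum(lp_measurements(ŷ, y; kwargs...)) / length(y)

ŷ = [1.0, 2.0, 4.0]
y = [1.0, 3.0, 2.0]

lp_measurements(ŷ, y; p=1)                   # one value per observation
lp_loss(ŷ, y; p=1)                           # a single aggregated value
lp_loss(ŷ, y; p=2, weights=[1.0, 2.0, 0.5])  # (1*0 + 2*1 + 0.5*4) / 3
```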

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = ``L^p`` loss
StatisticalMeasures.MultitargetLPLoss - Function
MultitargetLPLoss(; p=2, atomic_weights=nothing)

Return a callable measure for computing the multitarget $L^p$ loss. Aliases: multitarget_l1, multitarget_l2, multitarget_mae, multitarget_mav, multitarget_mean_absolute_error, multitarget_mean_absolute_value.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetLPLoss constructor (e.g., m = MultitargetLPLoss()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of LPLoss. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.
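
The array layout described above can be sketched with a small matrix example. The function `multitarget_lp_measurements` is a hypothetical helper written for this illustration; in particular, taking a mean over the components of each observation is an assumption of the sketch, since the text only states that `atomic_weights` weight the components.

```julia
# Layout taken from the text: rows are target components,
# columns (the last dimension) are observations.
y = [1.0 2.0 3.0;
     10.0 20.0 30.0]          # 2 components, 3 observations
ŷ = [1.0 2.5 3.0;
     12.0 20.0 27.0]

atomic_weights = [1.0, 0.1]   # one weight per component, i.e. per row of y
@assert length(atomic_weights) == size(y, 1)

# Hypothetical helper: one value per observation (column).
# Assumption: components are combined by a weighted mean.
function multitarget_lp_measurements(ŷ, y; p=2, atomic_weights=ones(size(y, 1)))
    cols = zip(eachcol(ŷ), eachcol(y))
    [sum(atomic_weights .* abs.(a .- b) .^ p) / length(b) for (a, b) in cols]
end

per_obs = multitarget_lp_measurements(ŷ, y; p=1, atomic_weights=atomic_weights)
loss = sum(per_obs) / length(per_obs)   # aggregate over observations
```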

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:AbstractArray{<:Union{Missing,Infinite}}. Alternatively, y and ŷ can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget ``L^p`` loss
StatisticalMeasures.LPSumLoss - Function
LPSumLoss(; p=2)

Return a callable measure for computing the $L^p$ sum loss. Aliases: l1_sum, l2_sum.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the LPSumLoss constructor (e.g., m = LPSumLoss()) on predictions ŷ, given ground truth observations y. Specifically, compute the (weighted) sum of $|ŷ_i - y_i|^p$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y). For the weighted mean use LPLoss instead.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.
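
As a small worked example of how the sum and mean variants relate in the unweighted case, the following Base-only sketch computes both directly from the formula. The variable names are hypothetical and nothing here calls the package.

```julia
ŷ = [1.0, 2.0, 4.0]
y = [1.0, 3.0, 2.0]
p = 2

# Sum of |ŷ_i - y_i|^p over all pairs, as in the sum loss described above.
lp_sum = sum(abs(a - b)^p for (a, b) in zip(ŷ, y))

# Without weights, the mean-based loss is the sum divided by the
# number of observations.
lp_mean = lp_sum / length(y)
```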

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = ``L^p`` sum loss
StatisticalMeasures.MultitargetLPSumLoss - Function
MultitargetLPSumLoss(; p=2, atomic_weights=nothing)

Return a callable measure for computing the multitarget $L^p$ sum loss. Aliases: multitarget_l1_sum, multitarget_l2_sum.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetLPSumLoss constructor (e.g., m = MultitargetLPSumLoss()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of LPSumLoss. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:AbstractArray{<:Union{Missing,Infinite}}. Alternatively, y and ŷ can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multitarget ``L^p`` sum loss
StatisticalMeasures.RootMeanSquaredError - Function
RootMeanSquaredError()

Return a callable measure for computing the root mean squared error. Aliases: rms, rmse, root_mean_squared_error.

RootMeanSquaredError()(ŷ, y)
RootMeanSquaredError()(ŷ, y, weights)
RootMeanSquaredError()(ŷ, y, class_weights::AbstractDict)
RootMeanSquaredError()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate RootMeanSquaredError() on predictions ŷ, given ground truth observations y. Specifically, compute the mean of $|y_i - ŷ_i|^2$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y), and return the square root of the result. More generally, pre-multiply the squared deviations by the specified weights.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(RootMeanSquaredError(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.
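
The computation described above can be sketched in a few lines of Base Julia. The helper `rmse_sketch` is hypothetical and follows the prose only; dividing by the number of observations when weights are given is an assumption, not a statement about the package's internals.

```julia
# Root mean squared error following the prose: squared deviations,
# optionally pre-multiplied by weights, then the mean, then the square root.
function rmse_sketch(ŷ, y; weights=nothing)
    sq = [(a - b)^2 for (a, b) in zip(ŷ, y)]
    if weights !== nothing
        sq = [w * s for (w, s) in zip(weights, sq)]
    end
    sqrt(sum(sq) / length(sq))   # assumption: divide by the observation count
end

ŷ = [2.0, 4.0, 7.0]
y = [1.0, 4.0, 5.0]

rmse_sketch(ŷ, y)                            # sqrt((1 + 0 + 4) / 3)
rmse_sketch(ŷ, y; weights=[3.0, 1.0, 0.0])   # sqrt((3*1 + 1*0 + 0*4) / 3)
```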

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared error
StatisticalMeasures.MultitargetRootMeanSquaredError - Function
MultitargetRootMeanSquaredError(; atomic_weights=nothing)

Return a callable measure for computing the multitarget root mean squared error. Aliases: multitarget_rms, multitarget_rmse, multitarget_root_mean_squared_error.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetRootMeanSquaredError constructor (e.g., m = MultitargetRootMeanSquaredError()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of RootMeanSquaredError. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:AbstractArray{<:Union{Missing,Infinite}}. Alternatively, y and ŷ can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared error
StatisticalMeasures.RootMeanSquaredLogError - Function
RootMeanSquaredLogError()

Return a callable measure for computing the root mean squared log error. Aliases: rmsl, rmsle, root_mean_squared_log_error.

RootMeanSquaredLogError()(ŷ, y)
RootMeanSquaredLogError()(ŷ, y, weights)
RootMeanSquaredLogError()(ŷ, y, class_weights::AbstractDict)
RootMeanSquaredLogError()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate RootMeanSquaredLogError() on predictions ŷ, given ground truth observations y. Specifically, compute the mean of $(\log(y_i) - \log(ŷ_i))^2$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y), and return the square root of the result. More generally, pre-multiply the values averaged by the specified weights. To include an offset, use RootMeanSquaredLogProportionalError instead.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(RootMeanSquaredLogError(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.
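
A minimal unweighted sketch of the formula above, in Base Julia, may help show that the log error depends on the ratio between prediction and ground truth rather than on their difference. The name `rmsle_sketch` is hypothetical, and strictly positive inputs are assumed.

```julia
# Root mean squared log error: mean of squared differences of logs,
# then the square root. Inputs are assumed strictly positive.
rmsle_sketch(ŷ, y) =
    sqrt(sum((log(b) - log(a))^2 for (a, b) in zip(ŷ, y)) / length(y))

y = [1.0, 10.0, 100.0]
ŷ = y .* exp(0.5)     # every prediction is off by the same factor

rmsle_sketch(ŷ, y)    # close to 0.5, whatever the scale of y
rmsle_sketch(y, y)    # perfect predictions give zero
```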

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared log error
StatisticalMeasures.MultitargetRootMeanSquaredLogError - Function
MultitargetRootMeanSquaredLogError(; atomic_weights=nothing)

Return a callable measure for computing the multitarget root mean squared log error. Aliases: multitarget_rmsl, multitarget_rmsle, multitarget_root_mean_squared_log_error.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetRootMeanSquaredLogError constructor (e.g., m = MultitargetRootMeanSquaredLogError()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of RootMeanSquaredLogError. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:AbstractArray{<:Union{Missing,Infinite}}. Alternatively, y and ŷ can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared log error
StatisticalMeasures.RootMeanSquaredLogProportionalError - Function
RootMeanSquaredLogProportionalError(; offset=1)

Return a callable measure for computing the root mean squared log proportional error. Aliases: rmslp1.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the RootMeanSquaredLogProportionalError constructor (e.g., m = RootMeanSquaredLogProportionalError()) on predictions ŷ, given ground truth observations y. Specifically, compute the mean of $(\log(ŷ_i + δ) - \log(y_i + δ))^2$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y), and return the square root. More generally, pre-multiply the values averaged by the specified weights. Here $δ$ = offset, which is 1 by default. This is the same as RootMeanSquaredLogError but adds an offset.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.
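
The role of the offset can be seen in the following unweighted Base-only sketch of the formula above. The helper `rmslp_sketch` is hypothetical; it only transcribes the stated expression and makes no claim about how the package handles weights or missing values.

```julia
# Root mean squared log proportional error: like the log error, but with
# an offset δ added to both arguments before taking logs.
rmslp_sketch(ŷ, y; offset=1) =
    sqrt(sum((log(a + offset) - log(b + offset))^2 for (a, b) in zip(ŷ, y)) / length(y))

ŷ = [0.0, 3.0, 8.0]
y = [0.0, 1.0, 8.0]              # zeros are acceptable when offset > 0

rmslp_sketch(ŷ, y)               # uses δ = 1: only log(4) - log(2) is nonzero
rmslp_sketch(ŷ, y; offset=0.5)   # a different offset changes the value
```

With a positive offset the logarithm stays finite at zero, which is the practical reason to prefer this measure over the plain log error for count-like targets that can be zero.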

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared log proportional error
StatisticalMeasures.MultitargetRootMeanSquaredLogProportionalError - Function
MultitargetRootMeanSquaredLogProportionalError(; offset=1, atomic_weights=nothing)

Return a callable measure for computing the multitarget root mean squared log proportional error. Aliases: multitarget_rmslp1.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetRootMeanSquaredLogProportionalError constructor (e.g., m = MultitargetRootMeanSquaredLogProportionalError()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of RootMeanSquaredLogProportionalError. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:AbstractArray{<:Union{Missing,Infinite}}. Alternatively, y and ŷ can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared log proportional error
StatisticalMeasures.RootMeanSquaredProportionalError - Function
RootMeanSquaredProportionalError(; tol=eps())

Return a callable measure for computing the root mean squared proportional error. Aliases: rmsp.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the RootMeanSquaredProportionalError constructor (e.g., m = RootMeanSquaredProportionalError()) on predictions ŷ, given ground truth observations y. Specifically, compute the mean of $((ŷ_i - y_i)/y_i)^2$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y), and return the square root of the result. More generally, pre-multiply the values averaged by the specified weights. Terms for which $|y_i|$ < tol are dropped in the summation, but counts still contribute to the mean normalization factor.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.
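
The treatment of near-zero ground truth values can be sketched as follows. The helper `rmsp_sketch` is hypothetical and unweighted; it reflects the statement above that dropped terms still count towards the normalization factor.

```julia
# Root mean squared proportional error. Terms with |y_i| < tol are dropped
# from the sum, but still count towards the number of observations.
function rmsp_sketch(ŷ, y; tol=eps())
    total = 0.0
    for (a, b) in zip(ŷ, y)
        abs(b) < tol && continue      # skip near-zero ground truth
        total += ((a - b) / b)^2
    end
    sqrt(total / length(y))           # the divisor includes dropped terms
end

ŷ = [1.0, 3.0, 5.0]
y = [0.0, 2.0, 4.0]                   # first target is zero, so its term is dropped

rmsp_sketch(ŷ, y)                     # sqrt((0.25 + 0.0625) / 3)
```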

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared proportional error
StatisticalMeasures.MultitargetRootMeanSquaredProportionalError - Function
MultitargetRootMeanSquaredProportionalError(; tol=eps(), atomic_weights=nothing)

Return a callable measure for computing the multitarget root mean squared proportional error. Aliases: multitarget_rmsp.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetRootMeanSquaredProportionalError constructor (e.g., m = MultitargetRootMeanSquaredProportionalError()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of RootMeanSquaredProportionalError. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}. Alternatively, ŷ and y can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared proportional error
source
StatisticalMeasures.MeanAbsoluteProportionalError - Function
MeanAbsoluteProportionalError(; tol=eps())

Return a callable measure for computing the mean absolute proportional error. Aliases: mape.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MeanAbsoluteProportionalError constructor (e.g., m = MeanAbsoluteProportionalError()) on predictions ŷ, given ground truth observations y. Specifically, return the mean of $|ŷ_i-y_i| \over |y_i|$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y). More generally, pre-multiply the values averaged by the specified weights. Terms for which $|y_i|$<tol are dropped in the summation, but corresponding weights (or counts) still contribute to the mean normalization factor.
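The unweighted case of this definition can be sketched in plain Julia as follows. The helper mape_sketch is hypothetical and hand-rolled for illustration; it is not the library's implementation and it ignores weights and missing values.

```julia
# Hand-rolled sketch of the unweighted mean absolute proportional error.
# `mape_sketch` is a hypothetical helper, not part of StatisticalMeasures.jl.
function mape_sketch(ŷ, y; tol=eps())
    length(ŷ) == length(y) || throw(DimensionMismatch("ŷ and y must have equal length"))
    total = 0.0
    for (pred, truth) in zip(ŷ, y)
        abs(truth) < tol && continue          # term is dropped from the sum ...
        total += abs(pred - truth) / abs(truth)
    end
    return total / length(y)                  # ... but still counted in the denominator
end

ŷ = [1.1, 1.8, 3.0, 0.5]
y = [1.0, 2.0, 4.0, 0.0]   # the last ground truth value falls below `tol`

result = mape_sketch(ŷ, y)  # (0.1 + 0.1 + 0.25) / 4
```

Note that the dropped term lowers the result rather than being excluded from the average, because the divisor remains the full number of observations.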

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = mean absolute proportional error
source
StatisticalMeasures.MultitargetMeanAbsoluteProportionalError - Function
MultitargetMeanAbsoluteProportionalError(; tol=eps(), atomic_weights=nothing)

Return a callable measure for computing the multitarget mean absolute proportional error. Aliases: multitarget_mape.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetMeanAbsoluteProportionalError constructor (e.g., m = MultitargetMeanAbsoluteProportionalError()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of MeanAbsoluteProportionalError. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}. Alternatively, ŷ and y can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget mean absolute proportional error
source
StatisticalMeasures.LogCoshLoss - Function
LogCoshLoss()

Return a callable measure for computing the log cosh loss. Aliases: log_cosh, log_cosh_loss.

LogCoshLoss()(ŷ, y)
LogCoshLoss()(ŷ, y, weights)
LogCoshLoss()(ŷ, y, class_weights::AbstractDict)
LogCoshLoss()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate LogCoshLoss() on predictions ŷ, given ground truth observations y. Return the mean of $\log(\cosh(ŷ_i-y_i))$ over all pairs of observations $(ŷ_i, y_i)$ in (ŷ, y). More generally, pre-multiply the values averaged by the specified weights.
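For the unweighted case, the formula above amounts to the following sketch. The name log_cosh_sketch is hypothetical and the code is a direct transcription of the formula, not the library's own implementation.

```julia
# Hand-rolled sketch: mean of log(cosh(ŷᵢ - yᵢ)) over all observations.
log_cosh_sketch(ŷ, y) = sum(log.(cosh.(ŷ .- y))) / length(y)

ŷ = [0.0, 1.0, 2.5]
y = [0.0, 0.0, 2.5]

# Only the middle pair has a non-zero residual, so the result is log(cosh(1)) / 3.
loss = log_cosh_sketch(ŷ, y)
```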

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(LogCoshLoss(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = log cosh loss
source
StatisticalMeasures.MultitargetLogCoshLoss - Function
MultitargetLogCoshLoss(; atomic_weights=nothing)

Return a callable measure for computing the multitarget log cosh loss. Aliases: multitarget_mape.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the MultitargetLogCoshLoss constructor (e.g., m = MultitargetLogCoshLoss()) on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of LogCoshLoss. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}. Alternatively, ŷ and y can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget log cosh loss
source
StatisticalMeasures.RSquared - Function
RSquared()

Return a callable measure for computing the R² coefficient. Aliases: rsq, rsquared.

RSquared()(ŷ, y)

Evaluate RSquared() on predictions ŷ, given ground truth observations y. Specifically, return the value of

$1 - \frac{∑ᵢ (ŷ_i - y_i)^2}{∑ᵢ (ȳ - y_i)^2},$

where $ȳ$ denotes the mean of the $y_i$. Also known as R-squared or the coefficient of determination, the coefficient is suitable for interpreting linear regression analysis (Chicco et al., 2021).
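The coefficient of determination can be transcribed into plain Julia as below. This is a sketch under the assumption of complete (non-missing) real-valued vectors; rsq_sketch is a hypothetical name, not a library function.

```julia
# Hand-rolled sketch of R²: one minus the ratio of the residual sum of squares
# to the total sum of squares about the mean of `y`.
function rsq_sketch(ŷ, y)
    y_mean = sum(y) / length(y)
    ss_residual = sum(abs2, ŷ .- y)
    ss_total = sum(abs2, y_mean .- y)
    return 1 - ss_residual / ss_total
end

y = [1.0, 2.0, 3.0, 4.0]

perfect = rsq_sketch(y, y)               # perfect predictions
baseline = rsq_sketch(fill(2.5, 4), y)   # always predicting the mean of y
```

Perfect predictions give 1, and a constant prediction equal to the mean of y gives 0, which is why the coefficient is read as improvement over that baseline.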

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Infinite,Missing}.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = R² coefficient
source
StatisticalMeasures.ConfusionMatrix - Function
ConfusionMatrix(; levels=nothing, rev=false, perm=nothing, checks=true)

Return a callable measure for computing the confusion matrix. Aliases: confmat, confusion_matrix.

m(ŷ, y)

Evaluate some measure m returned by the ConfusionMatrix constructor (e.g., m = ConfusionMatrix()) on predictions ŷ, given ground truth observations y. See the Confusion matrix Wikipedia article.

Elements of a confusion matrix can always be accessed by level - see the example below. To flag the confusion matrix as ordered, and hence index-accessible, do one of the following:

  • Supply ordered CategoricalArray inputs ŷ and y

  • Explicitly specify levels or one of rev, perm

Note that == for two confusion matrices is stricter when both are ordered.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

For more on the type of object returned and its interface, see ConfusionMatrices.ConfusionMatrix.

Example


using StatisticalMeasures

y = ["a", "b", "a", "a", "b", "a", "a", "b", "b", "a"]
ŷ = ["b", "a", "a", "b", "a", "b", "b", "b", "a", "a"]

julia> cm = ConfusionMatrix()(ŷ, y)  # or `confmat(ŷ, y)`.

              ┌───────────────────────────┐
              │       Ground Truth        │
┌─────────────┼─────────────┬─────────────┤
│  Predicted  │      a      │      b      │
├─────────────┼─────────────┼─────────────┤
│      a      │      2      │      3      │
├─────────────┼─────────────┼─────────────┤
│      b      │      4      │      1      │
└─────────────┴─────────────┴─────────────┘

julia> cm("a", "b")
3
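To make the table above concrete, the same counts can be reproduced by hand. The sketch below is not how the library builds its confusion matrix; it tallies (predicted, ground truth) pairs in a plain Dict, using the data from the example.

```julia
y = ["a", "b", "a", "a", "b", "a", "a", "b", "b", "a"]
ŷ = ["b", "a", "a", "b", "a", "b", "b", "b", "a", "a"]

# counts[(predicted, truth)] = number of observations with that pairing
counts = Dict{Tuple{String,String},Int}()
for (pred, truth) in zip(ŷ, y)
    counts[(pred, truth)] = get(counts, (pred, truth), 0) + 1
end

counts[("a", "b")]   # predicted "a" where the ground truth is "b"
```

The lookup order (predicted first, ground truth second) mirrors the row and column layout of the printed matrix.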

Core algorithm: ConfusionMatrices.confmat.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Unoriented()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = confusion matrix
source
StatisticalMeasures.MisclassificationRate - Function
MisclassificationRate()

Return a callable measure for computing the misclassification rate. Aliases: misclassification_rate, mcr.

MisclassificationRate()(ŷ, y)
MisclassificationRate()(ŷ, y, weights)
MisclassificationRate()(ŷ, y, class_weights::AbstractDict)
MisclassificationRate()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate MisclassificationRate() on predictions ŷ, given ground truth observations y. That is, return the proportion of predictions ŷᵢ that are different from the corresponding ground truth yᵢ. More generally, average the specified weights over incorrectly identified observations. Can also be called on a confusion matrix. See ConfusionMatrix.
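In the unweighted case the definition reduces to a simple proportion, sketched here with hypothetical helper names (mcr_sketch and accuracy_sketch are illustrations, not library functions). The data is borrowed from the ConfusionMatrix example in this document.

```julia
# Proportion of predictions that differ from, or agree with, the ground truth.
mcr_sketch(ŷ, y) = count(ŷ .!= y) / length(y)
accuracy_sketch(ŷ, y) = count(ŷ .== y) / length(y)

y = ["a", "b", "a", "a", "b", "a", "a", "b", "b", "a"]
ŷ = ["b", "a", "a", "b", "a", "b", "b", "b", "a", "a"]

per_observation = ŷ .!= y      # one true/false "measurement" per observation
rate = mcr_sketch(ŷ, y)        # 7 of the 10 predictions are wrong
```

With no missing values, the misclassification rate and the accuracy sum to one.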

This metric is invariant to class reordering.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(MisclassificationRate(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

See also StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = misclassification rate
source
StatisticalMeasures.MultitargetMisclassificationRate - Function
MultitargetMisclassificationRate()

Return a callable measure for computing the multitarget misclassification rate. Aliases: multitarget_misclassification_rate, multitarget_mcr.

MultitargetMisclassificationRate()(ŷ, y)
MultitargetMisclassificationRate()(ŷ, y, weights)
MultitargetMisclassificationRate()(ŷ, y, class_weights::AbstractDict)
MultitargetMisclassificationRate()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate MultitargetMisclassificationRate() on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of MisclassificationRate. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(MultitargetMisclassificationRate(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification). Alternatively, ŷ and y can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Finite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget misclassification rate
source
StatisticalMeasures.Accuracy - Function
Accuracy()

Return a callable measure for computing the accuracy. Aliases: accuracy.

Accuracy()(ŷ, y)
Accuracy()(ŷ, y, weights)
Accuracy()(ŷ, y, class_weights::AbstractDict)
Accuracy()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate Accuracy() on predictions ŷ, given ground truth observations y. That is, compute the proportion of predictions ŷᵢ that agree with the corresponding ground truth yᵢ. More generally, average the specified weights over all correctly predicted observations. Can also be called on a confusion matrix. See ConfusionMatrix.

This metric is invariant to class reordering.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(Accuracy(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

See also ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = accuracy
source
StatisticalMeasures.MultitargetAccuracy - Function
MultitargetAccuracy()

Return a callable measure for computing the multitarget accuracy. Aliases: multitarget_accuracy.

MultitargetAccuracy()(ŷ, y)
MultitargetAccuracy()(ŷ, y, weights)
MultitargetAccuracy()(ŷ, y, class_weights::AbstractDict)
MultitargetAccuracy()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate MultitargetAccuracy() on predictions ŷ, given ground truth observations y. Specifically, compute the multi-target version of Accuracy. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The atomic_weights are weights for each component of the multi-target. Unless equal to nothing (uniform weights) the length of atomic_weights will generally match the number of columns of y, if y is a table, or the number of rows of y, if y is a matrix.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(MultitargetAccuracy(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification). Alternatively, ŷ and y can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Finite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget accuracy
source
StatisticalMeasures.BalancedAccuracy - Function
BalancedAccuracy(; adjusted=false)

Return a callable measure for computing the balanced accuracy. Aliases: balanced_accuracy, bacc, bac, probability_of_correct_classification.

m(ŷ, y)
m(ŷ, y, weights)

Evaluate some measure m returned by the BalancedAccuracy constructor (e.g., m = BalancedAccuracy()) on predictions ŷ, given ground truth observations y. This is a variation of Accuracy compensating for class imbalance. See https://en.wikipedia.org/wiki/Precision_and_recall#Imbalanced_data.

Setting adjusted=true rescales the score in the way prescribed in L. Mosley (2013): A balanced approach to the multi-class imbalance problem. PhD thesis, Iowa State University. In the binary case, the adjusted balanced accuracy is also known as Youden’s J statistic, or informedness.
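A common way to define the unadjusted, unweighted balanced accuracy is as the mean of the per-class recalls, and the sketch below assumes that definition; it is not taken from the library's source. The helper name is hypothetical. Only the binary relationship to Youden's J stated above is illustrated; the general adjusted rescaling is not reproduced here.

```julia
# Hand-rolled sketch: average, over classes, of the fraction of each class
# that is predicted correctly (per-class recall).
function balanced_accuracy_sketch(ŷ, y)
    classes = unique(y)
    recalls = [count((ŷ .== c) .& (y .== c)) / count(y .== c) for c in classes]
    return sum(recalls) / length(classes)
end

y = ["a", "b", "a", "a", "b", "a", "a", "b", "b", "a"]
ŷ = ["b", "a", "a", "b", "a", "b", "b", "b", "a", "a"]

bacc = balanced_accuracy_sketch(ŷ, y)   # (2/6 + 1/4) / 2
youden_j = 2 * bacc - 1                 # binary case: TPR + TNR - 1
```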

This metric is invariant to class reordering.

Any iterator with a length generating Real elements can be used for weights. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = balanced accuracy
source
StatisticalMeasures.Kappa - Function
Kappa()

Return a callable measure for computing Cohen's κ. Aliases: kappa.

Kappa()(ŷ, y)
Kappa()(ŷ, y, weights)

Evaluate Kappa() on predictions ŷ, given ground truth observations y. For details, see the Cohen's κ Wikipedia article. Can also be called on confusion matrices. See ConfusionMatrix.
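As a rough illustration of the unweighted statistic, the standard textbook formula for Cohen's κ can be applied to the counts from the ConfusionMatrix example in this document. This is a hand calculation on a plain integer matrix, not a call into the library.

```julia
# Counts from the ConfusionMatrix example: rows are predicted "a", "b";
# columns are ground truth "a", "b".
cm = [2 3;
      4 1]

n = sum(cm)

# Observed agreement: fraction of observations on the diagonal.
p_observed = sum(cm[i, i] for i in axes(cm, 1)) / n

# Agreement expected by chance, from the row and column totals.
p_expected = sum(sum(cm[i, :]) * sum(cm[:, i]) for i in axes(cm, 1)) / n^2

kappa_value = (p_observed - p_expected) / (1 - p_expected)
```

A negative value, as here, indicates agreement below what the marginal totals alone would produce by chance.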

This metric is invariant to class reordering.

Any iterator with a length generating Real elements can be used for weights. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

See also StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.kappa

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = Cohen's κ
source
StatisticalMeasures.MatthewsCorrelation - Function
MatthewsCorrelation()

Return a callable measure for computing the Matthews correlation. Aliases: matthews_correlation, mcc.

MatthewsCorrelation()(ŷ, y)

Evaluate MatthewsCorrelation() on predictions ŷ, given ground truth observations y. See the Wikipedia Matthews correlation page. Can also be called on confusion matrices. See ConfusionMatrix.
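For binary data, the usual closed form of the Matthews correlation can be evaluated by hand. The sketch below uses the counts from the ConfusionMatrix example in this document, taking "b" as the positive class; it covers the binary formula only and makes no claim about how the library handles the multiclass case.

```julia
# Counts with "b" as the positive class in the ConfusionMatrix example.
tp, tn, fp, fn = 1, 2, 4, 3

numerator = tp * tn - fp * fn
denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

mcc_value = numerator / denominator   # -10 / sqrt(600)
```

Swapping the roles of the two classes exchanges tp with tn and fp with fn, which leaves both the numerator and the denominator unchanged, consistent with the invariance to class reordering noted below.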

This metric is invariant to class reordering.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

See also StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.matthews_correlation

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = Matthew's correlation
source
StatisticalMeasures.FScore - Function
FScore(; beta=1.0, levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the $F_β$ score. Aliases: f1score.

m(ŷ, y)

Evaluate some measure m returned by the FScore constructor (e.g., m = FScore()) on predictions ŷ, given ground truth observations y. This is the one-parameter generalization, $F_β$, of the $F$-measure or balanced $F$-score. Choose beta=β in the range $[0,∞]$, using beta > 1 to emphasize recall (TruePositiveRate) over precision (PositivePredictiveValue). When beta = 1, the score is the harmonic mean of precision and recall. See the F1 score Wikipedia page for details.

If ordering classes (levels) on the basis of the eltype of y, then the second level is the "positive" class. To reverse roles, specify rev=true.
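The effect of beta can be seen in a small hand calculation using the standard $F_β$ formula. The helper fscore_sketch is hypothetical, and the counts are those of the ConfusionMatrix example in this document with "b" (the second level) as the positive class.

```julia
# Hand-rolled sketch of F_β from true positive, false positive and false
# negative counts. Not the library's implementation.
function fscore_sketch(tp, fp, fn; beta=1.0)
    prec = tp / (tp + fp)    # precision
    rec = tp / (tp + fn)     # recall
    return (1 + beta^2) * prec * rec / (beta^2 * prec + rec)
end

f1 = fscore_sketch(1, 4, 3)             # precision 0.2, recall 0.25
f2 = fscore_sketch(1, 4, 3; beta=2.0)   # weights recall more heavily
```

Because recall exceeds precision in this data, raising beta above 1 increases the score.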

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

FScore measures can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.fscore

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = ``F_β`` score
source
StatisticalMeasures.TruePositive - Function
TruePositive(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the true positive count. Aliases: true_positive, truepositive.

m(ŷ, y)

Evaluate some measure m returned by the TruePositive constructor (e.g., m = TruePositive()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.
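The role of the "positive" class and of rev can be illustrated with a hand-rolled count. The sketch assumes that, for plain strings, levels are ordered by sorting; that ordering rule and the helper binary_counts_sketch are assumptions made for the example, not a description of the library's internals.

```julia
# Hand-rolled sketch of the four binary counts, with the second sorted level
# treated as "positive" unless `rev=true`.
function binary_counts_sketch(ŷ, y; rev=false)
    levels = sort(unique(y))
    length(levels) == 2 || throw(ArgumentError("expected exactly two classes"))
    positive = rev ? levels[1] : levels[2]
    tp = count((ŷ .== positive) .& (y .== positive))
    tn = count((ŷ .!= positive) .& (y .!= positive))
    fp = count((ŷ .== positive) .& (y .!= positive))
    fn = count((ŷ .!= positive) .& (y .== positive))
    return (; tp, tn, fp, fn)
end

y = ["a", "b", "a", "a", "b", "a", "a", "b", "b", "a"]
ŷ = ["b", "a", "a", "b", "a", "b", "b", "b", "a", "a"]

default_counts = binary_counts_sketch(ŷ, y)             # "b" is positive
reversed_counts = binary_counts_sketch(ŷ, y; rev=true)  # "a" is positive
```

Reversing the levels swaps true positives with true negatives and false positives with false negatives.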

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassTruePositive, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.true_positive

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = true positive count
source
StatisticalMeasures.TrueNegative - Function
TrueNegative(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the true negative count. Aliases: true_negative, truenegative.

m(ŷ, y)

Evaluate some measure m returned by the TrueNegative constructor (e.g., m = TrueNegative()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassTrueNegative, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.true_negative

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = true negative count
source
StatisticalMeasures.FalsePositive - Function
FalsePositive(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the false positive count. Aliases: false_positive, falsepositive.

m(ŷ, y)

Evaluate some measure m returned by the FalsePositive constructor (e.g., m = FalsePositive()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassFalsePositive, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.false_positive

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = false positive count

StatisticalMeasures.FalseNegative - Function
FalseNegative(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the false negative count. Aliases: false_negative, falsenegative.

m(ŷ, y)

Evaluate some measure m returned by the FalseNegative constructor (e.g., m = FalseNegative()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassFalseNegative, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.false_negative

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = false negative count

StatisticalMeasures.TruePositiveRate - Function
TruePositiveRate(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the true positive rate. Aliases: true_positive_rate, truepositive_rate, tpr, sensitivity, recall, hit_rate.

m(ŷ, y)

Evaluate some measure m returned by the TruePositiveRate constructor (e.g., m = TruePositiveRate()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).
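
For orientation, the true positive rate (recall) is the usual ratio TP / (TP + FN). The plain-Julia sketch below assumes that standard definition. recall_sketch is a hypothetical name, the data are invented, and the code is not the package's implementation.

```julia
# Hypothetical helper, not the package implementation: true positive rate
# (recall) for binary data, given which class counts as "positive".
function recall_sketch(ŷ, y, positive)
    tp = count((ŷ .== positive) .& (y .== positive))  # positives that were found
    fn = count((ŷ .!= positive) .& (y .== positive))  # positives that were missed
    return tp / (tp + fn)
end

y = ["a", "b", "a", "a", "b"]   # ground truth (illustrative data)
ŷ = ["b", "b", "a", "b", "a"]   # predictions

tpr = recall_sketch(ŷ, y, "b")  # "b" plays the role of the second level
```

The denominator is zero when y contains no positive observations. How the package treats that case is not covered by this sketch.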

See also MulticlassTruePositiveRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.true_positive_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = true positive rate

StatisticalMeasures.TrueNegativeRate - Function
TrueNegativeRate(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the true negative rate. Aliases: true_negative_rate, truenegative_rate, tnr, specificity, selectivity.

m(ŷ, y)

Evaluate some measure m returned by the TrueNegativeRate constructor (e.g., m = TrueNegativeRate()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassTrueNegativeRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.true_negative_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = true negative rate

StatisticalMeasures.FalsePositiveRate - Function
FalsePositiveRate(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the false positive rate. Aliases: false_positive_rate, falsepositive_rate, fpr, fallout.

m(ŷ, y)

Evaluate some measure m returned by the FalsePositiveRate constructor (e.g., m = FalsePositiveRate()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).
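
Under the standard definitions, the false positive rate FP / (FP + TN) and the true negative rate TN / (FP + TN) share a denominator and sum to one. The sketch below illustrates this in plain Julia. negative_class_rates is a hypothetical helper on invented data, not a function of the package.

```julia
# Hypothetical helper: both rates computed over the truly negative observations.
function negative_class_rates(ŷ, y, positive)
    fp = count((ŷ .== positive) .& (y .!= positive))  # false alarms
    tn = count((ŷ .!= positive) .& (y .!= positive))  # correct rejections
    return (fpr = fp / (fp + tn), tnr = tn / (fp + tn))
end

y = ["a", "b", "a", "a", "b"]   # ground truth (illustrative data)
ŷ = ["b", "b", "a", "b", "a"]   # predictions

r = negative_class_rates(ŷ, y, "b")  # "b" is treated as "positive"
```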

See also MulticlassFalsePositiveRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.false_positive_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = false positive rate

StatisticalMeasures.FalseNegativeRate - Function
FalseNegativeRate(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the false negative rate. Aliases: false_negative_rate, falsenegative_rate, fnr, miss_rate.

m(ŷ, y)

Evaluate some measure m returned by the FalseNegativeRate constructor (e.g., m = FalseNegativeRate()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassFalseNegativeRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.false_negative_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = false negative rate

StatisticalMeasures.FalseDiscoveryRate - Function
FalseDiscoveryRate(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the false discovery rate. Aliases: false_discovery_rate, falsediscovery_rate, fdr.

m(ŷ, y)

Evaluate some measure m returned by the FalseDiscoveryRate constructor (e.g., m = FalseDiscoveryRate()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).
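
The false discovery rate is conventionally FP / (TP + FP), the complement of the positive predictive value (precision). The plain-Julia sketch below assumes those textbook definitions. discovery_sketch is a hypothetical helper on invented data and is not taken from the package.

```julia
# Hypothetical helper: both quantities are ratios over the predicted positives.
function discovery_sketch(ŷ, y, positive)
    tp = count((ŷ .== positive) .& (y .== positive))  # correct positive calls
    fp = count((ŷ .== positive) .& (y .!= positive))  # incorrect positive calls
    return (fdr = fp / (tp + fp), ppv = tp / (tp + fp))
end

y = ["a", "b", "a", "a", "b"]   # ground truth (illustrative data)
ŷ = ["b", "b", "a", "b", "a"]   # predictions

d = discovery_sketch(ŷ, y, "b")  # "b" is treated as "positive"
```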

See also MulticlassFalseDiscoveryRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.false_discovery_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = false discovery rate

StatisticalMeasures.PositivePredictiveValue - Function
PositivePredictiveValue(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the positive predictive value. Aliases: positive_predictive_value, ppv, positivepredictive_value, precision.

m(ŷ, y)

Evaluate some measure m returned by the PositivePredictiveValue constructor (e.g., m = PositivePredictiveValue()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).

See also MulticlassPositivePredictiveValue, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.positive_predictive_value

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = positive predictive value

StatisticalMeasures.NegativePredictiveValue - Function
NegativePredictiveValue(; levels=nothing, rev=nothing, checks=true)

Return a callable measure for computing the negative predictive value. Aliases: negative_predictive_value, negativepredictive_value, npv.

m(ŷ, y)

Evaluate some measure m returned by the NegativePredictiveValue constructor (e.g., m = NegativePredictiveValue()) on predictions ŷ, given ground truth observations y. When ordering classes (levels) on the basis of the eltype of y, the second level is the "positive" class. To reverse roles, specify rev=true.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. See ConfusionMatrix.

Keyword options

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing} (binary classification where definition of "positive" class matters).
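
To see why the choice of "positive" class matters here, the following plain-Julia sketch computes the negative predictive value TN / (TN + FN) under both orderings. It assumes the standard definition. npv_sketch and its keywords merely mimic the levels and rev options described above and are hypothetical, not the package's code.

```julia
# Hypothetical helper mimicking the `levels`/`rev` convention described above.
function npv_sketch(ŷ, y; levels, rev=false)
    positive = rev ? levels[1] : levels[2]
    tn = count((ŷ .!= positive) .& (y .!= positive))  # correct negative calls
    fn = count((ŷ .!= positive) .& (y .== positive))  # incorrect negative calls
    return tn / (tn + fn)
end

y = ["a", "b", "a", "a", "b"]   # ground truth (illustrative data)
ŷ = ["b", "b", "a", "b", "a"]   # predictions

npv = npv_sketch(ŷ, y; levels=["a", "b"])                # "b" is positive
npv_rev = npv_sketch(ŷ, y; levels=["a", "b"], rev=true)  # "a" is positive
```

Reversing the roles turns the negative predictive value into the positive predictive value of the original ordering, so the two calls generally disagree.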

See also MulticlassNegativePredictiveValue, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.negative_predictive_value

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = negative predictive value

StatisticalMeasures.MulticlassTruePositive - Function
MulticlassTruePositive(; levels=nothing, more_options...)

Return a callable measure for computing the multi-class true positive count. Aliases: multiclass_true_positive, multiclass_truepositive.

m(ŷ, y)

Evaluate some measure m returned by the MulticlassTruePositive constructor (e.g., m = MulticlassTruePositive()) on predictions ŷ, given ground truth observations y.

This is a one-versus-rest version of the binary measure TruePositive, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.
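
The one-versus-rest idea can be illustrated in plain Julia: each class in turn is treated as "positive" and the rest as "negative". This is a sketch under that assumption only. multiclass_tp_sketch is a hypothetical helper, and it returns a Base Dict rather than the LittleDict mentioned in the options.

```julia
# Hypothetical helper: per-class true positive counts, one class versus the rest.
function multiclass_tp_sketch(ŷ, y, levels)
    return Dict(c => count((ŷ .== c) .& (y .== c)) for c in levels)
end

y = ["a", "b", "c", "a", "c", "c"]   # ground truth (illustrative data)
ŷ = ["a", "c", "c", "b", "c", "a"]   # predictions

tp = multiclass_tp_sketch(ŷ, y, ["a", "b", "c"])
```

Class "b" is never predicted correctly in this data, so it still appears in the result, with a count of zero.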

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. Construct confusion matrices using ConfusionMatrix.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

Keyword options

  • return_type=LittleDict: type of returned measurement for average=NoAvg() case; if LittleDict, then keyed on levels of the target; can also be Vector.

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also TruePositive, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_true_positive

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class true positive count

StatisticalMeasures.MulticlassTrueNegative - Function
MulticlassTrueNegative(; levels=nothing, more_options...)

Return a callable measure for computing the multi-class true negative count. Aliases: multiclass_true_negative, multiclass_truenegative.

m(ŷ, y)

Evaluate some measure m returned by the MulticlassTrueNegative constructor (e.g., m = MulticlassTrueNegative()) on predictions ŷ, given ground truth observations y.

This is a one-versus-rest version of the binary measure TrueNegative, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. Construct confusion matrices using ConfusionMatrix.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

Keyword options

  • return_type=LittleDict: type of returned measurement for average=NoAvg() case; if LittleDict, then keyed on levels of the target; can also be Vector.

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also TrueNegative, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_true_negative

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class true negative count

StatisticalMeasures.MulticlassFalsePositive - Function
MulticlassFalsePositive(; levels=nothing, more_options...)

Return a callable measure for computing the multi-class false positive count. Aliases: multiclass_false_positive, multiclass_falsepositive.

m(ŷ, y)

Evaluate some measure m returned by the MulticlassFalsePositive constructor (e.g., m = MulticlassFalsePositive()) on predictions ŷ, given ground truth observations y.

This is a one-versus-rest version of the binary measure FalsePositive, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.
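
As a rough illustration of the vector-style result, the plain-Julia sketch below returns one false positive count per class, in the order of the supplied levels. multiclass_fp_sketch is a hypothetical helper on invented data and does not reproduce the package's implementation.

```julia
# Hypothetical helper: for each class c, count predictions of c whose ground
# truth is some other class; results follow the order of `levels`.
multiclass_fp_sketch(ŷ, y, levels) =
    [count((ŷ .== c) .& (y .!= c)) for c in levels]

y = ["a", "b", "c", "a", "c", "c"]   # ground truth (illustrative data)
ŷ = ["c", "c", "c", "b", "c", "a"]   # predictions

fp = multiclass_fp_sketch(ŷ, y, ["a", "b", "c"])
```

Every misclassified observation is a false positive for exactly one class (the predicted one), so the per-class counts add up to the total number of errors.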

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. Construct confusion matrices using ConfusionMatrix.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

Keyword options

  • return_type=LittleDict: type of returned measurement for average=NoAvg() case; if LittleDict, then keyed on levels of the target; can also be Vector.

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also FalsePositive, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_false_positive

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class false positive count

StatisticalMeasures.MulticlassFalseNegative - Function
MulticlassFalseNegative(; levels=nothing, more_options...)

Return a callable measure for computing the multi-class false negative count. Aliases: multiclass_false_negative, multiclass_falsenegative.

m(ŷ, y)

Evaluate some measure m returned by the MulticlassFalseNegative constructor (e.g., m = MulticlassFalseNegative()) on predictions ŷ, given ground truth observations y.

This is a one-versus-rest version of the binary measure FalseNegative, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

m can also be called on a confusion matrix. Construct confusion matrices using ConfusionMatrix.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

Keyword options

  • return_type=LittleDict: type of returned measurement for average=NoAvg() case; if LittleDict, then keyed on levels of the target; can also be Vector.

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also FalseNegative, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_false_negative

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class false negative count

StatisticalMeasures.MulticlassTruePositiveRate - Function
MulticlassTruePositiveRate(; average=macro_avg, levels=nothing, more_options...)

Return a callable measure for computing the multi-class true positive rate. Aliases: multiclass_true_positive_rate, multiclass_truepositive_rate, multiclass_tpr, multiclass_sensitivity, multiclass_recall, multiclass_hit_rate.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassTruePositiveRate constructor (e.g., m = MulticlassTruePositiveRate()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary TruePositiveRate. Or it can return a dictionary keyed on target class (or a vector); see average options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Finite,Missing} (multiclass classification).

Keyword options

  • average=MacroAvg(): one of: NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.

  • return_type=LittleDict: type of returned measurement for average=NoAvg() case; if LittleDict, then keyed on levels of the target; can also be Vector.

  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.

  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.
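
The difference between the three averaging modes can be sketched in plain Julia, assuming the usual textbook definitions: no averaging gives one recall per class, macro averaging takes the unweighted mean of those, and micro averaging pools the counts before dividing. multiclass_recall_sketch and its field names are hypothetical. Class weights are ignored, and the package's own handling of edge cases may differ.

```julia
# Hypothetical helper illustrating per-class, macro- and micro-averaged recall.
function multiclass_recall_sketch(ŷ, y, levels)
    tp = [count((ŷ .== c) .& (y .== c)) for c in levels]  # hits per class
    fn = [count((ŷ .!= c) .& (y .== c)) for c in levels]  # misses per class
    per_class = tp ./ (tp .+ fn)                   # no averaging
    macro_mean = sum(per_class) / length(levels)   # unweighted mean over classes
    micro_mean = sum(tp) / sum(tp .+ fn)           # pool the counts, then divide
    return (; per_class, macro_mean, micro_mean)
end

y = ["a", "b", "c", "a", "c", "c"]   # ground truth (illustrative data)
ŷ = ["a", "c", "c", "b", "c", "a"]   # predictions

r = multiclass_recall_sketch(ŷ, y, ["a", "b", "c"])
```

Macro averaging lets the poorly predicted class "b" pull the result down as much as any other class, while micro averaging weights classes by how often they occur. A class that never occurs in y would produce a division by zero in this sketch.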

See also TruePositiveRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_true_positive_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class true positive rate

StatisticalMeasures.MulticlassTrueNegativeRate - Function
MulticlassTrueNegativeRate(; average=macro_avg, levels=nothing, more_options...)

Return a callable measure for computing the multi-class true negative rate. Aliases: multiclass_true_negative_rate, multiclass_truenegative_rate, multiclass_tnr, multiclass_specificity, multiclass_selectivity.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassTrueNegativeRate constructor (e.g., m = MulticlassTrueNegativeRate()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary TrueNegativeRate. Or it can return a dictionary keyed on target class (or a vector); see average options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.
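
To make the average options above concrete, here is a minimal sketch of the one-versus-rest computation in plain Julia. It does not call the package; the helper name tnr_per_class and the toy data are hypothetical and chosen only for illustration.

```julia
# One-versus-rest true negative rate, computed by hand (no packages needed).
# `tnr_per_class` is a hypothetical helper, not part of StatisticalMeasures.jl.
function tnr_per_class(ŷ, y, levels)
    map(levels) do c
        tn = count(i -> ŷ[i] != c && y[i] != c, eachindex(y))  # true negatives for class c
        fp = count(i -> ŷ[i] == c && y[i] != c, eachindex(y))  # false positives for class c
        tn / (tn + fp)
    end
end

y = ["a", "b", "c", "a", "b", "c"]  # ground truth
ŷ = ["a", "b", "b", "a", "c", "c"]  # predictions
levels = ["a", "b", "c"]

per_class = tnr_per_class(ŷ, y, levels)        # analogue of average=NoAvg(), return_type=Vector
by_class = Dict(zip(levels, per_class))        # analogue of a dictionary keyed on levels
macro_tnr = sum(per_class) / length(per_class) # analogue of average=MacroAvg()
```

Here per_class is [1.0, 0.75, 0.75], and the macro average is their unweighted mean.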

See also TrueNegativeRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_true_negative_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class true negative rate

StatisticalMeasures.MulticlassFalsePositiveRate (Function)
MulticlassFalsePositiveRate(; average=MacroAvg(), levels=nothing, more_options...)

Return a callable measure for computing the multi-class false positive rate. Aliases: multiclass_false_positive_rate, multiclass_falsepositive_rate, multiclass_fpr, multiclass_fallout.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassFalsePositiveRate constructor (e.g., m = MulticlassFalsePositiveRate()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary FalsePositiveRate. Alternatively, it can return a dictionary keyed on target class (or a vector); see the average and return_type options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.
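
The difference between macro and micro averaging can be sketched as follows. This is a plain-Julia illustration under the usual definitions (macro averages per-class rates, micro pools the counts first); it is not the package's code, and the helper fp_tn is a hypothetical name.

```julia
# Hypothetical helper: one-versus-rest false positive and true negative counts for class c.
function fp_tn(ŷ, y, c)
    fp = count(i -> ŷ[i] == c && y[i] != c, eachindex(y))
    tn = count(i -> ŷ[i] != c && y[i] != c, eachindex(y))
    return fp, tn
end

y = ["a", "a", "a", "a", "b", "c"]   # ground truth, imbalanced
ŷ = ["a", "a", "b", "b", "b", "c"]   # predictions
counts = [fp_tn(ŷ, y, c) for c in ["a", "b", "c"]]

# Macro: average the per-class rates.
macro_fpr = sum(fp / (fp + tn) for (fp, tn) in counts) / length(counts)
# Micro: pool the counts over classes, then form a single rate.
micro_fpr = sum(first, counts) / sum(fp + tn for (fp, tn) in counts)
```

With this toy data the two averages differ (0.4/3 versus 2/12), because the pooled counts give more influence to classes with more negatives.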

See also FalsePositiveRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_false_positive_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class false positive rate

StatisticalMeasures.MulticlassFalseNegativeRate (Function)
MulticlassFalseNegativeRate(; average=MacroAvg(), levels=nothing, more_options...)

Return a callable measure for computing the multi-class false negative rate. Aliases: multiclass_false_negative_rate, multiclass_falsenegative_rate, multiclass_fnr, multiclass_miss_rate.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassFalseNegativeRate constructor (e.g., m = MulticlassFalseNegativeRate()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary FalseNegativeRate. Alternatively, it can return a dictionary keyed on target class (or a vector); see the average and return_type options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.
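
As a rough illustration of what the perm and rev options describe, the snippet below reorders a vector of levels in plain Julia. That a permutation is applied by indexing (levels[perm]) and that rev=true amounts to reversing a two-element vector are assumptions made for this sketch, not statements about the package internals.

```julia
levels = ["low", "medium", "high"]
perm = [1, 3, 2]
reordered = levels[perm]            # ["low", "high", "medium"]

binary_levels = ["no", "yes"]
reversed = reverse(binary_levels)   # ["yes", "no"], the assumed effect of rev=true
```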

See also FalseNegativeRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_false_negative_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class false negative rate

StatisticalMeasures.MulticlassFalseDiscoveryRate (Function)
MulticlassFalseDiscoveryRate(; average=MacroAvg(), levels=nothing, more_options...)

Return a callable measure for computing the multi-class false discovery rate. Aliases: multiclass_false_discovery_rate, multiclass_falsediscovery_rate, multiclass_fdr.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassFalseDiscoveryRate constructor (e.g., m = MulticlassFalseDiscoveryRate()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary FalseDiscoveryRate. Alternatively, it can return a dictionary keyed on target class (or a vector); see the average and return_type options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also FalseDiscoveryRate, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_false_discovery_rate

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class false discovery rate

StatisticalMeasures.MulticlassPositivePredictiveValue (Function)
MulticlassPositivePredictiveValue(; average=MacroAvg(), levels=nothing, more_options...)

Return a callable measure for computing the multi-class positive predictive value. Aliases: multiclass_positive_predictive_value, multiclass_ppv, multiclass_positivepredictive_value, multiclass_precision.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassPositivePredictiveValue constructor (e.g., m = MulticlassPositivePredictiveValue()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary PositivePredictiveValue. Alternatively, it can return a dictionary keyed on target class (or a vector); see the average and return_type options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.
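
To show why a confusion matrix carries enough information for this measure, the sketch below derives per-class positive predictive values from a raw count matrix in plain Julia. The matrix is a bare array rather than a ConfusionMatrix object from the package, and the orientation (rows for predicted classes, columns for ground truth) is a convention assumed for this example.

```julia
# Rows index predicted classes, columns index ground truth classes
# (a convention assumed for this sketch).
cm = [5 1 0;
      2 3 1;
      0 1 4]

# Per-class precision: correct predictions of class c over all predictions of class c.
ppv = [cm[c, c] / sum(cm[c, :]) for c in axes(cm, 1)]
fdr = 1 .- ppv                        # per-class false discovery rate
macro_ppv = sum(ppv) / length(ppv)    # unweighted mean over classes
```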

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also PositivePredictiveValue, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_positive_predictive_value

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class positive predictive value

StatisticalMeasures.MulticlassNegativePredictiveValue (Function)
MulticlassNegativePredictiveValue(; average=MacroAvg(), levels=nothing, more_options...)

Return a callable measure for computing the multi-class negative predictive value. Aliases: multiclass_negative_predictive_value, multiclass_negativepredictive_value, multiclass_npv.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassNegativePredictiveValue constructor (e.g., m = MulticlassNegativePredictiveValue()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary NegativePredictiveValue. Alternatively, it can return a dictionary keyed on target class (or a vector); see the average and return_type options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.

See also NegativePredictiveValue, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_negative_predictive_value

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class negative predictive value

StatisticalMeasures.MulticlassFScore (Function)
MulticlassFScore(; average=MacroAvg(), levels=nothing, more_options...)

Return a callable measure for computing the multi-class $F_β$ score. Aliases: macro_f1score, micro_f1score, multiclass_f1score.

m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)

Evaluate some measure m returned by the MulticlassFScore constructor (e.g., m = MulticlassFScore()) on predictions ŷ, given ground truth observations y.

This is an averaged one-versus-rest version of the binary FScore. Alternatively, it can return a dictionary keyed on target class (or a vector); see the average and return_type options below.

Method is optimized for CategoricalArray inputs with levels inferred. In that case levels will be the complete internal class pool, and not just the observed levels.

You can also call m on confusion matrices. Construct confusion matrices using ConfusionMatrix.

The keys of class_weights should include all conceivable values for observations in y, and values should be Real. Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,Finite} (general classification).

Keyword options

  • beta=1.0: parameter in the range $[0,∞]$, emphasizing recall over precision for beta > 1, except in the case average=MicroAvg(), when it has no effect.
  • average=MacroAvg(): one of NoAvg(), MacroAvg(), MicroAvg() (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", arXiv.
  • return_type=LittleDict: type of the returned measurement in the average=NoAvg() case; if LittleDict, the measurement is keyed on levels of the target; can also be Vector.
  • levels::Union{Vector,Nothing}=nothing: if nothing, levels are inferred from ŷ and y and, by default, ordered according to the element type of y.

  • rev=false: in the case of binary data, whether to reverse the levels (as inferred or specified); a nothing value is the same as false.

  • perm=nothing: in the general case, a permutation representing a re-ordering of levels (as inferred or specified); e.g., perm = [1,3,2] for data with three classes.
  • checks=true: when true, specified levels are checked to see they include all observed levels; set to false for speed.
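
The role of beta can be illustrated with the standard $F_β$ formula, $(1 + β^2) P R / (β^2 P + R)$, where P is precision and R is recall. The sketch below assumes that standard definition and is not the package's implementation; fbeta is a hypothetical name.

```julia
# Standard F_β score from a precision (ppv) and a recall (tpr).
fbeta(ppv, tpr; beta=1.0) = (1 + beta^2) * ppv * tpr / (beta^2 * ppv + tpr)

f1 = fbeta(0.5, 1.0)               # harmonic mean of precision and recall
f2 = fbeta(0.5, 1.0; beta=2.0)     # larger: the high recall now counts for more
same = fbeta(0.7, 0.7; beta=3.0)   # equals 0.7 whatever the value of beta
```

The last line suggests one way to see why beta has no effect under micro averaging: for single-label multiclass data the pooled false positive and false negative counts coincide, so micro precision equals micro recall and the formula collapses to that common value.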

See also FScore, StatisticalMeasures.ConfusionMatrices.ConfusionMatrix and ConfusionMatrix.

Core algorithm: Functions.multiclass_fscore

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class ``F_β`` score

StatisticalMeasures.AreaUnderCurve (Function)
AreaUnderCurve()

Return a callable measure for computing the area under the receiver operator characteristic. Aliases: auc, area_under_curve.

AreaUnderCurve()(ŷ, y)

Evaluate AreaUnderCurve() on predictions ŷ, given ground truth observations y. See the Receiver operating characteristic (ROC) Wikipedia article for a definition. It is expected that ŷ be a vector of distributions over the binary set of unique elements of y; specifically, ŷ should have eltype <:UnivariateFinite from the CategoricalDistributions.jl package.

Implementation is based on the Mann-Whitney U statistic. See the Mann-Whitney U test Wikipedia page for details.
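
The connection to the Mann-Whitney U statistic can be sketched in plain Julia: the area under the curve equals the fraction of (positive, negative) pairs in which the positive observation receives the higher score, with ties counted as one half. This is a naive pairwise illustration, not the package's implementation (Functions.auc); the name auc_pairs and the toy scores are made up for the example.

```julia
# AUC as the normalized Mann-Whitney U statistic, computed over all
# (positive, negative) pairs. `auc_pairs` is a hypothetical helper.
function auc_pairs(scores, is_positive)
    pos = scores[is_positive]      # scores of the positive observations
    neg = scores[.!is_positive]    # scores of the negative observations
    wins = sum((p > n) + 0.5 * (p == n) for p in pos, n in neg)
    wins / (length(pos) * length(neg))
end

scores = [0.9, 0.4, 0.35, 0.6, 0.5]   # predicted probability of the "positive" class
is_positive = [true, true, false, true, false]
auc = auc_pairs(scores, is_positive)
```

Of the six pairs here, five rank the positive observation higher, so auc is 5/6.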

Core implementation: Functions.auc.

This metric is invariant to class reordering.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:ScientificTypesBase.Binary.

See also roc_curve.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = ScientificTypesBase.Binary
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = area under the receiver operator characteritic

StatisticalMeasures.LogScore (Function)
LogScore(; tol=eps())

Return a callable measure for computing the log score. Aliases: log_score.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the LogScore constructor (e.g., m = LogScore()) on predictions ŷ, given ground truth observations y. The score is a mean of observational scores. More generally, observational scores are pre-multiplied by the specified weights before averaging. See below for the form that probabilistic predictions should take. Raw probabilities are clamped away from 0 and 1. Specifically, if p is the predicted probability mass/density function and η the corresponding ground truth observation, then the score for that example is defined as

log(clamp(p(η), tol, 1 - tol)).

For example, for a binary target with "yes"/"no" labels, if the probabilistic prediction scores 0.8 for a "yes", then for a corresponding ground truth observation of "no", that example's contribution to the score is log(0.2).
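
The clamping formula above can be tried directly in plain Julia. The function name log_score_obs is hypothetical and the sketch covers a single observation only; it is not the package's implementation.

```julia
# Per-observation log score; `p_observed` is the predicted probability
# of the value that was actually observed.
log_score_obs(p_observed; tol=eps()) = log(clamp(p_observed, tol, 1 - tol))

example = log_score_obs(0.2)   # "yes" predicted with 0.8, ground truth "no"
worst = log_score_obs(0.0)     # log(eps()), finite rather than -Inf
best = log_score_obs(1.0)      # log(1 - eps()), marginally below zero
```

Clamping keeps a single confident but wrong prediction from driving the mean score to -Inf.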

The predictions ŷ should be a vector of UnivariateFinite distributions from CategoricalDistributions.jl, in the case of Finite target y (a CategoricalVector), and should otherwise be a supported Distributions.UnivariateDistribution such as Normal or Poisson.

See also LogLoss, which differs only in sign.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.
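
The effect of the weights and class_weights arguments described above can be sketched in plain Julia. The sketch assumes that each observational score is multiplied by its observation weight and by the class weight of its ground truth value, and that the mean divides by the number of observations rather than by the total weight; both points are assumptions made for illustration and should be checked against the StatisticalMeasuresBase.jl documentation.

```julia
scores = [log(0.8), log(0.2), log(0.6)]   # per-observation log scores
y = ["yes", "no", "yes"]                  # ground truth
weights = [1.0, 2.0, 0.5]
class_weights = Dict("yes" => 1.0, "no" => 3.0)

# Scale each score by its observation weight and the weight of its ground truth class.
weighted = [weights[i] * class_weights[y[i]] * scores[i] for i in eachindex(scores)]
# Assumption: divide by the number of observations, not by the sum of the weights.
aggregate = sum(weighted) / length(weighted)
```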

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,T}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ).

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = log score

StatisticalMeasures.LogLoss (Function)
LogLoss(; tol=eps())

Return a callable measure for computing the log loss. Aliases: log_loss, cross_entropy.

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)

Evaluate some measure m returned by the LogLoss constructor (e.g., m = LogLoss()) on predictions ŷ, given ground truth observations y. For details, see LogScore, which differs only by a sign.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(m, ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,T}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ).

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = log loss

StatisticalMeasures.BrierScore (Function)
BrierScore()

Return a callable measure for computing the Brier score. Aliases: brier_score, quadratic_score.

BrierScore()(ŷ, y)
BrierScore()(ŷ, y, weights)
BrierScore()(ŷ, y, class_weights::AbstractDict)
BrierScore()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate BrierScore() on predictions ŷ, given ground truth observations y. The score is a mean of observational scores. More generally, observational scores are pre-multiplied by the specified weights before averaging. See below for the form that probabilistic predictions should take.

Convention as in Gneiting and Raftery (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation".

Finite case. If p(η) is the predicted probability for a single observation η, and C all possible classes, then the corresponding score for that example is given by

$2p(η) - \left(\sum_{c ∈ C} p(c)^2\right) - 1$

Warning. BrierScore() is a "score" in the sense that bigger is better (with 0 optimal, and all other values negative). In Brier's original 1950 paper, and many other places, it has the opposite sign, despite the name. Moreover, the present implementation does not treat the binary case as special, so that the score may differ in the binary case by a factor of two from usage elsewhere.
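
The finite-case formula and the sign and scaling conventions in the warning can be checked with a few lines of plain Julia. The function brier_score_obs and the Dict representation of a prediction are hypothetical stand-ins used only for this sketch; the package itself works with UnivariateFinite distributions.

```julia
# Brier score for one observation, finite case: 2p(η) - Σ_c p(c)^2 - 1.
# `probs` maps each class to its predicted probability.
brier_score_obs(probs, η) = 2 * probs[η] - sum(abs2, values(probs)) - 1

probs = Dict("yes" => 0.8, "no" => 0.2)
hit = brier_score_obs(probs, "yes")    # 2(0.8) - (0.64 + 0.04) - 1, about -0.08
miss = brier_score_obs(probs, "no")    # 2(0.2) - 0.68 - 1, about -1.28
perfect = brier_score_obs(Dict("yes" => 1.0, "no" => 0.0), "yes")  # the optimum, 0.0
```

For comparison, the squared error often quoted as the binary Brier loss for the first case is (0.8 - 1)^2 = 0.04, which is half the magnitude of hit and of opposite sign, as the warning describes.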

Infinite case. Replacing the sum above with an integral does not lead to the formula adopted here in the case of Continuous or Count target y. Rather the convention in the paper cited above is adopted, which means returning a score of

$2p(η) - ∫ p(t)^2 dt$

in the Continuous case (p the probability density function) or

$2p(η) - ∑_t p(t)^2$

in the Count case (p the probability mass function).
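
Both infinite-case formulas can be evaluated by hand for simple distributions. The sketch below, in plain Julia without Distributions.jl, uses the closed form $∫ p(t)^2 dt = 1/(2σ\sqrt{π})$ for a normal density and a truncated sum for a Poisson mass function. The function names and the truncation point tmax are choices made for this illustration only.

```julia
# Continuous case: normal density with mean μ and standard deviation σ.
normal_pdf(μ, σ, t) = exp(-((t - μ) / σ)^2 / 2) / (σ * sqrt(2π))
# The integral of the squared normal density is 1 / (2σ√π).
brier_normal(μ, σ, η) = 2 * normal_pdf(μ, σ, η) - 1 / (2σ * sqrt(π))

# Count case: Poisson mass function with rate λ.
poisson_pmf(λ, t) = λ^t * exp(-λ) / factorial(t)
# The infinite sum is truncated at `tmax`, adequate for small λ
# (factorial overflows Int64 beyond 20).
brier_poisson(λ, η; tmax=20) =
    2 * poisson_pmf(λ, η) - sum(poisson_pmf(λ, t)^2 for t in 0:tmax)

at_mean = brier_normal(0.0, 1.0, 0.0)   # observation at the mean
in_tail = brier_normal(0.0, 1.0, 3.0)   # observation three standard deviations out
at_mode = brier_poisson(2.0, 2)
far_out = brier_poisson(2.0, 8)
```

Observations where the predicted density or mass is high score better than those in the tails, and scores here can be positive because there is no trailing -1 term as in the finite case.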

The predictions ŷ should be a vector of UnivariateFinite distributions from CategoricalDistributions.jl, in the case of Finite target y (a CategoricalVector), and should otherwise be a supported Distributions.UnivariateDistribution such as Normal or Poisson.

See also BrierLoss, which differs only in sign.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(BrierScore(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,T}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ).

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = brier score

StatisticalMeasures.BrierLoss (Function)
BrierLoss()

Return a callable measure for computing the Brier loss. Aliases: brier_loss, quadratic_loss.

BrierLoss()(ŷ, y)
BrierLoss()(ŷ, y, weights)
BrierLoss()(ŷ, y, class_weights::AbstractDict)
BrierLoss()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate BrierLoss() on predictions ŷ, given ground truth observations y. For details, see BrierScore, which differs only by a sign.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(BrierLoss(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,T}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ).

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = brier loss

StatisticalMeasures.SphericalScore (Function)
SphericalScore()

Return a callable measure for computing the spherical score. Aliases: spherical_score.

SphericalScore()(ŷ, y)
SphericalScore()(ŷ, y, weights)
SphericalScore()(ŷ, y, class_weights::AbstractDict)
SphericalScore()(ŷ, y, weights, class_weights::AbstractDict)

Evaluate SphericalScore() on predictions ŷ, given ground truth observations y. The score is a mean of observational scores. More generally, observational scores are pre-multiplied by the specified weights before averaging. See below for the form that probabilistic predictions should take.

Convention as in Gneiting and Raftery (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation": if y takes on a finite number of classes C and p(y) is the predicted probability for a single observation y, then the corresponding score for that example is given by

$p(y)^α / \left(\sum_{η ∈ C} p(η)^α\right)^{1-α} - 1$

where α is the measure parameter alpha.

In the case that the predictions ŷ are continuous probability distributions, such as Distributions.Normal, replace the above sum with an integral, and interpret p as the probability density function. In the case of discrete distributions over the integers, such as Distributions.Poisson, sum over all integers instead of C.

Any iterator with a length generating Real elements can be used for weights. The keys of class_weights should include all conceivable values for observations in y, and values should be Real.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax measurements(SphericalScore(), ŷ, y). Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs)<:Union{Missing,T}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ).

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = spherical score