# The Measures

### Quick links

- List of aliases
- Classification measures (non-probabilistic)
- Regression measures (non-probabilistic)
- Probabilistic measures

## Scientific type of observations

Measures can be classified according to the scientific type of the target observations they consume, given by the value of the trait `StatisticalMeasuresBase.observation_scitype(measure)`:

observation scitype | meaning |
---|---|
`Finite` | general classification |
`Finite{2}=Binary` | binary classification |
`OrderedFactor` | classification (class order matters) |
`OrderedFactor{2}` | binary classification (order matters) |
`Continuous` | regression |
`Infinite` | regression, including integer targets for `Count` data |
`AbstractArray{T}` | multitarget version of `T`; some tabular data okay |

Measures are not strict about data conforming to the declared observation scitype. For example, where `OrderedFactor{2}` is expected, `Finite{2}` will work, and in fact most eltypes will work, so long as there are only two classes. However, you may get warnings that mitigate possible misinterpretations of results (e.g., about which class is the "positive" one). Some warnings can be suppressed by explicitly specifying measure parameters, such as `levels`.

To be 100% safe and avoid warnings, use data with the recommended observation scitype.
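As a sketch of the recommended usage (assuming CategoricalArrays.jl is available and that the `true_positive_rate` alias is in scope): ordered categorical data matches the `OrderedFactor{2}` scitype, so no warning about the "positive" class arises.

```julia
using StatisticalMeasures, CategoricalArrays

# ordered categorical vectors have scitype OrderedFactor{2}; by convention
# the last level in the ordering ("yes" here) is treated as "positive"
y = categorical(["no", "yes", "yes", "no"], ordered=true)
ŷ = categorical(["no", "yes", "no", "no"], ordered=true, levels=levels(y))

true_positive_rate(ŷ, y)  # 1 of 2 positives recovered: 0.5
```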

## On multi-target measures and tabular data

All multi-target measures below (the ones with `AbstractArray` observation scitypes) also handle some forms of tabular input, including `DataFrame`s and Julia's native "row table" and "column table" formats. This is not reflected in the declared observation scitype. Instead, inspect the trait `StatisticalMeasuresBase.can_consume_tables` or consult the measure document string.
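For example (a minimal sketch, using the trait values reported in the docstrings below):

```julia
using StatisticalMeasures
import StatisticalMeasuresBase

# multi-target measures advertise table support via a trait:
StatisticalMeasuresBase.can_consume_tables(MultitargetLPLoss())  # true
StatisticalMeasuresBase.can_consume_tables(LPLoss())             # false
```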

## Classification measures (non-probabilistic)

constructor / instance aliases | observation scitype |
---|---|
`FScore` | `Union{Missing, OrderedFactor{2}}` |
`FalseDiscoveryRate` | `Union{Missing, OrderedFactor{2}}` |
`FalseNegative` | `Union{Missing, OrderedFactor{2}}` |
`FalseNegativeRate` | `Union{Missing, OrderedFactor{2}}` |
`FalsePositive` | `Union{Missing, OrderedFactor{2}}` |
`FalsePositiveRate` | `Union{Missing, OrderedFactor{2}}` |
`NegativePredictiveValue` | `Union{Missing, OrderedFactor{2}}` |
`PositivePredictiveValue` | `Union{Missing, OrderedFactor{2}}` |
`TrueNegative` | `Union{Missing, OrderedFactor{2}}` |
`TrueNegativeRate` | `Union{Missing, OrderedFactor{2}}` |
`TruePositive` | `Union{Missing, OrderedFactor{2}}` |
`TruePositiveRate` | `Union{Missing, OrderedFactor{2}}` |
`Accuracy` | `Union{Missing, Finite}` |
`BalancedAccuracy` | `Union{Missing, Finite}` |
`ConfusionMatrix` | `Union{Missing, Finite}` |
`Kappa` | `Union{Missing, Finite}` |
`MatthewsCorrelation` | `Union{Missing, Finite}` |
`MisclassificationRate` | `Union{Missing, Finite}` |
`MulticlassFScore` | `Union{Missing, Finite}` |
`MulticlassFalseDiscoveryRate` | `Union{Missing, Finite}` |
`MulticlassFalseNegative` | `Union{Missing, Finite}` |
`MulticlassFalseNegativeRate` | `Union{Missing, Finite}` |
`MulticlassFalsePositive` | `Union{Missing, Finite}` |
`MulticlassFalsePositiveRate` | `Union{Missing, Finite}` |
`MulticlassNegativePredictiveValue` | `Union{Missing, Finite}` |
`MulticlassPositivePredictiveValue` | `Union{Missing, Finite}` |
`MulticlassTrueNegative` | `Union{Missing, Finite}` |
`MulticlassTrueNegativeRate` | `Union{Missing, Finite}` |
`MulticlassTruePositive` | `Union{Missing, Finite}` |
`MulticlassTruePositiveRate` | `Union{Missing, Finite}` |
`MultitargetAccuracy` | `AbstractArray{<:Union{Missing, Finite}}` |
`MultitargetMisclassificationRate` | `AbstractArray{<:Union{Missing, Finite}}` |

## Regression measures (non-probabilistic)

constructor / instance aliases | observation scitype |
---|---|
`LPLoss` | `Union{Missing, Infinite}` |
`LPSumLoss` | `Union{Missing, Infinite}` |
`LogCoshLoss` | `Union{Missing, Infinite}` |
`MeanAbsoluteProportionalError` | `Union{Missing, Infinite}` |
`RSquared` | `Union{Missing, Infinite}` |
`RootMeanSquaredError` | `Union{Missing, Infinite}` |
`RootMeanSquaredLogError` | `Union{Missing, Infinite}` |
`RootMeanSquaredLogProportionalError` | `Union{Missing, Infinite}` |
`RootMeanSquaredProportionalError` | `Union{Missing, Infinite}` |
`MultitargetLPLoss` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetLPSumLoss` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetLogCoshLoss` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetMeanAbsoluteProportionalError` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetRootMeanSquaredError` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetRootMeanSquaredLogError` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetRootMeanSquaredLogProportionalError` | `AbstractArray{<:Union{Missing, Infinite}}` |
`MultitargetRootMeanSquaredProportionalError` | `AbstractArray{<:Union{Missing, Infinite}}` |

## Probabilistic measures

These are measures where each prediction is a probability mass or density function over the space of possible ground truth observations. Specifically, `StatisticalMeasuresBase.kind_of_proxy(measure) == LearnAPI.Distribution()`.

constructor / instance aliases | observation scitype |
---|---|
`BrierLoss` | `Union{Missing, Infinite, Finite}` |
`BrierScore` | `Union{Missing, Infinite, Finite}` |
`LogLoss` | `Union{Missing, Infinite, Finite}` |
`LogScore` | `Union{Missing, Infinite, Finite}` |
`SphericalScore` | `Union{Missing, Infinite, Finite}` |
`AreaUnderCurve` | `Binary` |

## List of aliases

Some of the measures constructed using specific parameter values have pre-defined names associated with them, which are exported by StatisticalMeasures.jl. These are called *aliases*.

## Reference

`StatisticalMeasures.LPLoss` — Function

`LPLoss(; p=2)`

Return a callable measure for computing the $L^p$ loss. Aliases: `l1`, `l2`, `mae`, `mav`, `mean_absolute_error`, `mean_absolute_value`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `LPLoss` constructor (e.g., `m = LPLoss()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, return the mean of $|ŷ_i - y_i|^p$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`, or more generally, the mean of weighted versions of those values. For the weighted *sum* use `LPSumLoss` instead.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
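By way of illustration, a small worked example (the `l2` alias and `measurements` are as documented above):

```julia
using StatisticalMeasures

ŷ = [1.0, 2.0, 4.0]
y = [1.0, 3.0, 2.0]

LPLoss(p=2)(ŷ, y)      # mean of |ŷᵢ - yᵢ|²: (0 + 1 + 4)/3 ≈ 1.667
l2(ŷ, y)               # the same measure, via an alias

measurements(l2, ŷ, y) # per-observation values: [0.0, 1.0, 4.0]
```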

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = ``L^p`` loss
```

`StatisticalMeasures.MultitargetLPLoss` — Function

`MultitargetLPLoss(; p=2, atomic_weights=nothing)`

Return a callable measure for computing the multitarget $L^p$ loss. Aliases: `multitarget_l1`, `multitarget_l2`, `multitarget_mae`, `multitarget_mav`, `multitarget_mean_absolute_error`, `multitarget_mean_absolute_value`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetLPLoss` constructor (e.g., `m = MultitargetLPLoss()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `LPLoss`. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: AbstractArray{<:Union{Missing,Infinite}}`. Alternatively, `y` and `ŷ` can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
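As a sketch with matrix input (the exact aggregation across components is as described in the docstring above; here all absolute deviations equal one, so the unweighted result is 1.0 regardless of normalization):

```julia
using StatisticalMeasures

# for matrices, the last dimension indexes observations:
# 2 target components (rows) × 3 observations (columns)
y = [1.0 2.0 3.0;
     4.0 5.0 6.0]
ŷ = y .+ 1   # every prediction off by exactly 1

m = MultitargetLPLoss(p=1)
m(ŷ, y)      # 1.0

# weight the two target components unequally
# (length of atomic_weights matches the number of rows of y):
mw = MultitargetLPLoss(p=1, atomic_weights=[1.0, 2.0])
mw(ŷ, y)
```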

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget ``L^p`` loss
```

`StatisticalMeasures.LPSumLoss` — Function

`LPSumLoss(; p=2)`

Return a callable measure for computing the $L^p$ sum loss. Aliases: `l1_sum`, `l2_sum`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `LPSumLoss` constructor (e.g., `m = LPSumLoss()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the (weighted) sum of $|ŷ_i - y_i|^p$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`. For the weighted *mean* use `LPLoss` instead.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
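A sketch of the sum/mean relationship between `LPSumLoss` and `LPLoss`:

```julia
using StatisticalMeasures

ŷ = [1.0, 2.0, 4.0]
y = [1.0, 3.0, 2.0]

l2_sum(ŷ, y)  # sum of |ŷᵢ - yᵢ|²: 0 + 1 + 4 = 5.0
l2(ŷ, y)      # the mean instead: 5/3

l2_sum(ŷ, y) ≈ length(y) * l2(ŷ, y)  # true
```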

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = ``L^p`` sum loss
```

`StatisticalMeasures.MultitargetLPSumLoss` — Function

`MultitargetLPSumLoss(; p=2, atomic_weights=nothing)`

Return a callable measure for computing the multitarget $L^p$ sum loss. Aliases: `multitarget_l1_sum`, `multitarget_l2_sum`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetLPSumLoss` constructor (e.g., `m = MultitargetLPSumLoss()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `LPSumLoss`. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: AbstractArray{<:Union{Missing,Infinite}}`. Alternatively, `y` and `ŷ` can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multitarget ``L^p`` sum loss
```

`StatisticalMeasures.RootMeanSquaredError` — Function

`RootMeanSquaredError()`

Return a callable measure for computing the root mean squared error. Aliases: `rms`, `rmse`, `root_mean_squared_error`.

```
RootMeanSquaredError()(ŷ, y)
RootMeanSquaredError()(ŷ, y, weights)
RootMeanSquaredError()(ŷ, y, class_weights::AbstractDict)
RootMeanSquaredError()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `RootMeanSquaredError()` on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the mean of $|y_i - ŷ_i|^2$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`, and return the square root of the result. More generally, pre-multiply the squared deviations by the specified weights.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(RootMeanSquaredError(), ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
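A small worked example, using the `rms` alias listed above:

```julia
using StatisticalMeasures

ŷ = [2.0, 4.0]
y = [1.0, 2.0]

rms(ŷ, y)  # sqrt((1² + 2²)/2) = sqrt(2.5) ≈ 1.581
```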

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared error
```

`StatisticalMeasures.MultitargetRootMeanSquaredError` — Function

`MultitargetRootMeanSquaredError(; atomic_weights=nothing)`

Return a callable measure for computing the multitarget root mean squared error. Aliases: `multitarget_rms`, `multitarget_rmse`, `multitarget_root_mean_squared_error`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetRootMeanSquaredError` constructor (e.g., `m = MultitargetRootMeanSquaredError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `RootMeanSquaredError`. Some kinds of tabular input are supported.

In array arguments the last dimension is understood to be the observation dimension. The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: AbstractArray{<:Union{Missing,Infinite}}`. Alternatively, `y` and `ŷ` can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared error
```

`StatisticalMeasures.RootMeanSquaredLogError` — Function

`RootMeanSquaredLogError()`

Return a callable measure for computing the root mean squared log error. Aliases: `rmsl`, `rmsle`, `root_mean_squared_log_error`.

```
RootMeanSquaredLogError()(ŷ, y)
RootMeanSquaredLogError()(ŷ, y, weights)
RootMeanSquaredLogError()(ŷ, y, class_weights::AbstractDict)
RootMeanSquaredLogError()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `RootMeanSquaredLogError()` on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the mean of $(\log(y_i) - \log(ŷ_i))^2$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`, and return the square root of the result. More generally, pre-multiply the values averaged by the specified weights. To include an offset, use `RootMeanSquaredLogProportionalError` instead.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(RootMeanSquaredLogError(), ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
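A small worked example, using the `rmsle` alias listed above (predictions off by a constant factor give a constant log-difference):

```julia
using StatisticalMeasures

y = [1.0, 10.0]
ŷ = 2 .* y   # each prediction off by a factor of 2

rmsle(ŷ, y)  # every log-difference is log(2), so the result is log(2) ≈ 0.693
```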

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared log error
```

`StatisticalMeasures.MultitargetRootMeanSquaredLogError` — Function

`MultitargetRootMeanSquaredLogError(; atomic_weights=nothing)`

Return a callable measure for computing the multitarget root mean squared log error. Aliases: `multitarget_rmsl`, `multitarget_rmsle`, `multitarget_root_mean_squared_log_error`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetRootMeanSquaredLogError` constructor (e.g., `m = MultitargetRootMeanSquaredLogError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `RootMeanSquaredLogError`. Some kinds of tabular input are supported.

The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: AbstractArray{<:Union{Missing,Infinite}}`. Alternatively, `y` and `ŷ` can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared log error
```

`StatisticalMeasures.RootMeanSquaredLogProportionalError` — Function

`RootMeanSquaredLogProportionalError(; offset=1)`

Return a callable measure for computing the root mean squared log proportional error. Aliases: `rmslp1`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `RootMeanSquaredLogProportionalError` constructor (e.g., `m = RootMeanSquaredLogProportionalError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the mean of $(\log(ŷ_i + δ) - \log(y_i + δ))^2$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`, and return the square root. More generally, pre-multiply the values averaged by the specified weights. Here $δ$ = `offset`, which is `1` by default. This is the same as `RootMeanSquaredLogError` but adds an offset.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
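A small worked example showing the effect of the default offset:

```julia
using StatisticalMeasures

m = RootMeanSquaredLogProportionalError()  # offset δ = 1

# the offset makes zero targets admissible, unlike RootMeanSquaredLogError;
# here both log-differences equal log(2):
m([1.0, 3.0], [0.0, 1.0])  # log(2) ≈ 0.693
```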

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared log proportional error
```

`StatisticalMeasures.MultitargetRootMeanSquaredLogProportionalError` — Function

`MultitargetRootMeanSquaredLogProportionalError(; offset=1, atomic_weights=nothing)`

Return a callable measure for computing the multitarget root mean squared log proportional error. Aliases: `multitarget_rmslp1`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetRootMeanSquaredLogProportionalError` constructor (e.g., `m = MultitargetRootMeanSquaredLogProportionalError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `RootMeanSquaredLogProportionalError`. Some kinds of tabular input are supported.

The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: AbstractArray{<:Union{Missing,Infinite}}`. Alternatively, `y` and `ŷ` can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared log proportional error
```

`StatisticalMeasures.RootMeanSquaredProportionalError` — Function

`RootMeanSquaredProportionalError(; tol=eps())`

Return a callable measure for computing the root mean squared proportional error. Aliases: `rmsp`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `RootMeanSquaredProportionalError` constructor (e.g., `m = RootMeanSquaredProportionalError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the mean of $((ŷ_i - y_i)/y_i)^2$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`, and return the square root of the result. More generally, pre-multiply the values averaged by the specified weights. Terms for which `abs(yᵢ) < tol` are dropped in the summation, but their counts still contribute to the mean normalization factor.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
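A small worked example, using the `rmsp` alias listed above:

```julia
using StatisticalMeasures

ŷ = [2.0, 6.0]
y = [1.0, 4.0]

rmsp(ŷ, y)  # sqrt(((1/1)² + (2/4)²)/2) = sqrt(0.625) ≈ 0.791
```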

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = root mean squared proportional error
```

`StatisticalMeasures.MultitargetRootMeanSquaredProportionalError` — Function

`MultitargetRootMeanSquaredProportionalError(; tol=eps(), atomic_weights=nothing)`

Return a callable measure for computing the multitarget root mean squared proportional error. Aliases: `multitarget_rmsp`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetRootMeanSquaredProportionalError` constructor (e.g., `m = MultitargetRootMeanSquaredProportionalError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `RootMeanSquaredProportionalError`. Some kinds of tabular input are supported.

The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: AbstractArray{<:Union{Missing,Infinite}}`. Alternatively, `y` and `ŷ` can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.RootMean{Int64}(2)
human_name = multitarget root mean squared proportional error
```

`StatisticalMeasures.MeanAbsoluteProportionalError` — Function

`MeanAbsoluteProportionalError(; tol=eps())`

Return a callable measure for computing the mean absolute proportional error. Aliases: `mape`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MeanAbsoluteProportionalError` constructor (e.g., `m = MeanAbsoluteProportionalError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, return the mean of $|ŷ_i - y_i|/|y_i|$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`. More generally, pre-multiply the values averaged by the specified weights. Terms for which $|y_i| <$ `tol` are dropped in the summation, but corresponding weights (or counts) still contribute to the mean normalization factor.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs) <: Union{Infinite,Missing}`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.
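A small worked example, using the `mape` alias listed above:

```julia
using StatisticalMeasures

ŷ = [2.0, 3.0]
y = [1.0, 4.0]

mape(ŷ, y)  # mean of |ŷᵢ - yᵢ|/|yᵢ|: (1/1 + 1/4)/2 = 0.625
```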

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = mean absolute proportional error
```

`StatisticalMeasures.MultitargetMeanAbsoluteProportionalError` — Function

`MultitargetMeanAbsoluteProportionalError(; tol=eps(), atomic_weights=nothing)`

Return a callable measure for computing the multitarget mean absolute proportional error. Aliases: `multitarget_mape`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MultitargetMeanAbsoluteProportionalError` constructor (e.g., `m = MultitargetMeanAbsoluteProportionalError()`) on predictions `ŷ`, given ground truth observations `y`. Specifically, compute the multi-target version of `MeanAbsoluteProportionalError`. Some kinds of tabular input are supported.

The `atomic_weights` are weights for each component of the multi-target. Unless equal to `nothing` (uniform weights), the length of `atomic_weights` will generally match the number of columns of `y`, if `y` is a table, or the number of rows of `y`, if `y` is a matrix.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Infinite,Missing}`

. Alternatively, `y`

and `ŷ`

can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget mean absolute proportional error
```

`StatisticalMeasures.LogCoshLoss`

— Function`LogCoshLoss()`

Return a callable measure for computing the log cosh loss. Aliases: `log_cosh`

, `log_cosh_loss`

.

```
LogCoshLoss()(ŷ, y)
LogCoshLoss()(ŷ, y, weights)
LogCoshLoss()(ŷ, y, class_weights::AbstractDict)
LogCoshLoss()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `LogCoshLoss()`

on predictions `ŷ`

, given ground truth observations `y`

. Return the mean of $\log(\cosh(ŷ_i-y_i))$ over all pairs of observations $(ŷ_i, y_i)$ in `(ŷ, y)`

. More generally, pre-multiply the values averaged by the specified weights.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(LogCoshLoss(), ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Infinite,Missing}`

.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = log cosh loss
```
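The unweighted formula above is simple enough to sketch directly (for illustration only; `log_cosh_sketch` is a name invented here, not the package's implementation):

```julia
# Log cosh loss: mean of log(cosh(ŷᵢ - yᵢ)) over all observation pairs.
log_cosh_sketch(ŷ, y) = sum(log ∘ cosh, ŷ .- y) / length(y)

log_cosh_sketch([1.0, 2.0], [1.0, 2.0])  # perfect predictions give 0.0
```

Like the squared error, the loss is approximately quadratic near zero, but it grows only linearly for large residuals, making it less sensitive to outliers.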

`StatisticalMeasures.MultitargetLogCoshLoss`

— Function`MultitargetLogCoshLoss(; atomic_weights=nothing)`

Return a callable measure for computing the multitarget log cosh loss. Aliases: `multitarget_log_cosh`

.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m`

returned by the `MultitargetLogCoshLoss`

constructor (e.g., `m = MultitargetLogCoshLoss()`

) on predictions `ŷ`

, given ground truth observations `y`

. Specifically, compute the multi-target version of `LogCoshLoss`

. Some kinds of tabular input are supported.

`atomic_weights`

are weights for each component of the multi-target. Unless equal to `nothing`

(uniform weights) the length of `atomic_weights`

will generally match the number of columns of `y`

, if `y`

is a table, or the number of rows of `y`

, if `y`

is a matrix.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Infinite,Missing}`

. Alternatively, `y`

and `ŷ`

can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Infinite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget log cosh loss
```

`StatisticalMeasures.RSquared`

— Function`RSquared()`

Return a callable measure for computing the R² coefficient. Aliases: `rsq`

, `rsquared`

.

`RSquared()(ŷ, y)`

Evaluate `RSquared()`

on predictions `ŷ`

, given ground truth observations `y`

. Specifically, return the value of

$1 - \frac{∑ᵢ (ŷ_i - y_i)^2}{∑ᵢ (ȳ - y_i)^2},$

where $ȳ$ denotes the mean of the $y_i$. Also known as R-squared or the coefficient of determination, the `R²` coefficient is suitable for interpreting linear regression analysis (Chicco et al., 2021).

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Infinite,Missing}`

.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = R² coefficient
```
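The formula above amounts to one minus the ratio of residual to total sum of squares. A sketch (illustration only; `rsq_sketch` is a name invented here, not the package's implementation):

```julia
# R² = 1 - (residual sum of squares)/(total sum of squares).
function rsq_sketch(ŷ, y)
    ȳ = sum(y) / length(y)
    return 1 - sum(abs2, ŷ .- y) / sum(abs2, ȳ .- y)
end

rsq_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # perfect fit: 1.0
```

Note that predicting the constant `ȳ` scores exactly zero, and a model can score negative if it does worse than that baseline.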

`StatisticalMeasures.ConfusionMatrix`

— Function`ConfusionMatrix(; levels=nothing, rev=false, perm=nothing, checks=true)`

Return a callable measure for computing the confusion matrix. Aliases: `confmat`

, `confusion_matrix`

.

`m(ŷ, y)`

Evaluate some measure `m`

returned by the `ConfusionMatrix`

constructor (e.g., `m = ConfusionMatrix()`

) on predictions `ŷ`

, given ground truth observations `y`

. See the *Confusion matrix* wikipedia article.

Elements of a confusion matrix can always be accessed by level - see the example below. To flag the confusion matrix as ordered, and hence index-accessible, do one of the following:

- Supply ordered `CategoricalArray` inputs `ŷ` and `y`
- Explicitly specify `levels` or one of `rev`, `perm`

Note that `==` for two confusion matrices is stricter when both are ordered.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification).

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.
- `checks=true`: when true, specified `levels` are checked to see they include all observed levels; set to `false` for speed.

Method is optimized for `CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

For more on the type of object returned and its interface, see `ConfusionMatrices.ConfusionMatrix`

.

**Example**

```
using StatisticalMeasures
y = ["a", "b", "a", "a", "b", "a", "a", "b", "b", "a"]
ŷ = ["b", "a", "a", "b", "a", "b", "b", "b", "a", "a"]
julia> cm = ConfusionMatrix()(ŷ, y) # or `confmat(ŷ, y)`.
┌───────────────────────────┐
│ Ground Truth │
┌─────────────┼─────────────┬─────────────┤
│ Predicted │ a │ b │
├─────────────┼─────────────┼─────────────┤
│ a │ 2 │ 3 │
├─────────────┼─────────────┼─────────────┤
│ b │ 4 │ 1 │
└─────────────┴─────────────┴─────────────┘
julia> cm("a", "b")
3
```

Core algorithm: `ConfusionMatrices.confmat`

.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Unoriented()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = confusion matrix
```

`StatisticalMeasures.MisclassificationRate`

— Function`MisclassificationRate()`

Return a callable measure for computing the misclassification rate. Aliases: `misclassification_rate`

, `mcr`

.

```
MisclassificationRate()(ŷ, y)
MisclassificationRate()(ŷ, y, weights)
MisclassificationRate()(ŷ, y, class_weights::AbstractDict)
MisclassificationRate()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `MisclassificationRate()`

on predictions `ŷ`

, given ground truth observations `y`

. That is, return the proportion of predictions `ŷᵢ`

that are different from the corresponding ground truth `yᵢ`

. More generally, average the specified weights over incorrectly identified observations. Can also be called on a confusion matrix. See `ConfusionMatrix`

.

This metric is invariant to class reordering.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(MisclassificationRate(), ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification).

See also `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = misclassification rate
```
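In the unweighted case, the definition above reduces to one line (illustration only; `mcr_sketch` is a name invented here, not the package's implementation):

```julia
# Misclassification rate: proportion of predictions ŷᵢ ≠ yᵢ.
mcr_sketch(ŷ, y) = sum(ŷ .!= y) / length(y)

mcr_sketch(["a", "b", "b"], ["a", "a", "b"])  # one of three wrong
```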

`StatisticalMeasures.MultitargetMisclassificationRate`

— Function`MultitargetMisclassificationRate()`

Return a callable measure for computing the multitarget misclassification rate. Aliases: `multitarget_misclassification_rate`

, `multitarget_mcr`

.

```
MultitargetMisclassificationRate()(ŷ, y)
MultitargetMisclassificationRate()(ŷ, y, weights)
MultitargetMisclassificationRate()(ŷ, y, class_weights::AbstractDict)
MultitargetMisclassificationRate()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `MultitargetMisclassificationRate()`

on predictions `ŷ`

, given ground truth observations `y`

. Specifically, compute the multi-target version of `MisclassificationRate`

. Some kinds of tabular input are supported.

`atomic_weights`

are weights for each component of the multi-target. Unless equal to `nothing`

(uniform weights) the length of `atomic_weights`

will generally match the number of columns of `y`

, if `y`

is a table, or the number of rows of `y`

, if `y`

is a matrix.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(MultitargetMisclassificationRate(), ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification). Alternatively, `y`

and `ŷ`

can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Finite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget misclassification rate
```

`StatisticalMeasures.Accuracy`

— Function`Accuracy()`

Return a callable measure for computing the accuracy. Aliases: `accuracy`

.

```
Accuracy()(ŷ, y)
Accuracy()(ŷ, y, weights)
Accuracy()(ŷ, y, class_weights::AbstractDict)
Accuracy()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `Accuracy()`

on predictions `ŷ`

, given ground truth observations `y`

. That is, compute the proportion of predictions `ŷᵢ`

that agree with the corresponding ground truth `yᵢ`

. More generally, average the specified weights over all correctly predicted observations. Can also be called on a confusion matrix. See `ConfusionMatrix`

.

This metric is invariant to class reordering.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(Accuracy(), ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification).

See also `ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = accuracy
```
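One way to read the class-weighted case described above: each correctly predicted observation contributes the weight of its ground-truth class, and the total is normalized by the number of observations. A sketch of that reading (an assumption of this illustration, not the package's implementation; `accuracy_sketch` is a name invented here):

```julia
# Class-weighted accuracy: average, over all observations, of the
# ground-truth class weight times the correctness indicator.
function accuracy_sketch(ŷ, y, class_weights::AbstractDict)
    total = sum(class_weights[yᵢ] * (ŷᵢ == yᵢ) for (ŷᵢ, yᵢ) in zip(ŷ, y))
    return total / length(y)
end

accuracy_sketch(["a", "b", "b"], ["a", "a", "b"], Dict("a" => 1.0, "b" => 0.5))
```

With unit weights for every class this reduces to ordinary accuracy.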

`StatisticalMeasures.MultitargetAccuracy`

— Function`MultitargetAccuracy()`

Return a callable measure for computing the multitarget accuracy. Aliases: `multitarget_accuracy`

.

```
MultitargetAccuracy()(ŷ, y)
MultitargetAccuracy()(ŷ, y, weights)
MultitargetAccuracy()(ŷ, y, class_weights::AbstractDict)
MultitargetAccuracy()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `MultitargetAccuracy()`

on predictions `ŷ`

, given ground truth observations `y`

. Specifically, compute the multi-target version of `Accuracy`

. Some kinds of tabular input are supported.

`atomic_weights`

are weights for each component of the multi-target. Unless equal to `nothing`

(uniform weights) the length of `atomic_weights`

will generally match the number of columns of `y`

, if `y`

is a table, or the number of rows of `y`

, if `y`

is a matrix.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(MultitargetAccuracy(), ŷ, y)`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification). Alternatively, `y`

and `ŷ`

can be some types of table, provided elements have the appropriate scitype.

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = AbstractArray{<:Union{Missing, ScientificTypesBase.Finite}}
can_consume_tables = true
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multitarget accuracy
```

`StatisticalMeasures.BalancedAccuracy`

— Function`BalancedAccuracy(; adjusted=false)`

Return a callable measure for computing the balanced accuracy. Aliases: `balanced_accuracy`

, `bacc`

, `bac`

, `probability_of_correct_classification`

.

```
m(ŷ, y)
m(ŷ, y, weights)
```

Evaluate some measure `m`

returned by the `BalancedAccuracy`

constructor (e.g., `m = BalancedAccuracy()`

) on predictions `ŷ`

, given ground truth observations `y`

. This is a variation of `Accuracy`

compensating for class imbalance. See https://en.wikipedia.org/wiki/Precision_and_recall#Imbalanced_data.

Setting `adjusted=true`

rescales the score in the way prescribed in L. Mosley (2013): A balanced approach to the multi-class imbalance problem. PhD thesis, Iowa State University. In the binary case, the adjusted balanced accuracy is also known as *Youden’s J statistic*, or *informedness*.

This metric is invariant to class reordering.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification).

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = balanced accuracy
```
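Balanced accuracy is commonly computed as the mean of the per-class recalls, with the adjusted variant rescaled so that chance-level performance scores zero. A sketch of those standard definitions (illustration only; `bacc_sketch` is a name invented here, not the package's implementation):

```julia
# Balanced accuracy: mean of per-class recalls. With adjusted=true,
# rescale so a random classifier scores 0 and a perfect one scores 1.
function bacc_sketch(ŷ, y; adjusted=false)
    classes = unique(y)
    recall(c) = (idx = findall(==(c), y);
                 count(ŷ[i] == c for i in idx) / length(idx))
    bacc = sum(recall, classes) / length(classes)
    K = length(classes)
    return adjusted ? (bacc - 1/K) / (1 - 1/K) : bacc
end

bacc_sketch(["a", "a", "b", "b"], ["a", "b", "a", "b"])  # chance level
```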

`StatisticalMeasures.Kappa`

— Function`Kappa()`

Return a callable measure for computing the Cohen's κ. Aliases: `kappa`

.

```
Kappa()(ŷ, y)
Kappa()(ŷ, y, weights)
```

Evaluate `Kappa()`

on predictions `ŷ`

, given ground truth observations `y`

. For details, see the Cohen's κ Wikipedia article. Can also be called on confusion matrices. See `ConfusionMatrix`

.

This metric is invariant to class reordering.

Any iterator with a `length`

generating `Real`

elements can be used for `weights`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification).

See also `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.kappa`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = Cohen's κ
```
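Cohen's κ compares observed agreement with the agreement expected from the marginal class frequencies alone. A sketch of the textbook formula $κ = (p_o - p_e)/(1 - p_e)$ (illustration only; `kappa_sketch` is a name invented here, not the package's implementation):

```julia
# Cohen's κ: p_o is the observed agreement rate, p_e the agreement
# expected by chance from the marginal class frequencies.
function kappa_sketch(ŷ, y)
    n = length(y)
    classes = unique(vcat(ŷ, y))
    p_o = count(ŷ .== y) / n
    p_e = sum((count(==(c), ŷ)/n) * (count(==(c), y)/n) for c in classes)
    return (p_o - p_e) / (1 - p_e)
end

kappa_sketch(["a", "a", "b", "b"], ["a", "b", "a", "b"])  # chance: 0.0
```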

`StatisticalMeasures.MatthewsCorrelation`

— Function`MatthewsCorrelation()`

Return a callable measure for computing the Matthew's correlation. Aliases: `matthews_correlation`

, `mcc`

.

`MatthewsCorrelation()(ŷ, y)`

Evaluate `MatthewsCorrelation()`

on predictions `ŷ`

, given ground truth observations `y`

. See the Wikipedia *Matthew's Correlation* page. Can also be called on confusion matrices. See `ConfusionMatrix`

.

This metric is invariant to class reordering.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{Finite,Missing}`

(multiclass classification).

See also `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.matthews_correlation`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = Matthew's correlation
```
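In the binary case, the correlation is determined by the four confusion-matrix counts. A sketch of that textbook formula (illustration only, not the package's implementation; the `positive` keyword is an assumption of this sketch, whereas the measure itself infers the positive class from the level ordering):

```julia
# Binary Matthew's correlation from true/false positive/negative counts.
function mcc_sketch(ŷ, y; positive)
    tp = count((ŷ .== positive) .& (y .== positive))
    tn = count((ŷ .!= positive) .& (y .!= positive))
    fp = count((ŷ .== positive) .& (y .!= positive))
    fn = count((ŷ .!= positive) .& (y .== positive))
    return (tp*tn - fp*fn) / sqrt((tp+fp)*(tp+fn)*(tn+fp)*(tn+fn))
end

mcc_sketch(["+", "-", "+", "-"], ["+", "-", "+", "-"]; positive="+")
```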

`StatisticalMeasures.FScore`

— Function`FScore(; beta=1.0, levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the $F_β$ score. Aliases: `f1score`

.

`m(ŷ, y)`

Evaluate some measure `m`

returned by the `FScore`

constructor (e.g., `m = FScore()`

) on predictions `ŷ`

, given ground truth observations `y`

. This is the one-parameter generalization, $F_β$, of the $F$-measure or balanced $F$-score. Choose `beta=β`

in the range $[0,∞]$, using `beta > 1`

to emphasize recall (`TruePositiveRate`

) over precision (`PositivePredictiveValue`

). When `beta = 1`

, the score is the harmonic mean of precision and recall. See the *F1 score* Wikipedia page for details.

If ordering classes (levels) on the basis of the eltype of `y`

, then the *second* level is the "positive" class. To reverse roles, specify `rev=true`

.

Method is optimized for `CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

`FScore`

measures can also be called on a confusion matrix. See `ConfusionMatrix`

.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see they include all observed levels; set to `false` for speed.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{OrderedFactor{2},Missing}`

(binary classification where definition of "positive" class matters).

See also `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.fscore`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = ``F_β`` score
```
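The $F_β$ score can be written directly in terms of the binary confusion-matrix counts. A sketch of the textbook formula (illustration only, not the package's implementation; the `positive` keyword is an assumption of this sketch, whereas the measure infers the positive class from the level ordering):

```julia
# Fᵦ = (1 + β²)·tp / ((1 + β²)·tp + β²·fn + fp); β > 1 weights
# recall (fn in the denominator) more heavily than precision (fp).
function fscore_sketch(ŷ, y; beta=1.0, positive)
    tp = count((ŷ .== positive) .& (y .== positive))
    fp = count((ŷ .== positive) .& (y .!= positive))
    fn = count((ŷ .!= positive) .& (y .== positive))
    b2 = beta^2
    return (1 + b2)*tp / ((1 + b2)*tp + b2*fn + fp)
end

fscore_sketch(["+", "+", "-"], ["+", "-", "+"]; positive="+")
```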

`StatisticalMeasures.TruePositive`

— Function`TruePositive(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the true positive count. Aliases: `true_positive`

, `truepositive`

.

`m(ŷ, y)`

Evaluate some measure `m`

returned by the `TruePositive`

constructor (e.g., `m = TruePositive()`

) on predictions `ŷ`

, given ground truth observations `y`

. When ordering classes (levels) on the basis of the eltype of `y`

, the *second* level is the "positive" class. To reverse roles, specify `rev=true`

.

Method is optimized for `CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

`m`

can also be called on a confusion matrix. See `ConfusionMatrix`

.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see they include all observed levels; set to `false` for speed.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{OrderedFactor{2},Missing}`

(binary classification where definition of "positive" class matters).

See also `MulticlassTruePositive`

, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.true_positive`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = true positive count
```
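The count itself is a simple tally over observation pairs. A sketch (illustration only, not the package's implementation; the `positive` keyword is an assumption of this sketch, whereas the measure takes the *second* level as positive by default):

```julia
# True positive count: predictions of the positive class that are correct.
tp_sketch(ŷ, y; positive) = count((ŷ .== positive) .& (y .== positive))

tp_sketch(["+", "+", "-"], ["+", "-", "+"]; positive="+")  # 1
```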

`StatisticalMeasures.TrueNegative`

— Function`TrueNegative(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the true negative count. Aliases: `true_negative`

, `truenegative`

.

`m(ŷ, y)`

Evaluate some measure `m`

returned by the `TrueNegative`

constructor (e.g., `m = TrueNegative()`

) on predictions `ŷ`

, given ground truth observations `y`

. When ordering classes (levels) on the basis of the eltype of `y`

, the *second* level is the "positive" class. To reverse roles, specify `rev=true`

.

Method is optimized for `CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

`m`

can also be called on a confusion matrix. See `ConfusionMatrix`

.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see they include all observed levels; set to `false` for speed.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{OrderedFactor{2},Missing}`

(binary classification where definition of "positive" class matters).

See also `MulticlassTrueNegative`

, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.true_negative`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = true negative count
```

`StatisticalMeasures.FalsePositive`

— Function`FalsePositive(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the false positive count. Aliases: `false_positive`

, `falsepositive`

.

`m(ŷ, y)`

Evaluate some measure `m`

returned by the `FalsePositive`

constructor (e.g., `m = FalsePositive()`

) on predictions `ŷ`

, given ground truth observations `y`

. When ordering classes (levels) on the basis of the eltype of `y`

, the *second* level is the "positive" class. To reverse roles, specify `rev=true`

.

Method is optimized for `CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

`m`

can also be called on a confusion matrix. See `ConfusionMatrix`

.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see they include all observed levels; set to `false` for speed.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{OrderedFactor{2},Missing}`

(binary classification where definition of "positive" class matters).

See also `MulticlassFalsePositive`

, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.false_positive`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = false positive count
```

`StatisticalMeasures.FalseNegative`

— Function`FalseNegative(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the false negative count. Aliases: `false_negative`

, `falsenegative`

.

`m(ŷ, y)`

Evaluate some measure `m`

returned by the `FalseNegative`

constructor (e.g., `m = FalseNegative()`

) on predictions `ŷ`

, given ground truth observations `y`

. When ordering classes (levels) on the basis of the eltype of `y`

, the *second* level is the "positive" class. To reverse roles, specify `rev=true`

.

Method is optimized for `CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

`m`

can also be called on a confusion matrix. See `ConfusionMatrix`

.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see they include all observed levels; set to `false` for speed.

Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{OrderedFactor{2},Missing}`

(binary classification where definition of "positive" class matters).

See also `MulticlassFalseNegative`

, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix`

and `ConfusionMatrix`

.

Core algorithm: `Functions.false_negative`

For a complete dictionary of available measures, keyed on constructor, run `measures()`

.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = false negative count
```

`StatisticalMeasures.TruePositiveRate` — Function

`TruePositiveRate(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the true positive rate. Aliases: `true_positive_rate`, `truepositive_rate`, `tpr`, `sensitivity`, `recall`, `hit_rate`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `TruePositiveRate` constructor (e.g., `m = TruePositiveRate()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassTruePositiveRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.true_positive_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = true positive rate
```
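A sketch of typical usage (package installation assumed): with the second level as the positive class, the true positive rate is the fraction of actual positives correctly predicted, and `rev=true` swaps the class roles:

```julia
using StatisticalMeasures, CategoricalArrays

y = categorical([1, 1, 0, 1, 0], ordered=true)  # inferred levels [0, 1]; 1 is "positive"
ŷ = categorical([1, 0, 0, 1, 1], ordered=true, levels=levels(y))

tpr = TruePositiveRate()           # aliases include recall, sensitivity, hit_rate
tpr(ŷ, y)                          # 2/3: two of the three 1s were predicted as 1
TruePositiveRate(rev=true)(ŷ, y)   # 1/2: now 0 is treated as the "positive" class
```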

`StatisticalMeasures.TrueNegativeRate` — Function

`TrueNegativeRate(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the true negative rate. Aliases: `true_negative_rate`, `truenegative_rate`, `tnr`, `specificity`, `selectivity`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `TrueNegativeRate` constructor (e.g., `m = TrueNegativeRate()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassTrueNegativeRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.true_negative_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = true negative rate
```

`StatisticalMeasures.FalsePositiveRate` — Function

`FalsePositiveRate(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the false positive rate. Aliases: `false_positive_rate`, `falsepositive_rate`, `fpr`, `fallout`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `FalsePositiveRate` constructor (e.g., `m = FalsePositiveRate()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassFalsePositiveRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.false_positive_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = false positive rate
```
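Since all of these rates derive from the same four counts, it can be cheaper to compute the confusion matrix once and evaluate several measures on it. A hedged sketch of that pattern, with the confusion matrix built via the `ConfusionMatrix` measure as its docstring suggests (the exact printed form of the result depends on the package version):

```julia
using StatisticalMeasures, CategoricalArrays

y = categorical(["b", "a", "b", "b", "a"], ordered=true)  # "b" (second level) is positive
ŷ = categorical(["b", "b", "b", "a", "a"], ordered=true, levels=levels(y))

cm = ConfusionMatrix()(ŷ, y)   # compute the 2×2 confusion matrix once

fpr = FalsePositiveRate()
fnr = FalseNegativeRate()
fpr(cm), fnr(cm)               # reuse the same matrix for several rate measures
```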

`StatisticalMeasures.FalseNegativeRate` — Function

`FalseNegativeRate(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the false negative rate. Aliases: `false_negative_rate`, `falsenegative_rate`, `fnr`, `miss_rate`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `FalseNegativeRate` constructor (e.g., `m = FalseNegativeRate()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassFalseNegativeRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.false_negative_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = false negative rate
```

`StatisticalMeasures.FalseDiscoveryRate` — Function

`FalseDiscoveryRate(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the false discovery rate. Aliases: `false_discovery_rate`, `falsediscovery_rate`, `fdr`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `FalseDiscoveryRate` constructor (e.g., `m = FalseDiscoveryRate()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassFalseDiscoveryRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.false_discovery_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = false discovery rate
```

`StatisticalMeasures.PositivePredictiveValue` — Function

`PositivePredictiveValue(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the positive predictive value. Aliases: `positive_predictive_value`, `ppv`, `positivepredictive_value`, `precision`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `PositivePredictiveValue` constructor (e.g., `m = PositivePredictiveValue()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassPositivePredictiveValue`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.positive_predictive_value`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = positive predictive value
```
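For instance (package installation assumed), positive predictive value is the fraction of positive *predictions* that turn out to be correct, conveniently evaluated through the documented `ppv` alias:

```julia
using StatisticalMeasures, CategoricalArrays

y = categorical([0, 1, 1, 0, 1], ordered=true)  # 1 (second level) is "positive"
ŷ = categorical([1, 1, 0, 0, 1], ordered=true, levels=levels(y))

ppv(ŷ, y)   # 2/3: of the three observations predicted 1, two are actually 1
```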

`StatisticalMeasures.NegativePredictiveValue` — Function

`NegativePredictiveValue(; levels=nothing, rev=nothing, checks=true)`

Return a callable measure for computing the negative predictive value. Aliases: `negative_predictive_value`, `negativepredictive_value`, `npv`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `NegativePredictiveValue` constructor (e.g., `m = NegativePredictiveValue()`) on predictions `ŷ`, given ground truth observations `y`. When ordering classes (levels) on the basis of the eltype of `y`, the *second* level is the "positive" class. To reverse roles, specify `rev=true`.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. See `ConfusionMatrix`.

**Keyword options**

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{OrderedFactor{2},Missing}` (binary classification where definition of "positive" class matters).

See also `MulticlassNegativePredictiveValue`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.negative_predictive_value`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.OrderedFactor{2}}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = negative predictive value
```

`StatisticalMeasures.MulticlassTruePositive` — Function

`MulticlassTruePositive(; levels=nothing, more_options...)`

Return a callable measure for computing the multi-class true positive count. Aliases: `multiclass_true_positive`, `multiclass_truepositive`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `MulticlassTruePositive` constructor (e.g., `m = MulticlassTruePositive()`) on predictions `ŷ`, given ground truth observations `y`.

This is a one-versus-rest version of the binary measure `TruePositive`, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. Construct confusion matrices using `ConfusionMatrix`.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification), per the `observation_scitype` trait below.

**Keyword options**

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.
- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1, 3, 2]` for data with three classes.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

See also `TruePositive`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_true_positive`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class true positive count
```
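Assuming the package is installed, the multi-class counts come back keyed on class by default, one one-versus-rest count per level:

```julia
using StatisticalMeasures, CategoricalArrays

y = categorical(["a", "b", "c", "a", "b"])
ŷ = categorical(["a", "c", "c", "b", "b"], levels=levels(y))

m = MulticlassTruePositive()
m(ŷ, y)   # a LittleDict of per-class counts: one true positive each for "a", "b", "c"

# To get a plain vector (ordered by level) instead of a dictionary:
MulticlassTruePositive(return_type=Vector)(ŷ, y)
```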

`StatisticalMeasures.MulticlassTrueNegative` — Function

`MulticlassTrueNegative(; levels=nothing, more_options...)`

Return a callable measure for computing the multi-class true negative count. Aliases: `multiclass_true_negative`, `multiclass_truenegative`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `MulticlassTrueNegative` constructor (e.g., `m = MulticlassTrueNegative()`) on predictions `ŷ`, given ground truth observations `y`.

This is a one-versus-rest version of the binary measure `TrueNegative`, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. Construct confusion matrices using `ConfusionMatrix`.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification), per the `observation_scitype` trait below.

**Keyword options**

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.
- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1, 3, 2]` for data with three classes.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

See also `TrueNegative`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_true_negative`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class true negative count
```

`StatisticalMeasures.MulticlassFalsePositive` — Function

`MulticlassFalsePositive(; levels=nothing, more_options...)`

Return a callable measure for computing the multi-class false positive count. Aliases: `multiclass_false_positive`, `multiclass_falsepositive`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `MulticlassFalsePositive` constructor (e.g., `m = MulticlassFalsePositive()`) on predictions `ŷ`, given ground truth observations `y`.

This is a one-versus-rest version of the binary measure `FalsePositive`, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. Construct confusion matrices using `ConfusionMatrix`.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification), per the `observation_scitype` trait below.

**Keyword options**

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.
- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1, 3, 2]` for data with three classes.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

See also `FalsePositive`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_false_positive`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class false positive count
```

`StatisticalMeasures.MulticlassFalseNegative` — Function

`MulticlassFalseNegative(; levels=nothing, more_options...)`

Return a callable measure for computing the multi-class false negative count. Aliases: `multiclass_false_negative`, `multiclass_falsenegative`.

`m(ŷ, y)`

Evaluate some measure `m` returned by the `MulticlassFalseNegative` constructor (e.g., `m = MulticlassFalseNegative()`) on predictions `ŷ`, given ground truth observations `y`.

This is a one-versus-rest version of the binary measure `FalseNegative`, returning a dictionary keyed on target class (level), or a vector (see options below), instead of a single number, even on binary data.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

`m` can also be called on a confusion matrix. Construct confusion matrices using `ConfusionMatrix`.

Any observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification), per the `observation_scitype` trait below.

**Keyword options**

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.
- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1, 3, 2]` for data with three classes.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

See also `FalseNegative`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_false_negative`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Sum()
human_name = multi-class false negative count
```

`StatisticalMeasures.MulticlassTruePositiveRate` — Function

`MulticlassTruePositiveRate(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class true positive rate. Aliases: `multiclass_true_positive_rate`, `multiclass_truepositive_rate`, `multiclass_tpr`, `multiclass_sensitivity`, `multiclass_recall`, `multiclass_hit_rate`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassTruePositiveRate` constructor (e.g., `m = MulticlassTruePositiveRate()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `TruePositiveRate`. Alternatively, it can return a dictionary keyed on target class, or a vector; see the `average` options below.

For `CategoricalArray` inputs, `levels` is inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification), per the `observation_scitype` trait below.

**Keyword options**

- `average=MacroAvg()`: one of `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.
- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.
- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.
- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.
- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1, 3, 2]` for data with three classes.
- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.

See also `TruePositiveRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_true_positive_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class true positive rate
```
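A hedged sketch of the averaging options (package installation assumed; the class-weight keys are given here as raw labels, which are compared against the target's classes):

```julia
using StatisticalMeasures, CategoricalArrays

y = categorical(["a", "b", "b", "c", "c", "c"])
ŷ = categorical(["a", "b", "c", "c", "c", "b"], levels=levels(y))

# Per-class recalls are 1/1 for "a", 1/2 for "b", and 2/3 for "c":
multiclass_tpr = MulticlassTruePositiveRate()   # macro-averaged by default
multiclass_tpr(ŷ, y)                            # mean of the per-class recalls

per_class = MulticlassTruePositiveRate(average=NoAvg())
per_class(ŷ, y)                                 # one recall per class, keyed on level

weights = Dict("a" => 1.0, "b" => 2.0, "c" => 0.5)  # keys must cover all classes in y
multiclass_tpr(ŷ, y, weights)                   # class-weighted version
```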

`StatisticalMeasures.MulticlassTrueNegativeRate` — Function

`MulticlassTrueNegativeRate(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class true negative rate. Aliases: `multiclass_true_negative_rate`, `multiclass_truenegative_rate`, `multiclass_tnr`, `multiclass_specificity`, `multiclass_selectivity`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m`

returned by the `MulticlassTrueNegativeRate`

constructor (e.g., `m = MulticlassTrueNegativeRate()`

) on predictions `ŷ`

, given ground truth observations `y`

.

This is an averaged one-versus-rest version of the binary `TrueNegativeRate`

. Or it can return a dictionary keyed on target class (or a vector); see `average`

options below.

`CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

You can also call `m`

on confusion matrices. Construct confusion matrices using `ConfusionMatrix`

.

The keys of `class_weights`

should include all conceivable values for observations in `y`

, and values should be `Real`

. Generally, an observation `obs`

in `MLUtils.eachobs(y)`

is expected to satisfy `ScientificTypes.scitype(obs)<:`

`Union{OrderedFactor{2},Missing}`

(binary classification where definition of "positive" class matters).

**Keyword options**

`average=MacroAvg()`

: one of:`NoAvg()`

,`MacroAvg()`

,`MicroAvg()`

(names owned and exported by StatisticalMeasuresBase.jl.) See J. Opitz and S. Burst (2019). "Macro F1 and Macro F1",*arXiv*.

`return_type=LittleDict`

: type of returned measurement for`average=NoAvg()`

case; if`LittleDict`

, then keyed on levels of the target; can also be`Vector`

`levels::Union{Vector,Nothing}=nothing`

: if`nothing`

, levels are inferred from`ŷ`

and`y`

and, by default, ordered according to the element type of`y`

.`rev=false`

: in the case of binary data, whether to reverse the`levels`

(as inferred or specified); a`nothing`

value is the same as`false`

.

`perm=nothing`

: in the general case, a permutation representing a re-ordering of`levels`

(as inferred or specified); e.g.,`perm = [1,3,2]`

for data with three classes.

`checks=true`

: when true, specified`levels`

are checked to see they include all observed levels; set to`false`

for speed.

`CategoricalArray`

inputs with `levels`

inferred. In that case `levels`

will be the complete internal class pool, and not just the observed levels.

See also `TrueNegativeRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_true_negative_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class true negative rate
```
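As an illustrative sketch (not part of the original docstring) of how the `average` options change the return value, assuming StatisticalMeasures.jl and CategoricalArrays.jl are installed, with made-up data:

```julia
using StatisticalMeasures, CategoricalArrays

# toy 3-class data (hypothetical):
y = categorical(["a", "b", "b", "a", "c"])
ŷ = categorical(["a", "b", "c", "a", "c"], levels=levels(y))

m = MulticlassTrueNegativeRate()          # average=MacroAvg() by default
macro_rate = m(ŷ, y)                      # a single aggregated number

m_none = MulticlassTrueNegativeRate(average=NoAvg())
per_class = m_none(ŷ, y)                  # LittleDict keyed on "a", "b", "c"
```

With `average=NoAvg()` the per-class one-versus-rest rates are returned unaveraged, which is often what you want when classes are imbalanced.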

`StatisticalMeasures.MulticlassFalsePositiveRate` — Function

`MulticlassFalsePositiveRate(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class false positive rate. Aliases: `multiclass_false_positive_rate`, `multiclass_falsepositive_rate`, `multiclass_fpr`, `multiclass_fallout`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassFalsePositiveRate` constructor (e.g., `m = MulticlassFalsePositiveRate()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `FalsePositiveRate`. Alternatively, it can return a dictionary keyed on target class (or a vector); see the `average` options below.

`CategoricalArray` inputs are supported, with `levels` then inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification).

**Keyword options**

- `average=MacroAvg()`: one of: `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.

- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.

- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.

- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.


See also `FalsePositiveRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_false_positive_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class false positive rate
```
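A hedged sketch of the `class_weights` calling pattern described above, with made-up data (the dictionary keys cover every class that can occur in `y`):

```julia
using StatisticalMeasures, CategoricalArrays

# toy 3-class data (hypothetical):
y = categorical(["a", "b", "a", "c", "b"])
ŷ = categorical(["a", "a", "a", "c", "c"], levels=levels(y))

# one Real weight per conceivable class:
class_weights = Dict("a" => 1.0, "b" => 2.0, "c" => 0.5)

m = MulticlassFalsePositiveRate()
weighted = m(ŷ, y, class_weights)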

`StatisticalMeasures.MulticlassFalseNegativeRate` — Function

`MulticlassFalseNegativeRate(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class false negative rate. Aliases: `multiclass_false_negative_rate`, `multiclass_falsenegative_rate`, `multiclass_fnr`, `multiclass_miss_rate`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassFalseNegativeRate` constructor (e.g., `m = MulticlassFalseNegativeRate()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `FalseNegativeRate`. Alternatively, it can return a dictionary keyed on target class (or a vector); see the `average` options below.

`CategoricalArray` inputs are supported, with `levels` then inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification).

**Keyword options**

- `average=MacroAvg()`: one of: `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.

- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.

- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.

- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.


See also `FalseNegativeRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_false_negative_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class false negative rate
```

`StatisticalMeasures.MulticlassFalseDiscoveryRate` — Function

`MulticlassFalseDiscoveryRate(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class false discovery rate. Aliases: `multiclass_false_discovery_rate`, `multiclass_falsediscovery_rate`, `multiclass_fdr`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassFalseDiscoveryRate` constructor (e.g., `m = MulticlassFalseDiscoveryRate()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `FalseDiscoveryRate`. Alternatively, it can return a dictionary keyed on target class (or a vector); see the `average` options below.

`CategoricalArray` inputs are supported, with `levels` then inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification).

**Keyword options**

- `average=MacroAvg()`: one of: `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.

- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.

- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.

- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.


See also `FalseDiscoveryRate`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_false_discovery_rate`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class false discovery rate
```
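Because each of these measures can also be called on a confusion matrix, the matrix can be built once and reused across several measures. A sketch, with hypothetical data, assuming StatisticalMeasures.jl and CategoricalArrays.jl are installed:

```julia
using StatisticalMeasures, CategoricalArrays

# toy 3-class data (hypothetical):
y = categorical(["a", "b", "a", "c", "b"])
ŷ = categorical(["a", "a", "a", "c", "b"], levels=levels(y))

# build the confusion matrix once...
cm = ConfusionMatrix(levels=levels(y))(ŷ, y)

# ...then evaluate measures directly on it:
fdr = multiclass_fdr(cm)
fnr = multiclass_fnr(cm)
```

This avoids re-scanning `ŷ` and `y` for every measure evaluated.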

`StatisticalMeasures.MulticlassPositivePredictiveValue` — Function

`MulticlassPositivePredictiveValue(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class positive predictive value. Aliases: `multiclass_positive_predictive_value`, `multiclass_ppv`, `multiclass_positivepredictive_value`, `multiclass_precision`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassPositivePredictiveValue` constructor (e.g., `m = MulticlassPositivePredictiveValue()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `PositivePredictiveValue`. Alternatively, it can return a dictionary keyed on target class (or a vector); see the `average` options below.

`CategoricalArray` inputs are supported, with `levels` then inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification).

**Keyword options**

- `average=MacroAvg()`: one of: `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.

- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.

- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.

- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.


See also `PositivePredictiveValue`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_positive_predictive_value`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class positive predictive value
```

`StatisticalMeasures.MulticlassNegativePredictiveValue` — Function

`MulticlassNegativePredictiveValue(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class negative predictive value. Aliases: `multiclass_negative_predictive_value`, `multiclass_negativepredictive_value`, `multiclass_npv`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassNegativePredictiveValue` constructor (e.g., `m = MulticlassNegativePredictiveValue()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `NegativePredictiveValue`. Alternatively, it can return a dictionary keyed on target class (or a vector); see the `average` options below.

`CategoricalArray` inputs are supported, with `levels` then inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification).

**Keyword options**

- `average=MacroAvg()`: one of: `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.

- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.

- `perm=nothing`: in the general case, a permutation representing a re-ordering of`levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.

- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.


See also `NegativePredictiveValue`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_negative_predictive_value`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class negative predictive value
```

`StatisticalMeasures.MulticlassFScore` — Function

`MulticlassFScore(; average=macro_avg, levels=nothing, more_options...)`

Return a callable measure for computing the multi-class $F_β$ score. Aliases: `macro_f1score`, `micro_f1score`, `multiclass_f1score`.

```
m(ŷ, y)
m(ŷ, y, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `MulticlassFScore` constructor (e.g., `m = MulticlassFScore()`) on predictions `ŷ`, given ground truth observations `y`.

This is an averaged one-versus-rest version of the binary `FScore`. Alternatively, it can return a dictionary keyed on target class (or a vector); see the `average` options below.

`CategoricalArray` inputs are supported, with `levels` then inferred. In that case `levels` will be the complete internal class pool, and not just the observed levels.

You can also call `m` on confusion matrices. Construct confusion matrices using `ConfusionMatrix`.

The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Finite,Missing}` (multiclass classification).

**Keyword options**

- `beta=1.0`: parameter in the range $[0,∞]$, emphasizing recall over precision for `beta > 1`, except in the case `average=MicroAvg()`, when it has no effect.

- `average=MacroAvg()`: one of: `NoAvg()`, `MacroAvg()`, `MicroAvg()` (names owned and exported by StatisticalMeasuresBase.jl). See J. Opitz and S. Burst (2019), "Macro F1 and Macro F1", *arXiv*.

- `return_type=LittleDict`: type of returned measurement in the `average=NoAvg()` case; if `LittleDict`, then keyed on levels of the target; can also be `Vector`.

- `levels::Union{Vector,Nothing}=nothing`: if `nothing`, levels are inferred from `ŷ` and `y` and, by default, ordered according to the element type of `y`.

- `rev=false`: in the case of binary data, whether to reverse the `levels` (as inferred or specified); a `nothing` value is the same as `false`.

- `perm=nothing`: in the general case, a permutation representing a re-ordering of `levels` (as inferred or specified); e.g., `perm = [1,3,2]` for data with three classes.

- `checks=true`: when true, specified `levels` are checked to see that they include all observed levels; set to `false` for speed.


See also `FScore`, `StatisticalMeasures.ConfusionMatrices.ConfusionMatrix` and `ConfusionMatrix`.

Core algorithm: `Functions.multiclass_fscore`

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.LiteralTarget()
observation_scitype = Union{Missing, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = false
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = multi-class ``F_β`` score
```
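A brief sketch of the `beta` option, with hypothetical data (assuming StatisticalMeasures.jl and CategoricalArrays.jl are installed):

```julia
using StatisticalMeasures, CategoricalArrays

# toy 3-class data (hypothetical):
y = categorical(["a", "b", "a", "c", "b", "a"])
ŷ = categorical(["a", "b", "c", "c", "b", "b"], levels=levels(y))

f1 = MulticlassFScore()(ŷ, y)          # beta=1.0 balances precision and recall
f2 = MulticlassFScore(beta=2.0)(ŷ, y)  # beta > 1 emphasizes recall
```

Both are macro-averaged by default; pass `average=MicroAvg()` for micro-averaging, in which case `beta` has no effect.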

`StatisticalMeasures.AreaUnderCurve` — Function

`AreaUnderCurve()`

Return a callable measure for computing the area under the receiver operating characteristic curve. Aliases: `auc`, `area_under_curve`.

`AreaUnderCurve()(ŷ, y)`

Evaluate `AreaUnderCurve()` on predictions `ŷ`, given ground truth observations `y`. See the *Receiver operating characteristic* (ROC) Wikipedia article for a definition. It is expected that `ŷ` be a vector of distributions over the binary set of unique elements of `y`; specifically, `ŷ` should have eltype `<:UnivariateFinite` from the CategoricalDistributions.jl package.

Implementation is based on the Mann–Whitney U statistic. See the *Mann–Whitney U test* Wikipedia page for details.

Core implementation: `Functions.auc`.

This metric is invariant to class reordering.

Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:ScientificTypesBase.Binary`.

See also `roc_curve`.

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = ScientificTypesBase.Binary
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = area under the receiver operating characteristic
```
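A sketch of the expected calling pattern, with made-up probabilities (assuming StatisticalMeasures.jl, CategoricalArrays.jl and CategoricalDistributions.jl are installed):

```julia
using StatisticalMeasures, CategoricalArrays, CategoricalDistributions

y = categorical(["yes", "no", "yes", "yes", "no"])

# probabilities assigned to "yes" for each observation; `augment=true`
# supplies the complementary "no" probabilities:
ŷ = UnivariateFinite(["no", "yes"], [0.9, 0.2, 0.7, 0.6, 0.4], augment=true, pool=y)

a = auc(ŷ, y)   # every "yes" outranks every "no" here (perfect ranking)
```

In this toy example each "yes" observation receives a higher "yes"-probability than every "no" observation, so the Mann–Whitney-based AUC is 1.0.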

`StatisticalMeasures.LogScore` — Function

`LogScore(; tol=eps())`

Return a callable measure for computing the log score. Aliases: `log_score`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `LogScore` constructor (e.g., `m = LogScore()`) on predictions `ŷ`, given ground truth observations `y`. The score is a mean of observational scores. More generally, observational scores are pre-multiplied by the specified weights before averaging. See below for the form that probabilistic predictions `ŷ` should take. Raw probabilities are clamped away from `0` and `1`. Specifically, if `p` is the probability mass/density function evaluated at a given ground truth observation `η`, then the score for that example is defined as

`log(clamp(p(η), tol, 1 - tol))`

For example, for a binary target with "yes"/"no" labels, if the probabilistic prediction scores 0.8 for a "yes", then for a corresponding ground truth observation of "no", that example's contribution to the score is `log(0.2)`.

The predictions `ŷ` should be a vector of `UnivariateFinite` distributions from CategoricalDistributions.jl, in the case of `Finite` target `y` (a `CategoricalVector`), and should otherwise be a supported `Distributions.UnivariateDistribution` such as `Normal` or `Poisson`.

See also `LogLoss`, which differs only in sign.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Missing,T}` where `T` is `Continuous` or `Count` (for respectively continuous or discrete Distributions.jl objects in `ŷ`) or `OrderedFactor` or `Multiclass` (for `UnivariateFinite` distributions in `ŷ`).

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = log score
```
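A sketch of aggregated versus per-observation log scores, with made-up data (assuming StatisticalMeasures.jl, CategoricalArrays.jl and CategoricalDistributions.jl are installed):

```julia
using StatisticalMeasures, CategoricalArrays, CategoricalDistributions

y = categorical(["no", "yes", "yes"])

# probabilities assigned to "yes"; `augment=true` supplies the "no" probabilities:
ŷ = UnivariateFinite(["no", "yes"], [0.2, 0.7, 0.9], augment=true, pool=y)

aggregated = log_score(ŷ, y)          # mean of per-observation scores
per_obs = measurements(log_score, ŷ, y)
# the first ground truth is "no", predicted with probability 0.8,
# so the first per-observation score is log(0.8)
```

Clamping via `tol` only matters when a predicted probability is (numerically) 0 or 1.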

`StatisticalMeasures.LogLoss` — Function

`LogLoss(; tol=eps())`

Return a callable measure for computing the log loss. Aliases: `log_loss`, `cross_entropy`.

```
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate some measure `m` returned by the `LogLoss` constructor (e.g., `m = LogLoss()`) on predictions `ŷ`, given ground truth observations `y`. For details, see `LogScore`, which differs only by a sign.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(m, ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Missing,T}` where `T` is `Continuous` or `Count` (for respectively continuous or discrete Distributions.jl objects in `ŷ`) or `OrderedFactor` or `Multiclass` (for `UnivariateFinite` distributions in `ŷ`).

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = log loss
```
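For the `Count` case, predictions are Distributions.jl objects rather than `UnivariateFinite` ones. A sketch with made-up data (assuming StatisticalMeasures.jl and Distributions.jl are installed):

```julia
using StatisticalMeasures, Distributions

# discrete (Count) targets with Distributions.jl predictions:
y = [1, 2, 0]
ŷ = [Poisson(1.0), Poisson(2.0), Poisson(0.5)]

loss = log_loss(ŷ, y)
score = log_score(ŷ, y)   # the two differ only in sign
```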

`StatisticalMeasures.BrierScore` — Function

`BrierScore()`

Return a callable measure for computing the Brier score. Aliases: `brier_score`, `quadratic_score`.

```
BrierScore()(ŷ, y)
BrierScore()(ŷ, y, weights)
BrierScore()(ŷ, y, class_weights::AbstractDict)
BrierScore()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `BrierScore()` on predictions `ŷ`, given ground truth observations `y`. The score is a mean of observational scores. More generally, observational scores are pre-multiplied by the specified weights before averaging. See below for the form that probabilistic predictions `ŷ` should take.

Convention as in Gneiting and Raftery (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation".

*Finite case.* If `p(η)` is the predicted probability for a *single* observation `η`, and `C` all possible classes, then the corresponding score for that example is given by

$2p(η) - \left(\sum_{c ∈ C} p(c)^2\right) - 1$

*Warning.* `BrierScore()` is a "score" in the sense that bigger is better (with `0` optimal, and all other values negative). In Brier's original 1950 paper, and many other places, it has the opposite sign, despite the name. Moreover, the present implementation does not treat the binary case as special, so the score may differ in the binary case by a factor of two from usage elsewhere.

*Infinite case.* Replacing the sum above with an integral does *not* lead to the formula adopted here in the case of `Continuous` or `Count` target `y`. Rather, the convention in the paper cited above is adopted, which means returning a score of

$2p(η) - ∫ p(t)^2 dt$

in the `Continuous` case (`p` the probability density function) or

$2p(η) - ∑_t p(t)^2$

in the `Count` case (`p` the probability mass function).

The predictions `ŷ` should be a vector of `UnivariateFinite` distributions from CategoricalDistributions.jl, in the case of `Finite` target `y` (a `CategoricalVector`), and should otherwise be a supported `Distributions.UnivariateDistribution` such as `Normal` or `Poisson`.

See also `BrierLoss`, which differs only in sign.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(BrierScore(), ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Missing,T}` where `T` is `Continuous` or `Count` (for respectively continuous or discrete Distributions.jl objects in `ŷ`) or `OrderedFactor` or `Multiclass` (for `UnivariateFinite` distributions in `ŷ`).

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = brier score
```
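A hand check of the finite-case formula against the measure, with made-up data (assuming StatisticalMeasures.jl, CategoricalArrays.jl and CategoricalDistributions.jl are installed):

```julia
using StatisticalMeasures, CategoricalArrays, CategoricalDistributions

y = categorical(["yes", "no"])
ŷ = UnivariateFinite(["no", "yes"], [0.7, 0.3], augment=true, pool=y)

# For the first observation, η = "yes", with p("yes") = 0.7 and p("no") = 0.3:
#   2p(η) - Σ p(c)² - 1 = 2(0.7) - (0.7² + 0.3²) - 1 = -0.18
per_obs = measurements(brier_score, ŷ, y)
```

Note the result is negative, consistent with the sign convention warned about above (0 is optimal).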

`StatisticalMeasures.BrierLoss` — Function

`BrierLoss()`

Return a callable measure for computing the Brier loss. Aliases: `brier_loss`, `quadratic_loss`.

```
BrierLoss()(ŷ, y)
BrierLoss()(ŷ, y, weights)
BrierLoss()(ŷ, y, class_weights::AbstractDict)
BrierLoss()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `BrierLoss()` on predictions `ŷ`, given ground truth observations `y`. For details, see `BrierScore`, which differs only by a sign.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(BrierLoss(), ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Missing,T}` where `T` is `Continuous` or `Count` (for respectively continuous or discrete Distributions.jl objects in `ŷ`) or `OrderedFactor` or `Multiclass` (for `UnivariateFinite` distributions in `ŷ`).

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Loss()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = brier loss
```

`StatisticalMeasures.SphericalScore` — Function

`SphericalScore(; alpha=2)`

Return a callable measure for computing the spherical score. Aliases: `spherical_score`.

```
SphericalScore()(ŷ, y)
SphericalScore()(ŷ, y, weights)
SphericalScore()(ŷ, y, class_weights::AbstractDict)
SphericalScore()(ŷ, y, weights, class_weights::AbstractDict)
```

Evaluate `SphericalScore()` on predictions `ŷ`, given ground truth observations `y`. The score is a mean of observational scores. More generally, observational scores are pre-multiplied by the specified weights before averaging. See below for the form that probabilistic predictions `ŷ` should take.

Convention as in Gneiting and Raftery (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation": If `y` takes on a finite number of classes `C` and `p(y)` is the predicted probability for a single observation `y`, then the corresponding score for that example is given by

$p(y)^{α-1} \Big/ \left(\sum_{η ∈ C} p(η)^α\right)^{1-α^{-1}}$

where `α` is the measure parameter `alpha`.

In the case that the predictions `ŷ` are continuous probability distributions, such as `Distributions.Normal`, replace the above sum with an integral, and interpret `p` as the probability density function. In the case of discrete distributions over the integers, such as `Distributions.Poisson`, sum over all integers instead of `C`.

Any iterator with a `length` generating `Real` elements can be used for `weights`. The keys of `class_weights` should include all conceivable values for observations in `y`, and values should be `Real`.

Measurements are aggregated. To obtain a separate measurement for each observation, use the syntax `measurements(SphericalScore(), ŷ, y)`. Generally, an observation `obs` in `MLUtils.eachobs(y)` is expected to satisfy `ScientificTypes.scitype(obs)<:Union{Missing,T}` where `T` is `Continuous` or `Count` (for respectively continuous or discrete Distributions.jl objects in `ŷ`) or `OrderedFactor` or `Multiclass` (for `UnivariateFinite` distributions in `ŷ`).

For a complete dictionary of available measures, keyed on constructor, run `measures()`.

**Traits**

```
consumes_multiple_observations = true
can_report_unaggregated = true
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = Union{Missing, ScientificTypesBase.Infinite, ScientificTypesBase.Finite}
can_consume_tables = false
supports_weights = true
supports_class_weights = true
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = spherical score
```
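A sketch of the `Continuous` case, with made-up data (assuming StatisticalMeasures.jl and Distributions.jl are installed):

```julia
using StatisticalMeasures, Distributions

# continuous targets with density predictions:
y = [1.2, 0.3, -0.5]
ŷ = [Normal(1.0, 1.0), Normal(0.0, 1.0), Normal(0.5, 2.0)]

s = spherical_score(ŷ, y)   # mean of per-observation scores
per_obs = measurements(spherical_score, ŷ, y)
```

Here the per-observation score uses the density version of the formula above, with the sum over classes replaced by an integral.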