Examples of Usage

Calling syntax

A measure m is called using one of the following signatures:

m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights)

where y is the ground truth and ŷ the corresponding predictions, with optional observation weights and/or class weights. This package provides measure constructors, such as BalancedAccuracy:

using StatisticalMeasures

m = BalancedAccuracy(adjusted=true)
m(["O", "X", "O", "X"], ["X", "X", "X", "O"], [1, 2, 1, 2])
-0.5

Aliases are provided for commonly applied instances:

bacc == BalancedAccuracy() == BalancedAccuracy(adjusted=false)
true

Binary classification

using StatisticalMeasures
using CategoricalArrays

# ground truth:
y = categorical(
        ["X", "X", "X", "O", "X", "X", "O", "O", "X"],
        ordered=true,
)

# prediction:
ŷ = categorical(
   ["O", "X", "O", "X", "O", "O", "O", "X", "X"],
   levels=levels(y),
   ordered=true,
)

accuracy(ŷ, y)
0.3333333333333333
weights = [1, 2, 1, 2, 1, 2, 1, 2, 1]
accuracy(ŷ, y, weights)
0.4444444444444444
class_weights = Dict("X" => 10, "O" => 1)
accuracy(ŷ, y, class_weights)
2.3333333333333335
accuracy(ŷ, y, weights, class_weights)
3.4444444444444446

To get a measurement for each individual observation, use measurements:

measurements(accuracy, ŷ, y, weights, class_weights)
9-element Vector{Int64}:
  0
 20
  0
  0
  0
  0
  1
  0
 10
kappa(ŷ, y)
-0.28571428571428564
mat = confmat(ŷ, y)
          ┌─────────────┐
          │Ground Truth │
┌─────────┼──────┬──────┤
│Predicted│  O   │  X   │
├─────────┼──────┼──────┤
│    O    │  1   │  4   │
├─────────┼──────┼──────┤
│    X    │  2   │  2   │
└─────────┴──────┴──────┘

Some measures can be applied directly to confusion matrices:

kappa(mat)
-0.28571428571428564

Multi-class classification

using StatisticalMeasures
using CategoricalArrays
import Random
Random.seed!()

y = rand("ABC", 1000) |> categorical
ŷ = rand("ABC", 1000) |> categorical
class_weights = Dict('A' => 1, 'B' => 2, 'C' => 10)
MulticlassFScore(beta=0.5, average=MacroAvg())(ŷ, y, class_weights)
1.3275622571655565
MulticlassFScore(beta=0.5, average=NoAvg())(ŷ, y, class_weights)
LittleDict{CategoricalArrays.CategoricalValue{Char, UInt32}, Float64, Tuple{CategoricalArrays.CategoricalValue{Char, UInt32}, CategoricalArrays.CategoricalValue{Char, UInt32}, CategoricalArrays.CategoricalValue{Char, UInt32}}, Tuple{Float64, Float64, Float64}} with 3 entries:
  'A' => 0.322289
  'B' => 0.596745
  'C' => 3.06365

When using CategoricalArrays, classes missing from the data itself are still tracked, as here:

# find 'C'-free indices
mask = y .!= 'C' .&& ŷ .!= 'C';
# remove observations in class 'C':
y = y[mask]
ŷ = ŷ[mask]
'C' in y ∪ ŷ
false
confmat(ŷ, y)
          ┌──────────────┐
          │ Ground Truth │
┌─────────┼────┬────┬────┤
│Predicted│ A  │ B  │ C  │
├─────────┼────┼────┼────┤
│    A    │107 │112 │ 0  │
├─────────┼────┼────┼────┤
│    B    │112 │ 99 │ 0  │
├─────────┼────┼────┼────┤
│    C    │ 0  │ 0  │ 0  │
└─────────┴────┴────┴────┘

Probabilistic classification

To mitigate ambiguity around representations of predicted probabilities, a probabilistic prediction of categorical data is expected to be represented by a UnivariateFinite distribution, from the package CategoricalDistributions.jl. This is the form delivered, for example, by MLJ classification models.

using StatisticalMeasures
using CategoricalArrays
using CategoricalDistributions

y = categorical(["X", "O", "X", "X", "O", "X", "X", "O", "O", "X"], ordered=true)
X_probs = [0.3, 0.2, 0.4, 0.9, 0.1, 0.4, 0.5, 0.2, 0.8, 0.7]
ŷ = UnivariateFinite(["O", "X"], X_probs, augment=true, pool=y)
ŷ[1]
UnivariateFinite{OrderedFactor{2}}(O=>0.7, X=>0.3)
auc(ŷ, y)
0.7916666666666666
measurements(log_loss, ŷ, y)
10-element Vector{Float64}:
 1.2039728043259361
 0.2231435513142097
 0.916290731874155
 0.10536051565782628
 0.10536051565782628
 0.916290731874155
 0.6931471805599453
 0.2231435513142097
 1.6094379124341005
 0.35667494393873245
measurements(brier_score, ŷ, y)
10-element Vector{Float64}:
 -0.9800000000000001
 -0.08000000000000007
 -0.72
 -0.020000000000000018
 -0.020000000000000018
 -0.72
 -0.5
 -0.08000000000000007
 -1.2800000000000002
 -0.18000000000000016

We note in passing that mode and pdf methods can be applied to UnivariateFinite distributions. So, for example, we can do:

confmat(mode.(ŷ), y)
          ┌─────────────┐
          │Ground Truth │
┌─────────┼──────┬──────┤
│Predicted│  O   │  X   │
├─────────┼──────┼──────┤
│    O    │  3   │  4   │
├─────────┼──────┼──────┤
│    X    │  1   │  2   │
└─────────┴──────┴──────┘
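
Broadcasting pdf over ŷ recovers the probabilities assigned to a specified class. For "X", by construction, these are just the entries of X_probs defined above:

pdf.(ŷ, "X")
10-element Vector{Float64}:
 0.3
 0.2
 0.4
 0.9
 0.1
 0.4
 0.5
 0.2
 0.8
 0.7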

Non-probabilistic regression

using StatisticalMeasures

y = [0.1, -0.2, missing, 0.7]
ŷ = [-0.2, 0.1, 0.4, 0.7]
rsquared(ŷ, y)
0.5789473684210524
weights = [1, 3, 2, 5]
rms(ŷ, y, weights)
0.30000000000000004
measurements(LPLoss(p=2.5), ŷ, y, weights)
4-element Vector{Union{Missing, Float64}}:
 0.049295030175464966
 0.1478850905263949
  missing
 0.0

Here's an example of a multi-target regression measure, for data with 3 observations of a 2-component target:

# last index is observation index:
y = [1 2 3; 2 4 6]
ŷ = [2 3 4; 4 6 8]
weights = [8, 7, 6]
ŷ - y
2×3 Matrix{Int64}:
 1  1  1
 2  2  2
MultitargetLPLoss(p=2.5)(ŷ, y, weights)
23.29898987322333
# one "atomic weight" per component of target:
MultitargetLPLoss(p=2.5, atomic_weights = [1, 10])(ŷ, y, weights)
201.4898987322333

Some tabular formats (e.g., DataFrame) are also supported:

using Tables
t = y' |> Tables.table |> Tables.rowtable
t̂ = ŷ' |> Tables.table |> Tables.rowtable
MultitargetLPLoss(p=2.5)(t̂, t, weights)
23.29898987322333

Probabilistic regression
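
Here each prediction is a distribution from Distributions.jl, one per observation, as the example below illustrates: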

using StatisticalMeasures
import Distributions: Poisson, Normal
import Random.seed!
seed!()

y = rand(20)
ŷ = [Normal(rand(), 0.5) for i in 1:20]
ŷ[1]
Distributions.Normal{Float64}(μ=0.5015431521573479, σ=0.5)
log_loss(ŷ, y)
0.4504182739554906
weights = rand(20)
log_loss(ŷ, y, weights)
0.25880837386663325
weights = rand(20)
measurements(log_loss, ŷ, y, weights)
20-element Vector{Float64}:
 0.2917898365240672
 0.3664724439639467
 0.5909333010607659
 0.1985005077072399
 0.022487809070239242
 0.19227222448718267
 0.011362539794069986
 0.027067922361291977
 0.9983979779165483
 0.10821003183823857
 0.03394455270607971
 0.10795288969219827
 0.2188768895455741
 0.2650340179368862
 0.08330620923654473
 0.06133103675865898
 0.17247038976874973
 0.14910575372820814
 0.1837909919722805
 0.7786967749698127

An example with Count (integer) data:

y = rand(1:10, 20)
ŷ = [Poisson(10*rand()) for i in 1:20]
ŷ[1]
Distributions.Poisson{Float64}(λ=6.471248879459091)
brier_loss(ŷ, y)
0.0036111113539912033

Custom multi-target measures

Here's an example of constructing a multi-target classification measure, for data with 3 observations of a 2-component target:

using StatisticalMeasures

# last index is observation index:
y = ["X" "O" "O"; "O" "X" "X"]
ŷ = ["O" "X" "O"; "O" "O" "O"]
2×3 Matrix{String}:
 "O"  "X"  "O"
 "O"  "O"  "O"
# atomic weights, if provided, number one per component of the target:
multitarget_accuracy = multimeasure(accuracy, atomic_weights=[1, 2])
multitarget_accuracy(ŷ, y)
0.5
measurements(multitarget_accuracy, ŷ, y)
3-element Vector{Float64}:
 1.0
 0.0
 0.5
# one weight per observation:
weights = [1, 2, 10]
measurements(multitarget_accuracy, ŷ, y, weights)
3-element Vector{Float64}:
 1.0
 0.0
 5.0

See multimeasure for options. Refer to the StatisticalMeasuresBase.jl documentation for advanced measure customization.

Using losses from LossFunctions.jl

The margin losses in LossFunctions.jl can be regarded as binary probabilistic measures, but they cannot be called directly on CategoricalValues and UnivariateFinite distributions, as similar measures provided by StatisticalMeasures can be (see Probabilistic classification above). To get this latter behavior, we need to wrap these losses using Measure:

using StatisticalMeasures
import LossFunctions as LF

loss = Measure(LF.L1HingeLoss())
Measure(LossFunctions.L1HingeLoss())

The wrapped loss can still only be called on scalars (true of all LossFunctions.jl losses since v0.10):

using CategoricalArrays
using CategoricalDistributions

y = categorical(["X", "O", "X", "X"], ordered=true)
X_probs = [0.3, 0.2, 0.4, 0.9]
ŷ = UnivariateFinite(["O", "X"], X_probs, augment=true, pool=y)
loss(ŷ[1], y[1])
1.4

This is remedied with the multimeasure wrapper:

import StatisticalMeasuresBase.Sum

loss_on_vectors = multimeasure(loss, mode=Sum())
loss_on_vectors(ŷ, y)
0.8
class_weights = Dict("X"=>1, "O"=>10)
loss_on_vectors(ŷ, y, class_weights)
1.6999999999999997
measurements(loss_on_vectors, ŷ, y)
4-element Vector{Float64}:
 1.4
 0.3999999999999999
 1.2
 0.19999999999999996

Wrap again, as shown in the preceding section, to get a multi-target version.
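
Continuing the session above, such a double wrap might look like this (a minimal sketch; the name multitarget_loss is illustrative):

multitarget_loss = multimeasure(loss_on_vectors)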

For distance-based loss functions, wrapping in Measure is not strictly necessary, but does no harm.
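
For illustration, here is a sketch, assuming LossFunctions.jl v0.10 or higher (whose losses accept scalar arguments):

import LossFunctions as LF

# distance-based losses can be called directly on real scalars:
LF.L2DistLoss()(0.9, 1.0)
# wrapping is optional here, but harmless:
Measure(LF.L2DistLoss())(0.9, 1.0)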

Measure search (experimental feature)
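
Matching measures can be searched for, on the basis of sample data and measure traits, or on the basis of a string, as these examples illustrate: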

using StatisticalMeasures
using ScientificTypes

y = rand(3)
ŷ = rand(3)
options = measures(ŷ, y, supports_weights=true)
LittleDict{Any, Any, Vector{Any}, Vector{Any}} with 8 entries:
  LPLoss                              => (aliases = ("l1", "l2", "mae", "mav", …
  LPSumLoss                           => (aliases = ("l1_sum", "l2_sum"), consu…
  RootMeanSquaredError                => (aliases = ("rms", "rmse", "root_mean_…
  RootMeanSquaredLogError             => (aliases = ("rmsl", "rmsle", "root_mea…
  RootMeanSquaredLogProportionalError => (aliases = ("rmslp1",), consumes_multi…
  RootMeanSquaredProportionalError    => (aliases = ("rmsp",), consumes_multipl…
  MeanAbsoluteProportionalError       => (aliases = ("mape",), consumes_multipl…
  LogCoshLoss                         => (aliases = ("log_cosh", "log_cosh_loss…
options[LPLoss]
(aliases = ("l1", "l2", "mae", "mav", "mean_absolute_error", "mean_absolute_value"), consumes_multiple_observations = true, can_report_unaggregated = true, kind_of_proxy = LearnAPI.LiteralTarget(), observation_scitype = Union{Missing, Infinite}, can_consume_tables = false, supports_weights = true, supports_class_weights = true, orientation = Loss(), external_aggregation_mode = Mean(), human_name = "``L^p`` loss")
measures("Matthew")
LittleDict{Any, Any, Vector{Any}, Vector{Any}} with 1 entry:
  MatthewsCorrelation => (aliases = ("matthews_correlation", "mcc"), consumes_m…