Examples of Usage
Calling syntax
A measure m is called with this syntax:
m(ŷ, y)
m(ŷ, y, weights)
m(ŷ, y, class_weights::AbstractDict)
m(ŷ, y, weights, class_weights)
where y is the ground truth and ŷ the predictions. This package provides measure constructors, such as BalancedAccuracy:
using StatisticalMeasures
m = BalancedAccuracy(adjusted=true)
m(["O", "X", "O", "X"], ["X", "X", "X", "O"], [1, 2, 1, 2])
-0.5
Aliases are provided for commonly applied instances:
bacc == BalancedAccuracy() == BalancedAccuracy(adjusted=false)
true
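Since an alias is just a pre-built measure instance, it can be called like any other measure. For instance, applied to the data above we would expect (a small illustration, not part of the original examples):
bacc(["O", "X", "O", "X"], ["X", "X", "X", "O"])  # mean of per-class recalls: (1/3 + 0)/2 ≈ 0.167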
Contents
- Binary classification
- Multi-class classification
- Probabilistic classification
- Non-probabilistic regression
- Probabilistic regression
- Custom multi-target measures
- Using losses from LossFunctions.jl
- Measure search (experimental feature)
Binary classification
using StatisticalMeasures
using CategoricalArrays
# ground truth:
y = categorical(
["X", "X", "X", "O", "X", "X", "O", "O", "X"],
ordered=true,
)
# prediction:
ŷ = categorical(
["O", "X", "O", "X", "O", "O", "O", "X", "X"],
levels=levels(y),
ordered=true,
)
accuracy(ŷ, y)
0.3333333333333333
weights = [1, 2, 1, 2, 1, 2, 1, 2, 1]
accuracy(ŷ, y, weights)
0.4444444444444444
class_weights = Dict("X" => 10, "O" => 1)
accuracy(ŷ, y, class_weights)
2.3333333333333335
accuracy(ŷ, y, weights, class_weights)
3.4444444444444446
To get a measurement for each individual observation, use measurements:
measurements(accuracy, ŷ, y, weights, class_weights)
9-element Vector{Int64}:
0
20
0
0
0
0
1
0
10
kappa(ŷ, y)
-0.28571428571428564
mat = confmat(ŷ, y)
┌─────────────┐
│Ground Truth │
┌─────────┼──────┬──────┤
│Predicted│ O │ X │
├─────────┼──────┼──────┤
│ O │ 1 │ 4 │
├─────────┼──────┼──────┤
│ X │ 2 │ 2 │
└─────────┴──────┴──────┘
Some measures can be applied directly to confusion matrices:
kappa(mat)
-0.28571428571428564
Multi-class classification
using StatisticalMeasures
using CategoricalArrays
import Random
Random.seed!()
y = rand("ABC", 1000) |> categorical
ŷ = rand("ABC", 1000) |> categorical
class_weights = Dict('A' => 1, 'B' =>2, 'C' => 10)
MulticlassFScore(beta=0.5, average=MacroAvg())(ŷ, y, class_weights)
1.3275622571655565
MulticlassFScore(beta=0.5, average=NoAvg())(ŷ, y, class_weights)
LittleDict{CategoricalArrays.CategoricalValue{Char, UInt32}, Float64, Tuple{CategoricalArrays.CategoricalValue{Char, UInt32}, CategoricalArrays.CategoricalValue{Char, UInt32}, CategoricalArrays.CategoricalValue{Char, UInt32}}, Tuple{Float64, Float64, Float64}} with 3 entries:
'A' => 0.322289
'B' => 0.596745
'C' => 3.06365
Unseen classes are tracked when using CategoricalArrays, as here:
# find 'C'-free indices
mask = y .!= 'C' .&& ŷ .!= 'C';
# remove observations with the 'C' class:
y = y[mask]
ŷ = ŷ[mask]
'C' in y ∪ ŷ
false
confmat(ŷ, y)
┌──────────────┐
│ Ground Truth │
┌─────────┼────┬────┬────┤
│Predicted│ A │ B │ C │
├─────────┼────┼────┼────┤
│ A │107 │112 │ 0 │
├─────────┼────┼────┼────┤
│ B │112 │ 99 │ 0 │
├─────────┼────┼────┼────┤
│ C │ 0 │ 0 │ 0 │
└─────────┴────┴────┴────┘
Probabilistic classification
To mitigate ambiguity around representations of predicted probabilities, a probabilistic prediction of categorical data is expected to be represented by a UnivariateFinite distribution, from the package CategoricalDistributions.jl. This is the form delivered, for example, by MLJ classification models.
using StatisticalMeasures
using CategoricalArrays
using CategoricalDistributions
y = categorical(["X", "O", "X", "X", "O", "X", "X", "O", "O", "X"], ordered=true)
X_probs = [0.3, 0.2, 0.4, 0.9, 0.1, 0.4, 0.5, 0.2, 0.8, 0.7]
ŷ = UnivariateFinite(["O", "X"], X_probs, augment=true, pool=y)
ŷ[1]
UnivariateFinite{OrderedFactor{2}}(O=>0.7, X=>0.3)
auc(ŷ, y)
0.7916666666666666
measurements(log_loss, ŷ, y)
10-element Vector{Float64}:
1.2039728043259361
0.2231435513142097
0.916290731874155
0.10536051565782628
0.10536051565782628
0.916290731874155
0.6931471805599453
0.2231435513142097
1.6094379124341005
0.35667494393873245
measurements(brier_score, ŷ, y)
10-element Vector{Float64}:
-0.9800000000000001
-0.08000000000000007
-0.72
-0.020000000000000018
-0.020000000000000018
-0.72
-0.5
-0.08000000000000007
-1.2800000000000002
-0.18000000000000016
We note in passing that the mode and pdf methods can be applied to UnivariateFinite distributions. So, for example, we can do:
confmat(mode.(ŷ), y)
┌─────────────┐
│Ground Truth │
┌─────────┼──────┬──────┤
│Predicted│ O │ X │
├─────────┼──────┼──────┤
│ O │ 3 │ 4 │
├─────────┼──────┼──────┤
│ X │ 1 │ 2 │
└─────────┴──────┴──────┘
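Similarly, pdf can be used to recover the predicted probabilities themselves. A small sketch, using the ŷ defined above:
# probability assigned by each prediction to the class "X":
pdf.(ŷ, "X")   # recovers X_probs, e.g. pdf(ŷ[1], "X") == 0.3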
Non-probabilistic regression
using StatisticalMeasures
y = [0.1, -0.2, missing, 0.7]
ŷ = [-0.2, 0.1, 0.4, 0.7]
rsquared(ŷ, y)
0.5789473684210524
weights = [1, 3, 2, 5]
rms(ŷ, y, weights)
0.30000000000000004
measurements(LPLoss(p=2.5), ŷ, y, weights)
4-element Vector{Union{Missing, Float64}}:
0.049295030175464966
0.1478850905263949
missing
0.0
Here's an example of a multi-target regression measure, for data with 3 observations of a 2-component target:
# last index is observation index:
y = [1 2 3; 2 4 6]
ŷ = [2 3 4; 4 6 8]
weights = [8, 7, 6]
ŷ - y
2×3 Matrix{Int64}:
1 1 1
2 2 2
MultitargetLPLoss(p=2.5)(ŷ, y, weights)
23.29898987322333
# one "atomic weight" per component of target:
MultitargetLPLoss(p=2.5, atomic_weights = [1, 10])(ŷ, y, weights)
201.4898987322333
Some tabular formats (e.g., DataFrame) are also supported:
using Tables
t = y' |> Tables.table |> Tables.rowtable
t̂ = ŷ' |> Tables.table |> Tables.rowtable
MultitargetLPLoss(p=2.5)(t̂, t, weights)
23.29898987322333
Probabilistic regression
using StatisticalMeasures
import Distributions: Poisson, Normal
import Random.seed!
seed!()
y = rand(20)
ŷ = [Normal(rand(), 0.5) for i in 1:20]
ŷ[1]
Distributions.Normal{Float64}(μ=0.5015431521573479, σ=0.5)
log_loss(ŷ, y)
0.4504182739554906
weights = rand(20)
log_loss(ŷ, y, weights)
0.25880837386663325
weights = rand(20)
measurements(log_loss, ŷ, y, weights)
20-element Vector{Float64}:
0.2917898365240672
0.3664724439639467
0.5909333010607659
0.1985005077072399
0.022487809070239242
0.19227222448718267
0.011362539794069986
0.027067922361291977
0.9983979779165483
0.10821003183823857
0.03394455270607971
0.10795288969219827
0.2188768895455741
0.2650340179368862
0.08330620923654473
0.06133103675865898
0.17247038976874973
0.14910575372820814
0.1837909919722805
0.7786967749698127
An example with Count (integer) data:
y = rand(1:10, 20)
ŷ = [Poisson(10*rand()) for i in 1:20]
ŷ[1]
Distributions.Poisson{Float64}(λ=6.471248879459091)
brier_loss(ŷ, y)
0.0036111113539912033
Custom multi-target measures
Here's an example of constructing a multi-target classification measure, for data with 3 observations of a 2-component target:
using StatisticalMeasures
# last index is observation index:
y = ["X" "O" "O"; "O" "X" "X"]
ŷ = ["O" "X" "O"; "O" "O" "O"]
2×3 Matrix{String}:
"O" "X" "O"
"O" "O" "O"
# if specified, we need one "atomic weight" per component of the target:
multitarget_accuracy = multimeasure(accuracy, atomic_weights=[1, 2])
multitarget_accuracy(ŷ, y)
0.5
measurements(multitarget_accuracy, ŷ, y)
3-element Vector{Float64}:
1.0
0.0
0.5
# one weight per observation:
weights = [1, 2, 10]
measurements(multitarget_accuracy, ŷ, y, weights)
3-element Vector{Float64}:
1.0
0.0
5.0
See multimeasure for options. Refer to the StatisticalMeasuresBase.jl documentation for advanced measure customization.
Using losses from LossFunctions.jl
The margin losses in LossFunctions.jl can be regarded as binary probabilistic measures, but they cannot be called directly on CategoricalValues and UnivariateFinite distributions, as we do for similar measures provided by StatisticalMeasures (see Probabilistic classification above). If we want this latter behavior, then we need to wrap these losses using Measure:
using StatisticalMeasures
import LossFunctions as LF
loss = Measure(LF.L1HingeLoss())
Measure(LossFunctions.L1HingeLoss())
The wrapped loss can only be called on scalar observations (true of all LossFunctions.jl losses since version 0.10):
using CategoricalArrays
using CategoricalDistributions
y = categorical(["X", "O", "X", "X"], ordered=true)
X_probs = [0.3, 0.2, 0.4, 0.9]
ŷ = UnivariateFinite(["O", "X"], X_probs, augment=true, pool=y)
loss(ŷ[1], y[1])
1.4
This is remedied with the multimeasure wrapper:
import StatisticalMeasuresBase.Sum
loss_on_vectors = multimeasure(loss, mode=Sum())
loss_on_vectors(ŷ, y)
0.8
class_weights = Dict("X"=>1, "O"=>10)
loss_on_vectors(ŷ, y, class_weights)
1.6999999999999997
measurements(loss_on_vectors, ŷ, y)
4-element Vector{Float64}:
1.4
0.3999999999999999
1.2
0.19999999999999996
Wrap again, as shown in the preceding section, to get a multi-target version.
For distance-based loss functions, wrapping in Measure is not strictly necessary, but does no harm.
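For instance, a minimal sketch of these two points (hypothetical, but following the pattern of the preceding sections):
# wrap once more, so the measure consumes matrices whose columns are
# observations and whose rows are target components:
multitarget_loss = multimeasure(loss_on_vectors)
# a distance-based loss may also be wrapped, although this is optional:
loss2 = Measure(LF.L2DistLoss())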
Measure search (experimental feature)
using StatisticalMeasures
using ScientificTypes
y = rand(3)
yhat = rand(3)
options = measures(yhat, y, supports_weights=true)
LittleDict{Any, Any, Vector{Any}, Vector{Any}} with 8 entries:
LPLoss => (aliases = ("l1", "l2", "mae", "mav", …
LPSumLoss => (aliases = ("l1_sum", "l2_sum"), consu…
RootMeanSquaredError => (aliases = ("rms", "rmse", "root_mean_…
RootMeanSquaredLogError => (aliases = ("rmsl", "rmsle", "root_mea…
RootMeanSquaredLogProportionalError => (aliases = ("rmslp1",), consumes_multi…
RootMeanSquaredProportionalError => (aliases = ("rmsp",), consumes_multipl…
MeanAbsoluteProportionalError => (aliases = ("mape",), consumes_multipl…
LogCoshLoss => (aliases = ("log_cosh", "log_cosh_loss…
options[LPLoss]
(aliases = ("l1", "l2", "mae", "mav", "mean_absolute_error", "mean_absolute_value"), consumes_multiple_observations = true, can_report_unaggregated = true, kind_of_proxy = LearnAPI.LiteralTarget(), observation_scitype = Union{Missing, Infinite}, can_consume_tables = false, supports_weights = true, supports_class_weights = true, orientation = Loss(), external_aggregation_mode = Mean(), human_name = "``L^p`` loss")
measures("Matthew")
LittleDict{Any, Any, Vector{Any}, Vector{Any}} with 1 entry:
MatthewsCorrelation => (aliases = ("matthews_correlation", "mcc"), consumes_m…