StatisticalMeasuresBase.jl
A Julia package for building production-ready measures (metrics) for statistics and machine learning

The main idea

Here's an example of a simple statistical measure, applied to a pair of scalars:

l1(ŷ, y) = abs(ŷ - y)
y = 5 # ground truth
ŷ = 2 # prediction
l1(ŷ, y)
3

Wrappers provided in this package extend the functionality of such measures. For example:

using StatisticalMeasuresBase
L1 = multimeasure(supports_missings_measure(l1), mode=Sum())
y = [5, 6, missing]
ŷ = [6, 8, 7]
weights = [1, 3, 9]
L1(ŷ, y, weights) ≈ 1*l1(6, 5) + 3*l1(8, 6)
true

multitarget_L1 = multimeasure(L1, transform=vec∘collect)
# 3 observations (last index is observation index):
y = [1 2 3; 2 4 6]
ŷ = [2 3 4; 4 6 8]
multitarget_L1(ŷ, y, weights)
39

using DataFrames
# the same data as tables (transposed, so each row is one observation):
df = DataFrame(y', :auto)
df̂ = DataFrame(ŷ', :auto)
multitarget_L1(df̂, df, weights)
39
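To make the arithmetic of the first wrapped example concrete, here is a minimal plain-Julia sketch that reproduces the same number without any packages. The helper `weighted_l1_sum` is a hypothetical name introduced only for illustration; it mimics the observed result (observations with a missing value contribute nothing, the rest are weighted and summed) and is not how the package wrappers are implemented.

```julia
# Plain-Julia sketch, no packages needed.
# `weighted_l1_sum` is a hypothetical helper, not part of StatisticalMeasuresBase.
l1(ŷ, y) = abs(ŷ - y)

function weighted_l1_sum(ŷ, y, weights)
    total = 0
    for (ŷi, yi, wi) in zip(ŷ, y, weights)
        # Skip any observation with a missing prediction or ground truth.
        (ismissing(ŷi) || ismissing(yi)) && continue
        total += wi * l1(ŷi, yi)
    end
    return total
end

y = [5, 6, missing]
ŷ = [6, 8, 7]
weights = [1, 3, 9]

# Third observation is skipped; both sides equal 1*1 + 3*2 = 7.
weighted_l1_sum(ŷ, y, weights) == 1*l1(6, 5) + 3*l1(8, 6)
```

The wrapped measure `L1` gives the same value while also working on other containers, which a hand-written loop like this one would not do.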

Generate measurements for each observation with the measurements method:

measurements(multitarget_L1, df̂, df, weights)
3-element Vector{Int64}:
  3
  9
 27
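The three values above can be checked by hand. The following plain-Julia sketch assumes, as in the matrix example, that columns are observations and rows are targets; `per_observation` is a hypothetical helper written only for this check and is not part of the package.

```julia
# Plain-Julia sketch, no packages needed.
# `per_observation` is a hypothetical helper, not part of StatisticalMeasuresBase.
l1(ŷ, y) = abs(ŷ - y)

# For each observation (column), sum the l1 losses over targets (rows),
# then scale by that observation's weight.
function per_observation(ŷ::AbstractMatrix, y::AbstractMatrix, weights)
    return [weights[j] * sum(l1.(ŷ[:, j], y[:, j])) for j in axes(y, 2)]
end

y = [1 2 3; 2 4 6]
ŷ = [2 3 4; 4 6 8]
weights = [1, 3, 9]

m = per_observation(ŷ, y, weights)  # [3, 9, 27]
sum(m)                              # 39
```

Each column contributes an unweighted loss of 3, so the weights 1, 3, and 9 give 3, 9, and 27, whose sum is the aggregated value 39 reported earlier.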

Overview

This package specifies an interface for statistical measures (metrics) such as classical loss functions, confusion matrices, and proper scoring rules. It also provides wrappers for extending their functionality. It does not implement actual measures. For a package that does, based on this interface, see StatisticalMeasures.jl. The wrappers can also be applied to measures provided by other packages, such as LossFunctions.jl.

Specifically, this package provides:

  • A measure wrapper, multimeasure, that leverages MLUtils.jl to broadcast a simple measure over multiple observations; the main use case is extending a measure (e.g., a plain function) that consumes single observations to one that consumes vectors, arrays, or tables (multi-target measures).

  • Other wrappers to add missing value support, argument checks, or to silently treat unsupported weights as uniform (useful when applying a batch of measures with mixed degrees of weight support).

  • measurements, a method to return unaggregated (per-observation) measurements.

  • A number of optional traits to articulate contracts useful for client packages; for example, optimization packages may only work with measures that overload the orientation trait.

  • aggregate, a multipurpose measurement aggregation tool.

  • Technical tools for implementing new measures, such as CompositeWeights, which combines per-observation weights and class weights into a single iterator.
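As an illustration of the "treat unsupported weights as uniform" idea from the list above, here is a minimal plain-Julia sketch. The name `with_uniform_fallback` and the two toy measures are hypothetical and introduced only for this example; the sketch shows the general pattern and is not the package's own wrapper or its implementation.

```julia
# Plain-Julia sketch, no packages needed.
# `with_uniform_fallback` is a hypothetical name, not the package's API.
with_uniform_fallback(measure) = function (ŷ, y, weights...)
    # Pass the weights on only if the wrapped measure has a method accepting them.
    applicable(measure, ŷ, y, weights...) && return measure(ŷ, y, weights...)
    # Otherwise drop them silently, which amounts to uniform weighting.
    return measure(ŷ, y)
end

# A toy measure with no weight support:
unweighted(ŷ, y) = sum(abs.(ŷ .- y))
# A toy measure with weight support:
weighted(ŷ, y, w=ones(length(y))) = sum(w .* abs.(ŷ .- y))

ŷ, y, w = [2, 4], [1, 2], [10, 100]

with_uniform_fallback(unweighted)(ŷ, y, w)  # weights ignored: 1 + 2 = 3
with_uniform_fallback(weighted)(ŷ, y, w)    # weights applied: 10*1 + 100*2 = 210
```

With such a fallback, a batch of measures can be called with one uniform signature even when only some of them accept weights.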

Contents