Tools for Implementers

| method | purpose |
|--------|---------|
| @trait | syntactic sugar for declaring traits |
| @fix_show | improve display of a measure |
| aggregate | for explicit aggregation if multimeasure is not fit for purpose |
| StatisticalMeasuresBase.skipinvalid | skip NaN and missing values |
| StatisticalMeasuresBase.check_numobs(y1, y2) | check y1 and y2 have the same number of observations |
| StatisticalMeasuresBase.check_pools(y1, y2) | check y1 and y2 have the same class pools |
| StatisticalMeasuresBase.check_weight_support | check if a measure supports specified weights |
| StatisticalMeasuresBase.CompositeWeights | combine weights and class_weights into a single iterator |
| StatisticalMeasuresBase.weighted | broadcast weights over observations without aggregation |
| StatisticalMeasuresBase.Wrapper{M} | the abstract type for measure wrappers |
| @combination | generate multiple measures from a single scalar function |

Table of convenience methods available for new measure type implementations

Reference

StatisticalMeasuresBase.@trait - Macro
@trait SomeMeasureType trait1=value1 trait2=value2 ...

Declare SomeMeasureType a type whose instances are measures, and overload the specified traits to have the given values on all such instances.

For example, if AUC is a type, then

@trait AUC orientation = Score() supports_weights = true

is equivalent to the declarations

StatisticalMeasuresBase.is_measure(::AUC) = true
StatisticalMeasuresBase.orientation(::AUC) = Score()
StatisticalMeasuresBase.supports_weights(::AUC) = true

StatisticalMeasuresBase.@fix_show - Macro
@fix_show constructor::T

Overload Base.show to get a human-readable display of all objects of the form constructor(args...; kwargs...), given an upper bound T for the type of such objects. The type T must not be a supertype of the type of any other objects.

Example

Consider this definition of a constructor LP:

import StatisticalMeasuresBase as API
using StatisticalMeasuresBase

struct LPOnScalars{T}
    p::T
end
# make instances of LPOnScalars callable, so that `measure.p` refers to the field
(measure::LPOnScalars)(yhat, y) = abs(yhat - y)^(measure.p)

LP(; p=2) = multimeasure(LPOnScalars(p))

julia> LP()
multimeasure(LPOnScalars{Int64}(2))

The default display above shows the wrapped object rather than the LP constructor. We fix this as follows:

LPType = API.Multimeasure{<:LPOnScalars}
@fix_show LP::LPType

julia> LP()
LP(
  p = 2)

StatisticalMeasuresBase.skipinvalid - Function
skipinvalid(itr; skipnan=true)

Return an iterator over the elements in itr, skipping missing and NaN values. Behavior is similar to skipmissing.

If skipnan=false, then skipinvalid is equivalent to skipmissing.


skipinvalid(A, B; skipnan=true)

For vectors A and B of the same length, return a tuple of vectors (A[mask], B[mask]) where mask[i] is true if and only if A[i] and B[i] are both valid (neither missing nor NaN). Can also be called on other iterators of matching length, such as arrays, but always returns a vector. Does not remove Missing from the element types if present in the original iterators.

If skipnan=false, then NaN values are treated as valid and retained; only missing values are skipped.
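
The masking rule for the two-argument method can be sketched in plain Julia. This is an illustration of the documented behavior only, not the package source; is_valid and skipinvalid_sketch are hypothetical names introduced here.

```julia
# Hypothetical helper: an element is valid if it is not `missing` and,
# when `skipnan` is true, not a floating-point NaN.
is_valid(x; skipnan=true) =
    !ismissing(x) && !(skipnan && x isa AbstractFloat && isnan(x))

# Hypothetical stand-in for the two-argument `skipinvalid`; always returns vectors.
function skipinvalid_sketch(A, B; skipnan=true)
    a, b = vec(collect(A)), vec(collect(B))
    length(a) == length(b) ||
        throw(DimensionMismatch("A and B must have the same length"))
    # keep index i only if both a[i] and b[i] are valid
    mask = [is_valid(x; skipnan) && is_valid(y; skipnan) for (x, y) in zip(a, b)]
    return a[mask], b[mask]
end

A = [1.0, missing, 3.0, NaN]
B = [10.0, 20.0, missing, 40.0]

skipinvalid_sketch(A, B)                 # ([1.0], [10.0])
skipinvalid_sketch(A, B; skipnan=false)  # ([1.0, NaN], [10.0, 40.0])
```

In this sketch the returned vectors keep the element type Union{Missing, Float64}, in line with the remark that Missing is not removed from the element types.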

StatisticalMeasuresBase.check_numobs - Function
check_numobs(X, Y)

Check if two objects X and Y supporting the MLUtils.jl numobs interface have the same number of observations. If they don't, throw an exception.

StatisticalMeasuresBase.check_pools - Function
check_pools(A, B)

If A and B are both CategoricalArrays (or views thereof) check they have the same class pool. If both A and B are ordered, check the pools have the same ordering.

If B is an abstract dictionary, check the key set of B agrees with the class pool of A, in the case A is a CategoricalArray. Otherwise, check it agrees with unique(skipmissing(A)).

Otherwise perform no checks.

If a check fails throw an exception.
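
For the dictionary case with a non-categorical A, the check might look roughly like the sketch below. It assumes that "agrees with" means equality of sets, which the text above does not state explicitly; check_keys_sketch is a hypothetical name and this is not the package implementation.

```julia
# Hypothetical sketch: compare the keys of `B` with the distinct
# non-missing values of `A`, assuming "agrees" means set equality.
function check_keys_sketch(A, B::AbstractDict)
    classes = Set(unique(skipmissing(A)))
    Set(keys(B)) == classes ||
        throw(ArgumentError("keys of B do not match the classes observed in A"))
    return nothing
end

y = ["a", missing, "b", "b"]
check_keys_sketch(y, Dict("a" => 2, "b" => 1))  # passes silently
# check_keys_sketch(y, Dict("a" => 2))          # would throw an ArgumentError
```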

StatisticalMeasuresBase.check_weight_support - Function
check_weight_support(measure, weight_args...)

Check if measure supports calls of the form measure(ŷ, y, weight_args...). Will always accept nothing as one or both weight arguments. A failed check throws an exception.

StatisticalMeasuresBase.CompositeWeights - Type
StatisticalMeasuresBase.CompositeWeights(y)
StatisticalMeasuresBase.CompositeWeights(y, weights)
StatisticalMeasuresBase.CompositeWeights(y, weights, class_weights)
StatisticalMeasuresBase.CompositeWeights(y, class_weights::AbstractDict)

Return an iterator which combines, with ordinary multiplication, the specified weights and class_weights, given a target y.

y = ["a", "b", "b", "b"]
weights = [1, 2, 3, 4]
class_weights = Dict("a"=>2, "b"=>1)
combined = StatisticalMeasuresBase.CompositeWeights(y, weights, class_weights)

julia> collect(combined)
4-element Vector{Any}:
 2
 2
 3
 4

Omitted or nothing weights/class_weights are interpreted as uniform. Unless nothing, the length of weights is expected to be (at least) the number of observations in y and the keys of class_weights should include all values of y.

Class weights transform missing values of y to zeros.

y = [missing, "a", "b", "b", "b"]
class_weights = Dict("a"=>2, "b"=>1)
combined = StatisticalMeasuresBase.CompositeWeights(y, class_weights)

julia> collect(combined)
5-element Vector{Int64}:
 0
 2
 1
 1
 1
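
The arithmetic behind both examples can be reproduced in plain Julia. The helper composite_weights_sketch below is hypothetical and eager (it returns a vector rather than an iterator); it is meant only to spell out the multiplication rule described above.

```julia
# Hypothetical eager version of the rule: per-observation weight times the
# class weight of the observed class, with `nothing` treated as uniform
# and a `missing` target contributing a class weight of zero.
function composite_weights_sketch(y; weights=nothing, class_weights=nothing)
    map(eachindex(y)) do i
        w = weights === nothing ? 1 : weights[i]
        c = if class_weights === nothing
            1
        elseif ismissing(y[i])
            0
        else
            class_weights[y[i]]
        end
        w * c
    end
end

class_weights = Dict("a" => 2, "b" => 1)

composite_weights_sketch(["a", "b", "b", "b"];
    weights=[1, 2, 3, 4], class_weights)            # [2, 2, 3, 4]
composite_weights_sketch([missing, "a", "b", "b", "b"];
    class_weights)                                  # [0, 2, 1, 1, 1]
```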

StatisticalMeasuresBase.weighted - Function
weighted([f, ] itr; weights=nothing, mode=Mean(), skipnan=false)

This method takes the same arguments and keyword arguments as aggregate but only multiplies the iterator by any specified weights and collects. In the special case mode=RootMean(p), the weights are first replaced by their pth roots, for consistency with how aggregation works in that case.

See also aggregate
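
The reason for taking pth roots can be checked numerically. The sketch below assumes that aggregation with mode=RootMean(p) takes the pth root of the mean of the weighted pth powers; that formula is an assumption made here for illustration, and the code uses only Base Julia, not the package.

```julia
m = [1.0, 2.0, 3.0]   # unaggregated measurements
w = [0.5, 1.0, 2.0]   # per-observation weights
p = 3

# Assumed form of weighted root-mean aggregation.
with_weights = (sum(w .* m .^ p) / length(m))^(1 / p)

# Scale each measurement by the pth root of its weight, then aggregate
# the result with no weights at all.
reweighted = w .^ (1 / p) .* m
without_weights = (sum(reweighted .^ p) / length(m))^(1 / p)

with_weights ≈ without_weights   # true
```

Since (w^(1/p) * m)^p equals w * m^p, aggregating the reweighted measurements without weights gives the same result as aggregating the original measurements with weights.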

StatisticalMeasuresBase.@combination - Macro
@combination SomeMeasure() = multimeasure(f)
@combination SomeMeasure() = multimeasure(f, mode=...)

Advanced tool for generating multiple measure constructors from a single scalar function, (ŷ, y) -> f(ŷ, y). See "Enhancements" below for a variation for parameterized functions.

Assuming f(yhat, y) is an ordinary function with scalar arguments, the above calls act more or less as if @combination were absent, but with the following differences and additional actions:

1. A new concrete measure type SomeMeasureOnScalars is added: If sm = SomeMeasureOnScalars(), then sm(yhat, y) = f(yhat, y).

2. Specifically, we have

SomeMeasure() = multimeasure(
    supports_missings_measure(sm),
    mode=mode,
) |> robust_measure |> fussy_measure

so that missing scalar elements are supported, relevant argument checks are performed, and weight arguments can be nothing.

3. An additional multi-target constructor is defined:

MultitargetSomeMeasure(; atomic_weights=nothing) = multimeasure(
    multimeasure(supports_missings_measure(sm), mode=mode),
    atomic_weights=atomic_weights,
    transform=vec∘collect,
) |> robust_measure |> fussy_measure

This measure will have similar support for missing scalar elements and nothing weights, and will perform argument checks. It can consume some kinds of tables.

4. The show method for displaying both kinds of measure is made friendlier; see @fix_show.

Note that, by construction, measure = SomeMeasure() if and only if measure isa W{<:W{<:W{<:W{<:SomeMeasureOnScalars}}}}, where W = StatisticalMeasuresBase.Wrapper, and measure = MultitargetSomeMeasure(; atomic_weights=wts) for some wts if and only if measure isa W{<:W{<:W{<:W{<:W{<:SomeMeasureOnScalars}}}}}.
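
Each wrapping call adds one layer of Wrapper to the type, which is what the nested isa tests count. The toy types below (ToyWrapper, Inner, LayerA, LayerB) are hypothetical stand-ins, not the package's types; they only show how such a nested subtype test behaves in Julia.

```julia
# Hypothetical stand-in for an abstract wrapper type with one parameter.
abstract type ToyWrapper{M} end

struct Inner end                      # plays the role of SomeMeasureOnScalars

struct LayerA{M} <: ToyWrapper{M}     # one kind of wrapper
    atom::M
end

struct LayerB{M} <: ToyWrapper{M}     # another kind of wrapper
    atom::M
end

once  = LayerA(Inner())               # one layer around Inner
twice = LayerB(LayerA(Inner()))       # two layers around Inner

twice isa ToyWrapper{<:ToyWrapper{<:Inner}}   # true: exactly two layers
once  isa ToyWrapper{<:ToyWrapper{<:Inner}}   # false: only one layer
once  isa ToyWrapper{<:Inner}                 # true
```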

Enhancements

A single parameter can be added to the provided expression, corresponding to the third argument of f. Traits may also be declared, as they apply to SomeMeasure; these are appropriately lifted to MultitargetSomeMeasure, and dropped to SomeMeasureOnScalars.

Enhanced syntax example:

f(yhat, y, tol) = abs(yhat - y)/max(abs(y), tol)
@combination(
    ProportionalAbsoluteDifference(; tol=eps()) = multimeasure(f),
    observation_scitype = Continuous,     # becomes Union{Missing,Continuous}
    orientation=Loss(),
)

For further elucidation, see the documentation Tutorial.

Note

This macro is experimental and its behavior is subject to change in patch and minor releases.
