Tools for Implementers
| method | purpose |
|---|---|
| @trait | syntactic sugar for declaring traits |
| @fix_show | improve display of a measure |
| aggregate | for explicit aggregation, if multimeasure is not fit for purpose |
| StatisticalMeasuresBase.skipinvalid | skip NaN and missing values |
| StatisticalMeasuresBase.check_numobs(y1, y2) | check y1 and y2 have the same number of observations |
| StatisticalMeasuresBase.check_pools(y1, y2) | check y1 and y2 have the same class pools |
| StatisticalMeasuresBase.check_weight_support | check if a measure supports specified weights |
| StatisticalMeasuresBase.CompositeWeights | combine weights and class_weights into a single iterator |
| StatisticalMeasuresBase.weighted | broadcast weights over observations without aggregation |
| StatisticalMeasuresBase.Wrapper{M} | the abstract type for measure wrappers |
| @combination | generate multiple measures from a single scalar function |
Table of convenience methods available for new measure type implementations
Reference
StatisticalMeasuresBase.@trait — Macro
@trait SomeMeasureType trait1=value1 trait2=value2 ...
Declare SomeMeasureType a type whose instances are measures, and overload the specified traits to have the given values on all such instances.
For example, if AUC is a type, then
@trait AUC orientation = Score() supports_weights = true
is equivalent to the declarations
StatisticalMeasuresBase.is_measure(::AUC) = true
StatisticalMeasuresBase.orientation(::AUC) = Score()
StatisticalMeasuresBase.supports_weights(::AUC) = true
StatisticalMeasuresBase.@fix_show — Macro
@fix_show constructor::T
Overload Base.show to get a human-readable display of all objects of the form constructor(args...; kwargs...), given an upper bound T for the type of such objects, that does not supertype any other objects.
Example
Consider this definition of a constructor LP:
import StatisticalMeasuresBase as API
using StatisticalMeasuresBase
struct LPOnScalars{T}
    p::T
end
# make instances of LPOnScalars callable on scalar arguments
(measure::LPOnScalars)(yhat, y) = abs(yhat - y)^measure.p
LP(; p=2) = multimeasure(LPOnScalars(p))
julia> LP()
multimeasure(LPOnScalars{Int64}(2))
We fix this as follows:
LPType = API.Multimeasure{<:LPOnScalars}
@fix_show LP::LPType
julia> LP()
LP(
  p = 2)
StatisticalMeasuresBase.skipinvalid — Function
skipinvalid(itr; skipnan=true)
Return an iterator over the elements in itr, skipping missing and NaN values. Behavior is similar to skipmissing.
If skipnan=false, then skipinvalid is equivalent to skipmissing.
skipinvalid(A, B; skipnan=true)
For vectors A and B of the same length, return a tuple of vectors (A[mask], B[mask]) where mask[i] is true if and only if A[i] and B[i] are both valid (neither missing nor NaN). Can also be called on other iterators of matching length, such as arrays, but always returns a vector. Does not remove Missing from the element types if present in the original iterators.
If skipnan=false, then NaN values are treated as valid and are not skipped; only missing values are removed.
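The mask rule for the two-argument form can be pictured with a small sketch that uses only Base Julia. The helper names `isvalid_sketch` and `skipinvalid_sketch` are hypothetical, and this is not the package's implementation; it only mirrors the behavior described above, and only for vectors.

```julia
# Hypothetical helpers mirroring the documented mask rule; not the package's code.
# A value is "valid" if it is not `missing` and, when `skipnan=true`, not `NaN`.
isvalid_sketch(x; skipnan=true) =
    !ismissing(x) && !(skipnan && x isa Real && isnan(x))

function skipinvalid_sketch(A::AbstractVector, B::AbstractVector; skipnan=true)
    length(A) == length(B) || throw(DimensionMismatch("A and B differ in length"))
    # mask[i] is true only if both A[i] and B[i] are valid
    mask = [isvalid_sketch(a; skipnan=skipnan) && isvalid_sketch(b; skipnan=skipnan)
            for (a, b) in zip(A, B)]
    return A[mask], B[mask]
end

A = [1.0, missing, 3.0, NaN]
B = [10.0, 20.0, missing, 40.0]

a, b = skipinvalid_sketch(A, B)                    # only index 1 survives
a2, b2 = skipinvalid_sketch(A, B; skipnan=false)   # indices 1 and 4 survive
```

Because the result is obtained by indexing the original vectors, the element type here stays `Union{Missing, Float64}`, in line with the remark that `Missing` is not removed from the element types.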
StatisticalMeasuresBase.check_numobs — Function
check_numobs(X, Y)
Check if two objects X and Y supporting the MLUtils.jl numobs interface have the same number of observations. If they don't, throw an exception.
StatisticalMeasuresBase.check_pools — Function
check_pools(A, B)
If A and B are both CategoricalArrays (or views thereof), check they have the same class pool. If both A and B are ordered, check the pools have the same ordering.
If B is an abstract dictionary, check that the key set of B agrees with the class pool of A, in the case A is a CategoricalArray. Otherwise, check it agrees with unique(skipmissing(A)).
Otherwise, perform no checks.
If a check fails, throw an exception.
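For the dictionary case with a non-categorical A, the check described above amounts to a set comparison. A minimal Base-only sketch follows. The name `check_keys_sketch`, the reading of "agrees" as set equality, and the choice of `ArgumentError` are all assumptions made for illustration; none of them is taken from the package.

```julia
# Hypothetical helper: compares the keys of `B` with the distinct non-missing
# values of `A`, as described for the case where `A` is not a CategoricalArray.
function check_keys_sketch(A, B::AbstractDict)
    expected = Set(skipmissing(A))
    found = Set(keys(B))
    found == expected || throw(ArgumentError(
        "dictionary keys $(collect(found)) do not match observed classes $(collect(expected))"
    ))
    return nothing
end

y = ["a", missing, "b", "b"]
passed = check_keys_sketch(y, Dict("a" => 2, "b" => 1))   # no exception

# A dictionary lacking the key "b" fails the check:
failed = try
    check_keys_sketch(y, Dict("a" => 2))
    false
catch err
    err isa ArgumentError
end
```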
StatisticalMeasuresBase.check_weight_support — Function
check_weight_support(measure, weight_args...)
Check if measure supports calls of the form measure(ŷ, y, weight_args...). Will always accept nothing as one or both weight arguments. A failed check throws an exception.
StatisticalMeasuresBase.CompositeWeights — Type
StatisticalMeasuresBase.CompositeWeights(y)
StatisticalMeasuresBase.CompositeWeights(y, weights)
StatisticalMeasuresBase.CompositeWeights(y, weights, class_weights)
StatisticalMeasuresBase.CompositeWeights(y, class_weights::AbstractDict)
Return an iterator which combines, with ordinary multiplication, the specified weights and class_weights, given a target y.
y = ["a", "b", "b", "b"]
weights = [1, 2, 3, 4]
class_weights = Dict("a"=>2, "b"=>1)
combined = StatisticalMeasuresBase.CompositeWeights(y, weights, class_weights)
julia> collect(combined)
4-element Vector{Any}:
2
2
3
4
Omitted or nothing weights/class_weights are interpreted as uniform. Unless nothing, the length of weights is expected to be (at least) the number of observations in y and the keys of class_weights should include all values of y.
Class weights transform missing values of y to zeros.
y = [missing, "a", "b", "b", "b"]
class_weights = Dict("a"=>2, "b"=>1)
combined = StatisticalMeasuresBase.CompositeWeights(y, class_weights)
julia> collect(combined)
5-element Vector{Int64}:
0
2
1
1
1
StatisticalMeasuresBase.weighted — Function
weighted([f, ] itr; weights=nothing, mode=Mean(), skipnan=false)
This method takes the same arguments and keyword arguments as aggregate but only multiplies the iterator by any specified weights and collects. In the special case mode=RootMean(p), the weights are first replaced by their pth roots, for consistency with how aggregation works in that case.
See also aggregate
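The reason for taking pth roots can be checked numerically. The sketch below uses only Base Julia and assumes that weighted root-mean aggregation has the form (1/n Σ wᵢ xᵢ^p)^(1/p). That formula and the helper name `rootmean_sketch` are assumptions for illustration; they are not taken from the package.

```julia
# Assumed form of an unweighted generalized root mean (hypothetical helper).
rootmean_sketch(v, p) = (sum(v .^ p) / length(v))^(1 / p)

p = 3
x = [1.0, 2.0, 4.0]      # per-observation measurements
w = [0.5, 2.0, 1.0]      # per-observation weights

# Weights applied inside the power, as assumed for weighted aggregation:
direct = (sum(w .* x .^ p) / length(x))^(1 / p)

# Pre-multiplying by the pth roots of the weights, then aggregating unweighted:
preweighted = w .^ (1 / p) .* x
via_roots = rootmean_sketch(preweighted, p)

# Using the raw weights instead would, in general, give a different value:
via_raw = rootmean_sketch(w .* x, p)
```

Under the stated assumption, `direct` and `via_roots` agree up to floating-point error, while `via_raw` does not, which is the consistency the docstring refers to.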
StatisticalMeasuresBase.Wrapper — Type
Wrapper{M}
Abstract type for measure wrappers. Here M is the atomic measure type.
StatisticalMeasuresBase.@combination — Macro
@combination SomeMeasure() = multimeasure(f)
@combination SomeMeasure() = multimeasure(f, mode=...)
Advanced tool for generating multiple measure constructors from a single scalar function, (ŷ, y) -> f(ŷ, y). See "Enhancements" below for a variation for parameterized functions.
Assuming f(yhat, y) is an ordinary function with scalar arguments, the above calls behave more or less as if @combination were absent, but with the following differences and additional actions:
1. A new concrete measure type SomeMeasureOnScalars is added: If sm = SomeMeasureOnScalars(), then sm(yhat, y) = f(yhat, y).
2. Specifically, we have
SomeMeasure() = multimeasure(
    supports_missings_measure(sm),
    mode=mode,
) |> robust_measure |> fussy_measure
so that missing scalar elements are supported, relevant argument checks are performed, and weight arguments can be nothing.
3. An additional multi-target constructor is defined:
MultitargetSomeMeasure(; atomic_weights=nothing) = multimeasure(
    multimeasure(supports_missings_measure(sm), mode=mode),
    atomic_weights=atomic_weights,
    transform=vec∘collect,
) |> robust_measure |> fussy_measure
This measure will have similar support for missing scalar elements and nothing weights, and will perform argument checks. It can consume some kinds of tables.
4. The show method for displaying both kinds of measure is made friendlier; see @fix_show.
Note that, by construction, measure = SomeMeasure() if and only if measure isa W{<:W{<:W{<:W{<:SomeMeasureOnScalars}}}}, where W = StatisticalMeasuresBase.Wrapper, and measure = MultitargetSomeMeasure(; atomic_weights=wts) for some wts if and only if measure isa W{<:W{<:W{<:W{<:W{<:SomeMeasureOnScalars}}}}}.
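The nesting pattern in these type statements can be reproduced with stand-in types. In the sketch below every type name is hypothetical (the package's concrete wrapper types are not shown in this section); only the shape of the `isa` tests is meant to carry over.

```julia
# Stand-ins for illustration only; none of these are the package's own types.
abstract type W{M} end                      # plays the role of Wrapper{M}
struct AtomOnScalars end                    # plays the role of SomeMeasureOnScalars

struct MissingsLayer{M} <: W{M}; inner::M; end   # cf. supports_missings_measure
struct MultiLayer{M}    <: W{M}; inner::M; end   # cf. multimeasure
struct RobustLayer{M}   <: W{M}; inner::M; end   # cf. robust_measure
struct FussyLayer{M}    <: W{M}; inner::M; end   # cf. fussy_measure

atom = AtomOnScalars()

# Four layers, mirroring the SomeMeasure() construction:
single = atom |> MissingsLayer |> MultiLayer |> RobustLayer |> FussyLayer

# Five layers, mirroring MultitargetSomeMeasure (two multimeasure layers):
multi = atom |> MissingsLayer |> MultiLayer |> MultiLayer |> RobustLayer |> FussyLayer

Four = W{<:W{<:W{<:W{<:AtomOnScalars}}}}
Five = W{<:W{<:W{<:W{<:W{<:AtomOnScalars}}}}}

single isa Four    # true
multi isa Five     # true
single isa Five    # false: only four wrapper layers
multi isa Four     # false: the fourth layer wraps another wrapper, not the atom
```

The point of the sketch is that the number of nested `W{<:...}` levels equals the number of wrapping steps in the constructor, which is why the two kinds of measure can be told apart by type alone.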
Enhancements
A single parameter can be added to the provided expression, corresponding to the third argument of f. Traits may also be declared, as they apply to SomeMeasure; these are appropriately lifted to MultitargetSomeMeasure and dropped to SomeMeasureOnScalars.
Enhanced syntax example:
f(yhat, y, tol) = abs(yhat - y)/max(abs(y), tol)
@combination(
    ProportionalAbsoluteDifference(; tol=eps()) = multimeasure(f),
    observation_scitype = Continuous,  # becomes Union{Missing,Continuous}
    orientation = Loss(),
)
For further elucidation, see the documentation Tutorial.
This macro is experimental and its behavior is subject to change in patch and minor releases.