Tools for Implementers
method | purpose |
---|---|
@trait | syntactic sugar for declaring traits |
@fix_show | improve display of a measure |
aggregate | for explicit aggregation if multimeasure is not fit to purpose |
StatisticalMeasuresBase.skipinvalid | skip NaN and missing values |
StatisticalMeasuresBase.check_numobs (y1, y2) | check y1 and y2 have same number of observations |
StatisticalMeasuresBase.check_pools (y1, y2) | check y1 and y2 have the same class pools |
StatisticalMeasuresBase.check_weight_support | check if a measure supports specified weights |
StatisticalMeasuresBase.CompositeWeights | combine weights and class_weights into single iterator |
StatisticalMeasuresBase.weighted | broadcast weights over observations without aggregation |
StatisticalMeasuresBase.Wrapper{M} | the abstract type for measure wrappers |
@combination | generate multiple measures from a single scalar function |
Table of convenience methods available for new measure type implementations
Reference
StatisticalMeasuresBase.@trait
— Macro
@trait SomeMeasureType trait1=value1 trait2=value2 ...
Declare SomeMeasureType a type whose instances are measures, and overload the specified traits to have the given values on all such instances.
For example, if AUC is a type, then
@trait AUC orientation = Score() supports_weights = true
is equivalent to the declarations
StatisticalMeasuresBase.is_measure(::AUC) = true
StatisticalMeasuresBase.orientation(::AUC) = Score()
StatisticalMeasuresBase.supports_weights(::AUC) = true
StatisticalMeasuresBase.@fix_show
— Macro
@fix_show constructor::T
Overload Base.show to get a human-readable display of all objects of the form constructor(args...; kwargs...), given an upper bound T for the type of such objects, where T is not a supertype of the type of any other objects.
Example
Consider this definition of a constructor LP:
import StatisticalMeasuresBase as API
using StatisticalMeasuresBase
struct LPOnScalars{T}
    p::T
end
# make instances of LPOnScalars callable on scalar arguments
(measure::LPOnScalars)(yhat, y) = abs(yhat - y)^(measure.p)
LP(; p=2) = multimeasure(LPOnScalars(p))
julia> LP()
multimeasure(LPOnScalars{Int64}(2))
The default display shows the wrapped object rather than the constructor call that created it. We fix this as follows:
LPType = API.Multimeasure{<:LPOnScalars}
@fix_show LP::LPType
julia> LP()
LP(
  p = 2)
StatisticalMeasuresBase.skipinvalid
— Function
skipinvalid(itr; skipnan=true)
Return an iterator over the elements in itr, skipping missing and NaN values. Behavior is similar to skipmissing.
If skipnan=false, then skipinvalid is equivalent to skipmissing.
skipinvalid(A, B; skipnan=true)
For vectors A and B of the same length, return a tuple of vectors (A[mask], B[mask]), where mask[i] is true if and only if A[i] and B[i] are both valid (neither missing nor NaN). Can also be called on other iterators of matching length, such as arrays, but always returns a vector. Does not remove Missing from the element types if present in the original iterators.
If skipnan=false, then NaN values are treated as valid and only missing values are skipped.
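The mask semantics of the two-argument method can be illustrated with a minimal sketch that uses only Base. The names is_valid_entry and skipinvalid_sketch are hypothetical, introduced here for illustration; this models the behavior described above and is not the library's implementation.

```julia
# Hypothetical helper: an entry is valid if it is not `missing` and,
# when `skipnan` is true, not a NaN number either.
is_valid_entry(x; skipnan=true) =
    !ismissing(x) && !(skipnan && x isa Number && isnan(x))

# Hypothetical model of `skipinvalid(A, B; skipnan)`: keep only the
# positions where both entries are valid.
function skipinvalid_sketch(A, B; skipnan=true)
    length(A) == length(B) ||
        throw(DimensionMismatch("A and B must have the same length"))
    mask = [is_valid_entry(a; skipnan) && is_valid_entry(b; skipnan)
            for (a, b) in zip(A, B)]
    return collect(A)[mask], collect(B)[mask]
end

A = [1.0, NaN, 3.0, missing, 5.0]
B = [missing, 2.0, 3.0, 4.0, NaN]

a, b = skipinvalid_sketch(A, B)                    # only position 3 survives
a2, b2 = skipinvalid_sketch(A, B; skipnan=false)   # positions 2, 3 and 5 survive
```

In this sketch the element type of a still includes Missing, mirroring the remark that Missing is not removed from the element types.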
StatisticalMeasuresBase.check_numobs
— Function
check_numobs(X, Y)
Check if two objects X and Y supporting the MLUtils.jl numobs interface have the same number of observations. If they don't, throw an exception.
StatisticalMeasuresBase.check_pools
— Function
check_pools(A, B)
If A and B are both CategoricalArrays (or views thereof), check that they have the same class pool. If both A and B are ordered, also check that the pools have the same ordering.
If B is an abstract dictionary, check that the key set of B agrees with the class pool of A when A is a CategoricalArray, and otherwise with unique(skipmissing(A)).
In all other cases, perform no checks.
If a check fails, throw an exception.
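The dictionary case for a plain (non-categorical) vector can be sketched with Base alone. The name check_keys_sketch is hypothetical, and the sketch assumes that "agrees" means the two sets of classes are equal; the library's actual check may differ in detail.

```julia
# Hypothetical model of the dictionary branch of `check_pools` when `A`
# is not a CategoricalArray.
# Assumption: the key set of `B` must equal the set of non-missing values of `A`.
function check_keys_sketch(A, B::AbstractDict)
    classes = Set(skipmissing(A))
    Set(keys(B)) == classes ||
        throw(ArgumentError("dictionary keys do not match the classes observed in A"))
    return nothing
end

y = ["a", missing, "b", "b"]

# Passes: `missing` is not counted as a class.
ok = check_keys_sketch(y, Dict("a" => 2.0, "b" => 1.0))

# Fails: the class "b" has no entry in the dictionary.
failed = try
    check_keys_sketch(y, Dict("a" => 2.0))
    false
catch err
    err isa ArgumentError
end
```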
StatisticalMeasuresBase.check_weight_support
— Function
check_weight_support(measure, weight_args...)
Check if measure supports calls of the form measure(ŷ, y, weight_args...). Will always accept nothing as one or both weight arguments. A failed check throws an exception.
StatisticalMeasuresBase.CompositeWeights
— Type
StatisticalMeasuresBase.CompositeWeights(y)
StatisticalMeasuresBase.CompositeWeights(y, weights)
StatisticalMeasuresBase.CompositeWeights(y, weights, class_weights)
StatisticalMeasuresBase.CompositeWeights(y, class_weights::AbstractDict)
Return an iterator which combines, with ordinary multiplication, the specified weights and class_weights, given a target y.
y = ["a", "b", "b", "b"]
weights = [1, 2, 3, 4]
class_weights = Dict("a"=>2, "b"=>1)
combined = StatisticalMeasuresBase.CompositeWeights(y, weights, class_weights)
julia> collect(combined)
4-element Vector{Any}:
2
2
3
4
Omitted or nothing weights/class_weights are interpreted as uniform. Unless nothing, the length of weights is expected to be (at least) the number of observations in y, and the keys of class_weights should include all values of y.
Where a value of y is missing, the class weight is taken to be zero:
y = [missing, "a", "b", "b", "b"]
class_weights = Dict("a"=>2, "b"=>1)
combined = StatisticalMeasuresBase.CompositeWeights(y, class_weights)
julia> collect(combined)
5-element Vector{Int64}:
0
2
1
1
1
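The arithmetic behind both examples can be modelled with a short Base-only sketch. The function composite_weights_sketch and its keyword interface are hypothetical, and unlike CompositeWeights it returns a vector eagerly rather than an iterator; it is meant only to make the combination rule explicit.

```julia
# Hypothetical model of the combination rule: per-observation weight times
# class weight, with omitted weights treated as uniform (1) and `missing`
# targets given class weight 0 whenever class weights are supplied.
function composite_weights_sketch(y; weights=nothing, class_weights=nothing)
    return map(eachindex(y)) do i
        w = weights === nothing ? 1 : weights[i]
        c = if class_weights === nothing
            1
        elseif ismissing(y[i])
            0
        else
            class_weights[y[i]]
        end
        w * c
    end
end

y = ["a", "b", "b", "b"]
cw = Dict("a" => 2, "b" => 1)

r1 = composite_weights_sketch(y; weights=[1, 2, 3, 4], class_weights=cw)       # [2, 2, 3, 4]
r2 = composite_weights_sketch([missing, "a", "b", "b", "b"]; class_weights=cw) # [0, 2, 1, 1, 1]
r3 = composite_weights_sketch(y)                                               # [1, 1, 1, 1]
```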
StatisticalMeasuresBase.weighted
— Function
weighted([f, ] itr; weights=nothing, mode=Mean(), skipnan=false)
This method takes the same arguments and keyword arguments as aggregate, but only multiplies the iterator by any specified weights and collects. In the special case mode=RootMean(p), the weights are first replaced by their pth roots, for consistency with how aggregation works in that case.
See also aggregate.
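The weighting rule can be sketched with Base alone. Here weighted_sketch is a hypothetical stand-in, and the keyword p is an assumption standing in for the exponent of RootMean(p), with p = nothing modelling every other mode; the f and skipnan arguments are left out, and weights is assumed to have the same length as the data.

```julia
# Hypothetical model of `weighted`: multiply each value by its weight, after
# replacing the weights by their p-th roots when a root-mean exponent is given.
function weighted_sketch(itr; weights=nothing, p=nothing)
    vals = collect(itr)
    weights === nothing && return vals
    w = p === nothing ? weights : weights .^ (1 / p)
    return w .* vals
end

m = [3.0, 1.0]   # per-observation measurements
w = [4.0, 9.0]   # per-observation weights

plain = weighted_sketch(m; weights=w)         # weights applied directly
rooted = weighted_sketch(m; weights=w, p=2)   # square roots of the weights applied
```

In this sketch, raising rooted to the power p recovers w .* m .^ p elementwise, which is presumably the sense in which the rule is consistent with root-mean aggregation.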
StatisticalMeasuresBase.Wrapper
— Type
Wrapper{M}
Abstract type for measure wrappers. Here M is the atomic measure type.
StatisticalMeasuresBase.@combination
— Macro
@combination SomeMeasure() = multimeasure(f)
@combination SomeMeasure() = multimeasure(f, mode=...)
Advanced tool for generating multiple measure constructors from a single scalar function, (ŷ, y) -> f(ŷ, y). See "Enhancements" below for a variation for parameterized functions.
Assuming f(yhat, y) is an ordinary function with scalar arguments, the above calls act more or less as if @combination were absent, but with the following differences and additional actions:
1. A new concrete measure type SomeMeasureOnScalars is added: If sm = SomeMeasureOnScalars(), then sm(yhat, y) = f(yhat, y).
2. Specifically, we have
SomeMeasure() = multimeasure(
    supports_missings_measure(sm),
    mode=mode,
) |> robust_measure |> fussy_measure
so that missing scalar elements are supported, relevant argument checks are performed, and weight arguments can be nothing.
3. An additional multi-target constructor is defined:
MultitargetSomeMeasure(; atomic_weights=nothing) = multimeasure(
    multimeasure(supports_missings_measure(sm), mode=mode),
    atomic_weights=atomic_weights,
    transform=vec∘collect,
) |> robust_measure |> fussy_measure
This measure will have similar support for missing scalar elements and nothing weights, and will perform argument checks. It can consume some kinds of tables.
4. The show method for displaying both kinds of measure is made friendlier; see @fix_show.
Note that, by construction, measure = SomeMeasure() if and only if measure isa W{<:W{<:W{<:W{<:SomeMeasureOnScalars}}}}, where W = StatisticalMeasuresBase.Wrapper, and measure = MultitargetSomeMeasure(; atomic_weights=wts) for some wts if and only if measure isa W{<:W{<:W{<:W{<:W{<:SomeMeasureOnScalars}}}}}.
Enhancements
A single parameter can be added to the provided expression, corresponding to the third argument of f. Traits may also be declared, as they apply to SomeMeasure; these are appropriately lifted to MultitargetSomeMeasure and dropped to SomeMeasureOnScalars.
Enhanced syntax example:
f(yhat, y, tol) = abs(yhat - y)/max(abs(y), tol)
@combination(
    ProportionalAbsoluteDifference(; tol=eps()) = multimeasure(f),
    observation_scitype = Continuous, # becomes Union{Missing,Continuous}
    orientation=Loss(),
)
For further elucidation, see the documentation Tutorial.
This macro is experimental and its behavior is subject to change in patch and minor releases.