# Weights
In machine learning it is possible to assign each observation an independent significance, or weight, in training, in performance evaluation, or both.

There are two kinds of weights in use in MLJ, both illustrated in the sketch below:

- *per observation weights* (also just called *weights*) refer to weight vectors of the same length as the number of observations

- *class weights* refer to dictionaries keyed on the target classes (levels), for use in classification problems
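Here is a minimal sketch of the two kinds of objects involved (the values and variable names are purely illustrative):

```julia
# Per observation weights: one number per observation,
# e.g., for a dataset with three observations:
w = [0.5, 1.0, 2.0]

# Class weights: one number per target class (level):
class_w = Dict("male" => 1.0, "female" => 2.5)
```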
## Specifying weights in training
To specify weights in training, you bind the weights to the model, along with the data, when constructing a machine. For supervised models, the weights are specified last:

```julia
using MLJ

KNNRegressor = @load KNNRegressor
model = KNNRegressor()
X, y = make_regression(10, 3)
w = rand(length(y))
mach = machine(model, X, y, w) |> fit!
```
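Once trained, the machine is used in the usual way. For example, continuing the sketch above (with `Xnew` standing in for some new data of the same form):

```julia
Xnew, _ = make_regression(3, 3)  # new input data of the same form as X
ŷ = predict(mach, Xnew)          # predictions from the weighted fit
```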
Note that a model `model` supports per observation weights if `supports_weights(model)` is `true`. To list all such models, do

```julia
models() do m
    m.supports_weights
end
```
The model `model` supports class weights if `supports_class_weights(model)` is `true`.
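Such models can presumably be listed in the same way, this sketch assuming the `supports_class_weights` trait appears in the model metadata just as `supports_weights` does:

```julia
models() do m
    m.supports_class_weights
end
```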
## Specifying weights in performance evaluation
When calling a measure (metric) that supports weights, provide the weights as the last argument, as in

```julia
using MLJ
using Random: shuffle

_, y = @load_iris
ŷ = shuffle(y)
w = Dict("versicolor" => 1, "setosa" => 2, "virginica" => 3)
macro_f1score(ŷ, y, w)
```
Note that `w` above is a class weight dictionary; only some measures support this form of specification. For details, see the StatisticalMeasures.jl tutorial.
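For contrast, here is a sketch of passing per observation weights to a measure, assuming the regression measure `rms` supports them:

```julia
using MLJ

y = rand(10)    # ground truth targets
ŷ = rand(10)    # corresponding predictions
w = rand(10)    # one weight per observation
rms(ŷ, y, w)    # weighted root mean squared error
```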
To pass weights to all the measures listed in an `evaluate!`/`evaluate` call, use the keyword specifiers `weights=...` or `class_weights=...`. For details, see Evaluating Model Performance.
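Here is an illustrative sketch of such a call; the model, resampling strategy, and measure are arbitrary choices:

```julia
using MLJ

KNNRegressor = @load KNNRegressor
X, y = make_regression(100, 3)
w = rand(length(y))

# `w` is passed to every listed measure that supports weights:
evaluate(KNNRegressor(), X, y,
         resampling=CV(nfolds=5),
         measure=l2,
         weights=w)
```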