Tuning models
In MLJ, tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine, fitting the machine instigates a search for optimal model hyperparameters within the specified ranges, and then trains the best model on all supplied data. Predicting with this fitted machine is then equivalent to predicting with a machine based on the unwrapped model, but with the hyperparameters set to their optimized values. In this way the wrapped model can be viewed as a "self-tuning" version of the unwrapped one.
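Schematically, the pattern looks like this (model, r, X, y and Xnew are placeholders here; worked examples follow below):
self_tuning_model = TunedModel(model=model, tuning=Grid(),
                               resampling=CV(), ranges=r, measure=rms)  # wrap the model
mach = machine(self_tuning_model, X, y)  # bind the wrapped model to data
fit!(mach)             # search the ranges, then retrain the best model on all supplied data
predict(mach, Xnew)    # predict using the optimized hyperparameters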
Tuning a single hyperparameter
julia> using MLJ
julia> X = (x1=rand(100), x2=rand(100), x3=rand(100));
julia> y = 2X.x1 - X.x2 + 0.05*rand(100);
julia> tree_model = @load DecisionTreeRegressor;
Let's tune min_purity_increase in the model above, using a grid search. Defining hyperparameter ranges and wrapping the model:
julia> r = range(tree_model, :min_purity_increase, lower=0.001, upper=1.0, scale=:log);
julia> self_tuning_tree_model = TunedModel(model=tree_model,
                                           resampling = CV(nfolds=3),
                                           tuning = Grid(resolution=10),
                                           ranges = r,
                                           measure = rms);
Incidentally, for a numeric hyperparameter, the object returned by range can be iterated after specifying a resolution:
julia> iterator(r, 5)
5-element Array{Float64,1}:
0.0010000000000000002
0.005623413251903492
0.0316227766016838
0.1778279410038923
1.0
Non-numeric hyperparameters are handled a little differently:
julia> selector_model = FeatureSelector();
julia> r2 = range(selector_model, :features, values = [[:x1,], [:x1, :x2]]);
julia> iterator(r2)
2-element Array{Array{Symbol,1},1}:
[:x1]
[:x1, :x2]
Returning to the wrapped tree model:
julia> self_tuning_tree = machine(self_tuning_tree_model, X, y);
julia> fit!(self_tuning_tree, verbosity=0);
We can inspect the detailed results of the grid search with report(self_tuning_tree), or just retrieve the optimal model, as here:
julia> fitted_params(self_tuning_tree).best_model
MLJModels.DecisionTree_.DecisionTreeRegressor(pruning_purity_threshold = 0.0,
max_depth = -1,
min_samples_leaf = 5,
min_samples_split = 2,
min_purity_increase = 0.010000000000000005,
n_subfeatures = 0,
post_prune = false,) @ 7…87
Predicting on new input observations using the optimal model:
julia> predict(self_tuning_tree, (x1=rand(3), x2=rand(3), x3=rand(3)))
3-element Array{Float64,1}:
0.3119152151984453
-0.5194249737412547
-0.15481625577159025
Tuning multiple nested hyperparameters
The following model has another model, namely a DecisionTreeRegressor, as a hyperparameter:
julia> tree_model = DecisionTreeRegressor();
julia> forest_model = EnsembleModel(atom=tree_model);
Nested hyperparameters can be inspected using params (or just type @more in the REPL after instantiating forest_model):
julia> params(forest_model)
(atom = (pruning_purity_threshold = 0.0,
max_depth = -1,
min_samples_leaf = 5,
min_samples_split = 2,
min_purity_increase = 0.0,
n_subfeatures = 0,
post_prune = false,),
weights = Float64[],
bagging_fraction = 0.8,
rng = MersenneTwister(UInt32[0xa0dfd3ec, 0xb06a7f43, 0x4d1b35b1, 0x4bce6f09]),
n = 100,
parallel = true,
out_of_bag_measure = Any[],)
Ranges for nested hyperparameters are specified using dot syntax:
julia> r1 = range(forest_model, :(atom.n_subfeatures), lower=1, upper=3);
julia> r2 = range(forest_model, :bagging_fraction, lower=0.4, upper=1.0);
julia> self_tuning_forest_model = TunedModel(model=forest_model,
                                             tuning=Grid(resolution=12),
                                             resampling=CV(nfolds=6),
                                             ranges=[r1, r2],
                                             measure=rms);
julia> self_tuning_forest = machine(self_tuning_forest_model, X, y);
julia> fit!(self_tuning_forest, verbosity=0)
Machine{DeterministicTunedModel} @ 2…14
julia> report(self_tuning_forest)
(parameter_names = ["atom.n_subfeatures" "bagging_fraction"],
parameter_scales = Symbol[:linear :linear],
parameter_values = Any[1 0.4; 2 0.4; … ; 2 1.0; 3 1.0],
measurements = [0.357474, 0.23404, 0.212038, 0.336099, 0.216023, 0.1974, 0.333818, 0.20375, 0.181967, 0.325204 … 0.166108, 0.288931, 0.174947, 0.171064, 0.281982, 0.169761, 0.178293, 0.286694, 0.177608, 0.197921],
best_measurement = 0.16467565533494774,)
In this two-parameter case, a plot of the grid search results is also available:
using Plots
plot(self_tuning_forest)
API
Base.range — Function
r = range(model, :hyper; values=nothing)
Defines a NominalRange object for a field hyper of model, assuming the field is not a subtype of Real. Note that r is not directly iterable, but iterator(r) iterates over values.
A nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the :max_depth hyperparameter of the hyperparameter :atom of model.
r = range(model, :hyper; upper=nothing, lower=nothing, scale=:linear)
Defines a NumericRange object for a Real field hyper of model. Note that r is not directly iterable, but iterator(r, n) iterates over n values between lower and upper, according to the specified scale. The supported scales are :linear, :log, :log10, :log2. Values for Integer types are rounded (with duplicate values removed, resulting in possibly fewer than n values).
Alternatively, if a function f is provided as scale, then iterator(r, n) iterates over the values [f(x1), f(x2), ..., f(xn)], where x1, x2, ..., xn are linearly spaced between lower and upper.
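For illustration, a sketch reusing the models defined earlier in this section (the exact iterated values may vary by MLJ version):
# integer-valued numeric range: iterated values are rounded, duplicates removed
r_int = range(forest_model, :(atom.n_subfeatures), lower=1, upper=3)
iterator(r_int, 5)   # e.g. [1, 2, 3] (fewer than 5 values after rounding)

# a function as the scale: iterator applies it to n linearly spaced points
r_fun = range(tree_model, :min_purity_increase, lower=-3.0, upper=0.0, scale=x -> 10.0^x)
iterator(r_fun, 4)   # approximately [0.001, 0.01, 0.1, 1.0]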
MLJ.TunedModel — Function
tuned_model = TunedModel(; model=nothing,
                           tuning=Grid(),
                           resampling=Holdout(),
                           measure=nothing,
                           weights=nothing,
                           operation=predict,
                           ranges=ParamRange[],
                           full_report=true)
Construct a model wrapper for hyperparameter optimization of a supervised learner.
Calling fit!(mach) on a machine mach = machine(tuned_model, X, y) or mach = machine(tuned_model, task) will:
1. Instigate a search, over clones of model, with the hyperparameter mutations specified by ranges, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy.
2. Fit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y (or in task). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine.
Important. If a custom measure measure is used, and the measure is a score rather than a loss, be sure to check that MLJ.orientation(measure) == :score to ensure maximization of the measure, rather than minimization. Override an incorrect value with MLJ.orientation(::typeof(measure)) = :score.
If measure supports sample weights (MLJ.supports_weights(measure) == true) then these can be passed to the measure as weights.
In the case of two-parameter tuning, a Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).
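To illustrate the note on custom measures above, a minimal sketch, assuming a custom measure can be supplied as an ordinary function of the form measure(yhat, y):
using Statistics: mean

# a hypothetical custom measure for which larger is better:
my_r2(yhat, y) = 1 - sum((y .- yhat).^2) / sum((y .- mean(y)).^2)

# declare its orientation so that tuning maximizes it rather than minimizes it:
MLJ.orientation(::typeof(my_r2)) = :score

tuned_model = TunedModel(model=forest_model, ranges=[r1, r2], measure=my_r2)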