Tuning models

In MLJ, tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine, fitting the machine instigates a search, within the specified range, for optimal model hyperparameters, and then uses all supplied data to train the best model. Making predictions with this fitted machine then amounts to predicting with a machine based on the unwrapped model, but with the hyperparameters set to their optimized values. In this way the wrapped model may be viewed as a "self-tuning" version of the unwrapped model.
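
In outline, the pattern looks like this (the model, range, data and new-observation names below are placeholders; a concrete walk-through follows in the next section):

r = range(model, :some_hyperparameter, lower=0.1, upper=1.0)   # hypothetical range
tuned_model = TunedModel(model=model, ranges=r,
                         resampling=CV(), measure=rms)         # wrap in a tuning strategy
mach = machine(tuned_model, X, y)
fit!(mach)              # search the range, then retrain the best model on all of X, y
predict(mach, Xnew)     # predict using the optimal model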

Tuning a single hyperparameter

julia> using MLJ

julia> X = (x1=rand(100), x2=rand(100), x3=rand(100));

julia> y = 2X.x1 - X.x2 + 0.05*rand(100);

julia> tree_model = @load DecisionTreeRegressor;

Let's tune min_purity_increase in the model above, using a grid search. Defining a hyperparameter range and wrapping the model:

julia> r = range(tree_model, :min_purity_increase, lower=0.001, upper=1.0, scale=:log);

julia> self_tuning_tree_model = TunedModel(model=tree_model,
                                           resampling = CV(nfolds=3),
                                           tuning = Grid(resolution=10),
                                           ranges = r,
                                           measure = rms);

Incidentally, for a numeric hyperparameter, the object returned by range can be iterated after specifying a resolution:

julia> iterator(r, 5)
5-element Array{Float64,1}:
 0.0010000000000000002
 0.005623413251903492
 0.0316227766016838
 0.1778279410038923
 1.0

Non-numeric hyperparameters are handled a little differently:

julia> selector_model = FeatureSelector();

julia> r2 = range(selector_model, :features, values = [[:x1,], [:x1, :x2]]);

julia> iterator(r2)
2-element Array{Array{Symbol,1},1}:
 [:x1]
 [:x1, :x2]

Returning to the wrapped tree model:

julia> self_tuning_tree = machine(self_tuning_tree_model, X, y);

julia> fit!(self_tuning_tree, verbosity=0);

We can inspect the detailed results of the grid search with report(self_tuning_tree), or just retrieve the optimal model, as here:

julia> fitted_params(self_tuning_tree).best_model
MLJModels.DecisionTree_.DecisionTreeRegressor(pruning_purity_threshold = 0.0,
                                              max_depth = -1,
                                              min_samples_leaf = 5,
                                              min_samples_split = 2,
                                              min_purity_increase = 0.010000000000000005,
                                              n_subfeatures = 0,
                                              post_prune = false,) @ 7…87
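
The tuning report can be accessed in a similar way. A minimal sketch, using field names as they appear in the forest-model report further below:

rep = report(self_tuning_tree)
rep.best_measurement       # lowest cross-validated rms found on the grid
rep.parameter_values       # the min_purity_increase values that were evaluated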

Predicting on new input observations using the optimal model:

julia> predict(self_tuning_tree, (x1=rand(3), x2=rand(3), x3=rand(3)))
3-element Array{Float64,1}:
  0.3119152151984453
 -0.5194249737412547
 -0.15481625577159025

Tuning multiple nested hyperparameters

The following model has another model, namely a DecisionTreeRegressor, as a hyperparameter:

julia> tree_model = DecisionTreeRegressor();

julia> forest_model = EnsembleModel(atom=tree_model);

Nested hyperparameters can be inspected using params (or just type @more in the REPL after instantiating forest_model):

julia> params(forest_model)
(atom = (pruning_purity_threshold = 0.0,
         max_depth = -1,
         min_samples_leaf = 5,
         min_samples_split = 2,
         min_purity_increase = 0.0,
         n_subfeatures = 0,
         post_prune = false,),
 weights = Float64[],
 bagging_fraction = 0.8,
 rng = MersenneTwister(UInt32[0xa0dfd3ec, 0xb06a7f43, 0x4d1b35b1, 0x4bce6f09]),
 n = 100,
 parallel = true,
 out_of_bag_measure = Any[],)

Ranges for nested hyperparameters are specified using dot syntax:

julia> r1 = range(forest_model, :(atom.n_subfeatures), lower=1, upper=3);

julia> r2 = range(forest_model, :bagging_fraction, lower=0.4, upper=1.0);

julia> self_tuning_forest_model = TunedModel(model=forest_model,
                                             tuning=Grid(resolution=12),
                                             resampling=CV(nfolds=6),
                                             ranges=[r1, r2],
                                             measure=rms);

julia> self_tuning_forest = machine(self_tuning_forest_model, X, y);

julia> fit!(self_tuning_forest, verbosity=0)
Machine{DeterministicTunedModel} @ 2…14

julia> report(self_tuning_forest)
(parameter_names = ["atom.n_subfeatures" "bagging_fraction"],
 parameter_scales = Symbol[:linear :linear],
 parameter_values = Any[1 0.4; 2 0.4; … ; 2 1.0; 3 1.0],
 measurements = [0.357474, 0.23404, 0.212038, 0.336099, 0.216023, 0.1974, 0.333818, 0.20375, 0.181967, 0.325204  …  0.166108, 0.288931, 0.174947, 0.171064, 0.281982, 0.169761, 0.178293, 0.286694, 0.177608, 0.197921],
 best_measurement = 0.16467565533494774,)

In this two-parameter case, a plot of the grid search results is also available:

using Plots
plot(self_tuning_forest)
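
As noted in the TunedModel docstring below, a heatmap of the same performance estimates can also be requested:

using Plots
heatmap(self_tuning_forest)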

API

Base.range (Function)
r = range(model, :hyper; values=nothing)

Defines a NominalRange object for a field hyper of model, assuming the field is not a subtype of Real. Note that r is not directly iterable, but iterator(r) iterates over values.

A nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the :max_depth hyperparameter of the hyperparameter :atom of model.

r = range(model, :hyper; upper=nothing, lower=nothing, scale=:linear)

Defines a NumericRange object for a Real field hyper of model. Note that r is not directly iterable, but iterator(r, n) iterates over n values between lower and upper, according to the specified scale. The supported scales are :linear, :log, :log10 and :log2. Values for Integer types are rounded (with duplicate values removed, resulting in possibly fewer than n values).

Alternatively, if a function f is provided as scale, then iterator(r, n) iterates over the values [f(x1), f(x2), ... , f(xn)], where x1, x2, ..., xn are linearly spaced between lower and upper.
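
For example, a sketch using a function as the scale (the hyperparameter and bounds are chosen purely for illustration):

r = range(tree_model, :min_samples_split, lower=1, upper=5, scale=x -> 2^round(Int, x));
iterator(r, 5)    # expected: [2, 4, 8, 16, 32], i.e. powers of two at linearly spaced exponents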

MLJ.TunedModel (Function)
tuned_model = TunedModel(; model=nothing,
                         tuning=Grid(),
                         resampling=Holdout(),
                         measure=nothing,
                         weights=nothing,
                         operation=predict,
                         ranges=ParamRange[],
                         full_report=true)

Construct a model wrapper for hyperparameter optimization of a supervised learner.

Calling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, task) will:

  • Instigate a search, over clones of model, with the hyperparameter mutations specified by ranges, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy.

  • Fit an internal machine, based on the optimal model fitted_params(mach).best_model, bound to all the provided data X, y (or to task). Calling predict(mach, Xnew) then returns predictions on Xnew made by this internal machine.

Important. If a custom measure, measure, is used, and the measure is a score rather than a loss, be sure to check that MLJ.orientation(measure) == :score, to ensure maximization of the measure rather than minimization. Override an incorrect value with MLJ.orientation(::typeof(measure)) = :score.
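
For instance, a sketch with a hypothetical custom score my_score (not part of MLJ):

my_score(yhat, y) = -rms(yhat, y)                # custom measure for which larger is better
MLJ.orientation(my_score)                        # check: should return :score
MLJ.orientation(::typeof(my_score)) = :score     # override if it does not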

If measure supports sample weights (MLJ.supports_weights(measure) == true) then these can be passed to the measure as weights.

In the case of two-parameter tuning, a Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).
