Learning Curves

A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve method does not actually generate a plot, but generates the data needed to do so.

To generate learning curves you can bind data to a model by instantiating a machine. You can choose to supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).

X, y = @load_boston;

atom = @load RidgeRegressor pkg=MultivariateStats
ensemble = EnsembleModel(atom=atom, n=1000)
mach = machine(ensemble, X, y)

r_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)
curve = MLJ.learning_curve(mach;
                           range=r_lambda,
                           resampling=CV(nfolds=3),
                           measure=mav)
using Plots
plot(curve.parameter_values,
     curve.measurements,
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab = "CV estimate of RMS error")

In the case the range hyperparameter is the number of iterations in some iterative model, learning_curve will not restart the training from scratch for each new value, unless a non-holdout resampling strategy is specified (and provided the model implements an appropriate update method). To obtain multiple curves (that are distinct) you will need to pass the name of the model random number generator, rng_name, and specify the random number generators to be used using rngs=... (an integer automatically generates the number specified):

atom.lambda=200
r_n = range(ensemble, :n, lower=1, upper=50)
curves = MLJ.learning_curve(mach;
                             range=r_n,
                             verbosity=0,
                             rng_name=:rng,
                             rngs=4)
plot(curves.parameter_values,
     curves.measurements,
     xlab=curves.parameter_name,
     ylab="Holdout estimate of RMS error")

API reference

MLJTuning.learning_curveFunction
curve = learning_curve(mach; resolution=30,
                             resampling=Holdout(),
                             repeats=1,
                             measure=rms,
                             weights=nothing,
                             operation=predict,
                             range=nothing,
                             acceleration=default_resource(),
                             acceleration_grid=CPU1(),
                             rngs=nothing,
                             rng_name=nothing)

Given a supervised machine mach, returns a named tuple of objects suitable for generating a plot of performance estimates, as a function of the single hyperparameter specified in range. The tuple curve has the following keys: :parameter_name, :parameter_scale, :parameter_values, :measurements.

To generate multiple curves for a model with a random number generator (RNG) as a hyperparameter, specify the name, rng_name, of the (possibly nested) RNG field, and a vector rngs of RNG's, one for each curve. Alternatively, set rngs to the number of curves desired, in which case RNG's are automatically generated. The individual curve computations can be distributed across multiple processes using acceleration=CPUProcesses(). See the second example below for a demonstration.

X, y = @load_boston;
atom = @load RidgeRegressor pkg=MultivariateStats
ensemble = EnsembleModel(atom=atom, n=1000)
mach = machine(ensemble, X, y)
r_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)
curve = learning_curve(mach; range=r_lambda, resampling=CV(), measure=mav)
using Plots
plot(curve.parameter_values,
     curve.measurements,
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab = "CV estimate of RMS error")

If using a Holdout() resampling strategy (with no shuffling) and if the specified hyperparameter is the number of iterations in some iterative model (and that model has an appropriately overloaded MLJModelInterface.update method) then training is not restarted from scratch for each increment of the parameter, ie the model is trained progressively.

atom.lambda=200
r_n = range(ensemble, :n, lower=1, upper=250)
curves = learning_curve(mach; range=r_n, verbosity=0, rng_name=:rng, rngs=3)
plot!(curves.parameter_values,
     curves.measurements,
     xlab=curves.parameter_name,
     ylab="Holdout estimate of RMS error")

learning_curve(model::Supervised, X, y; kwargs...)
learning_curve(model::Supervised, X, y, w; kwargs...)

Plot a learning curve (or curves) directly, without first constructing a machine.