Learning Curves
A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. Learning curves are useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve! method does not actually generate a plot, but returns the data needed to do so.
To generate learning curves you must bind data to a model by instantiating a machine. You may supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).
X, y = @load_boston;
atom = @load RidgeRegressor pkg=MultivariateStats
ensemble = EnsembleModel(atom=atom, n=1000)
mach = machine(ensemble, X, y)
r_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)
curve = MLJ.learning_curve!(mach; range=r_lambda, resampling=CV(), measure=mav)
using Plots
plot(curve.parameter_values,
     curve.measurements,
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab="CV estimate of mean absolute error")
In the case of the number of iterations in some iterative model, learning_curve! will not restart training from scratch for each new value, unless a non-holdout resampling strategy is specified (and provided the model implements an appropriate update method).
atom.lambda = 200
r_n = range(ensemble, :n, lower=1, upper=250)
curves = MLJ.learning_curve!(mach; range=r_n, verbosity=0, n=5)
plot(curves.parameter_values,
     curves.measurements,
     xlab=curves.parameter_name,
     ylab="Holdout estimate of RMS error")
API reference
MLJ.learning_curve! — Function.
curve = learning_curve!(mach; resolution=30,
                        resampling=Holdout(),
                        measure=rms,
                        operation=predict,
                        range=nothing,
                        n=1)
Given a supervised machine mach, returns a named tuple of objects suitable for generating a plot of performance measurements, as a function of the single hyperparameter specified in range. The tuple curve has the following keys: :parameter_name, :parameter_scale, :parameter_values, :measurements.
For n > 1, multiple curves are computed, and the value of curve.measurements is an array with one column per run. This is useful for models with non-deterministic fit-results, such as a random forest.
X, y = @load_boston;
atom = @load RidgeRegressor pkg=MultivariateStats
ensemble = EnsembleModel(atom=atom, n=1000)
mach = machine(ensemble, X, y)
r_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)
curve = MLJ.learning_curve!(mach; range=r_lambda, resampling=CV(), measure=mav)
using Plots
plot(curve.parameter_values,
     curve.measurements,
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab="CV estimate of mean absolute error")
If using a Holdout resampling strategy, and the specified hyperparameter is the number of iterations in some iterative model (and that model has an appropriately overloaded MLJBase.update method), then training is not restarted from scratch for each increment of the parameter, i.e., the model is trained progressively.
atom.lambda = 200
r_n = range(ensemble, :n, lower=1, upper=250)
curves = MLJ.learning_curve!(mach; range=r_n, verbosity=0, n=5)
plot(curves.parameter_values,
     curves.measurements,
     xlab=curves.parameter_name,
     ylab="Holdout estimate of RMS error")