Learning networks 2

To ensure code in this tutorial runs as shown, download the tutorial project folder and follow these instructions.

If you have questions or suggestions about this tutorial, please open an issue here.

Preliminary steps

Let's start as in the previous tutorial:

using MLJ
using StableRNGs
import DataFrames: DataFrame
Ridge = @load RidgeRegressor pkg=MultivariateStats

rng = StableRNG(6616) # for reproducibility
x1 = rand(rng, 300)
x2 = rand(rng, 300)
x3 = rand(rng, 300)
y = exp.(x1 - x2 - 2x3 + 0.1 * rand(rng, 300))
X = DataFrame(x1=x1, x2=x2, x3=x3)

train, test = partition(eachindex(y), 0.8);

In this tutorial we show how to generate a model from a learning network; there are two approaches:

  • using the @from_network macro

  • writing the model in full

The first approach should usually be preferred, as it is simpler.

Generating a model from a network allows that network to be subsequently composed with other models and tuned, just like any basic model.

Using the @from_network macro

Let's define a simple network:

Input layer

Xs = source(X)
ys = source(y)
Source @942 ⏎ `AbstractVector{ScientificTypesBase.Continuous}`
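
As an aside, sources (like all nodes) are callable, and calling one simply returns the data it wraps; this gives a quick sanity check on what flows through the network. A minimal sketch:

Xs() == X     # true: calling a source returns the wrapped data
ys(rows=1:3)  # the first three entries of y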

First layer

std_model = Standardizer()
stand = machine(std_model, Xs)
W = transform(stand, Xs)

box_model = UnivariateBoxCoxTransformer()
box_mach = machine(box_model, ys)
z = transform(box_mach, ys)
Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Source @942
  formula:
    transform(
        Machine{UnivariateBoxCoxTransformer,…}, 
        Source @942)

Second layer

ridge_model = Ridge(lambda=0.1)
ridge = machine(ridge_model, W, z)
ẑ = predict(ridge, W)
Node{Machine{RidgeRegressor,…}}
  args:
    1:	Node{Machine{Standardizer,…}}
  formula:
    predict(
        Machine{RidgeRegressor,…}, 
        transform(
            Machine{Standardizer,…}, 
            Source @752))

Output

ŷ = inverse_transform(box_mach, ẑ)
Node{Machine{UnivariateBoxCoxTransformer,…}}
  args:
    1:	Node{Machine{RidgeRegressor,…}}
  formula:
    inverse_transform(
        Machine{UnivariateBoxCoxTransformer,…}, 
        predict(
            Machine{RidgeRegressor,…}, 
            transform(
                Machine{Standardizer,…}, 
                Source @752)))

No fitting has been done thus far; we have just defined a sequence of operations.
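
Because nodes are callable, training and prediction can also be triggered directly at the node level, without wrapping anything in a machine first. A quick sketch (optional, and not needed for what follows):

fit!(ŷ)            # trains every machine that the node ŷ depends on
ŷ(rows=test[1:3])  # predictions evaluated at the node level
W()                # the standardized features can be inspected too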

As we show next, a learning network needs to be exported to create a new stand-alone model type. Instances of that type can be bound with data in a machine, which can then be evaluated, for example.

Somewhat paradoxically, one can wrap a learning network in a certain kind of machine, called a learning network machine, before exporting it, and in fact, the export process actually requires us to do so. Since a composite model type does not yet exist, one constructs the machine using a "surrogate" model, whose name indicates the ultimate model supertype (Deterministic, Probabilistic, Unsupervised or Static). This surrogate model has no fields.

surrogate = Deterministic()
mach = machine(surrogate, Xs, ys; predict=ŷ)

fit!(mach)
predict(mach, X[test[1:5], :])
5-element Vector{Float64}:
 0.22207406272038532
 0.1145988140862213
 0.5637023411242004
 0.6208523052884072
 0.36914116568152006
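
Fitting mach trained every machine in the underlying network, so those machines can now be inspected individually. For example (a sketch; the exact fields returned depend on the interface packages):

fitted_params(ridge)  # e.g. the fitted ridge coefficients
fitted_params(stand)  # the per-feature means and standard deviations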

Forming a model out of that network is easy using the @from_network macro.

Having defined a learning network machine, mach, as above, the following code defines a new model subtype CompositeModel <: DeterministicComposite with a single field regressor:

@from_network mach begin
    mutable struct CompositeModel
        regressor=ridge_model
    end
end

The macro defines a keyword constructor CompositeModel and attributes a name to each component model; the ordering / connection between the nodes is inferred from mach, whose prediction node was declared via predict=ŷ.
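
The generated type comes with a keyword constructor, and the hyperparameters of the wrapped model remain accessible via the usual nested field syntax. A small usage sketch:

composite = CompositeModel()             # regressor defaults to ridge_model
composite.regressor.lambda = 1.0         # nested hyperparameter access
composite = CompositeModel(regressor=Ridge(lambda=0.5))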

Note: had the model been probabilistic (e.g. RidgeClassifier), you would have constructed the learning network machine with a Probabilistic() surrogate rather than a Deterministic() one.
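
Schematically, only the surrogate step would change (a hypothetical sketch, since the network above is deterministic):

# surrogate = Probabilistic()                    # instead of Deterministic()
# mach = machine(surrogate, Xs, ys; predict=ŷ)
# ... then @from_network as before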

cm = machine(CompositeModel(), X, y)
res = evaluate!(cm, resampling=Holdout(fraction_train=0.8, rng=51),
                measure=rms)
round(res.measurement[1], sigdigits=3)
0.0136

Defining a model from scratch

An alternative to the @from_network macro is to define a new model in full, together with its fit method:

mutable struct CompositeModel2 <: DeterministicComposite
    std_model::Standardizer
    box_model::UnivariateBoxCoxTransformer
    ridge_model::Ridge
end

function MLJ.fit(m::CompositeModel2, verbosity::Int, X, y)
    Xs = source(X)
    ys = source(y)
    W = MLJ.transform(machine(m.std_model, Xs), Xs)  # standardized features
    box = machine(m.box_model, ys)
    z = MLJ.transform(box, ys)                       # Box-Cox-transformed target
    ẑ = predict(machine(m.ridge_model, W, z), W)     # predictions on the transformed scale
    ŷ = inverse_transform(box, ẑ)                    # back to the original target scale
    mach = machine(Deterministic(), Xs, ys; predict=ŷ)
    return!(mach, m, verbosity - 1)                  # fit the network; returns (fitresult, cache, report)
end

mdl = CompositeModel2(Standardizer(), UnivariateBoxCoxTransformer(),
                      Ridge(lambda=0.1))
cm = machine(mdl, X, y)
res = evaluate!(cm, resampling=Holdout(fraction_train=0.8), measure=rms)
round(res.measurement[1], sigdigits=3)
0.0212

Either way, you now have a model constructor that can be used as a stand-alone object, and that can be tuned and composed just as any basic model can.
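
For instance, the composite's nested hyperparameters can be tuned like those of any other model. A closing sketch, with illustrative ranges (assuming the definitions above are in scope):

r = range(mdl, :(ridge_model.lambda), lower=0.01, upper=10.0, scale=:log)
tuned_mdl = TunedModel(model=mdl, range=r, tuning=Grid(resolution=10),
                       resampling=CV(nfolds=3), measure=rms)
tuned_mach = machine(tuned_mdl, X, y)
fit!(tuned_mach)
fitted_params(tuned_mach).best_model.ridge_model.lambda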