Boston with Flux
To ensure code in this tutorial runs as shown, download the tutorial project folder and follow these instructions. If you have questions or suggestions about this tutorial, please open an issue here.
Main author: Ayush Shridhar (ayush-1506).
import MLJFlux
import MLJ
import DataFrames: DataFrame
import Statistics
import Flux
using Random
Random.seed!(11)
Random.TaskLocalRNG()
We start by loading the Boston dataset. Our aim will be to implement a neural network regressor to predict the price of a house from a number of features.
features, targets = MLJ.@load_boston
features = DataFrame(features)
@show size(features)
@show targets[1:3]
first(features, 3) |> MLJ.pretty
size(features) = (506, 12)
targets[1:3] = [24.0, 21.6, 34.7]
┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐
│ Crim │ Zn │ Indus │ NOx │ Rm │ Age │ Dis │ Rad │ Tax │ PTRatio │ Black │ LStat │
│ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
│ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │ Continuous │
├────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┤
│ 0.00632 │ 18.0 │ 2.31 │ 0.538 │ 6.575 │ 65.2 │ 4.09 │ 1.0 │ 296.0 │ 15.3 │ 396.9 │ 4.98 │
│ 0.02731 │ 0.0 │ 7.07 │ 0.469 │ 6.421 │ 78.9 │ 4.9671 │ 2.0 │ 242.0 │ 17.8 │ 396.9 │ 9.14 │
│ 0.02729 │ 0.0 │ 7.07 │ 0.469 │ 7.185 │ 61.1 │ 4.9671 │ 2.0 │ 242.0 │ 17.8 │ 392.83 │ 4.03 │
└────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘
The next obvious step is to partition the data into train and test sets:
train, test = MLJ.partition(collect(eachindex(targets)), 0.70, rng=52)
([358, 422, 334, 476, 1, 441, 12, 115, 240, 104, 208, 158, 46, 504, 462, 101, 157, 92, 287, 360, 385, 330, 475, 465, 117, 300, 246, 230, 105, 38, 436, 481, 424, 44, 73, 296, 61, 244, 371, 14, 195, 444, 489, 235, 143, 428, 172, 66, 318, 323, 232, 74, 338, 77, 57, 23, 357, 437, 401, 127, 397, 356, 404, 136, 260, 4, 327, 121, 432, 445, 43, 19, 304, 468, 141, 47, 280, 85, 342, 440, 51, 169, 67, 168, 231, 361, 126, 54, 396, 190, 270, 164, 409, 176, 383, 352, 184, 322, 156, 416, 398, 197, 329, 220, 377, 60, 71, 494, 266, 491, 479, 130, 369, 109, 53, 214, 179, 380, 39, 119, 233, 316, 469, 213, 114, 457, 211, 152, 408, 324, 155, 319, 171, 276, 50, 102, 482, 82, 139, 420, 15, 206, 151, 486, 410, 209, 203, 364, 473, 10, 34, 282, 120, 285, 227, 68, 317, 98, 7, 459, 100, 133, 478, 439, 186, 97, 177, 159, 18, 228, 466, 362, 320, 99, 267, 212, 484, 40, 153, 279, 337, 339, 281, 249, 359, 349, 302, 224, 25, 325, 488, 69, 76, 265, 429, 268, 91, 255, 333, 123, 111, 415, 321, 33, 226, 256, 106, 129, 183, 307, 165, 95, 471, 196, 435, 229, 70, 348, 273, 137, 373, 26, 90, 506, 28, 303, 161, 449, 311, 447, 204, 414, 116, 378, 326, 480, 63, 382, 312, 306, 501, 8, 41, 247, 288, 393, 163, 388, 328, 310, 6, 474, 89, 375, 167, 16, 505, 201, 79, 443, 346, 49, 202, 347, 110, 374, 35, 405, 425, 309, 258, 187, 341, 86, 216, 24, 343, 138, 94, 248, 314, 455, 308, 88, 294, 419, 78, 81, 293, 215, 406, 427, 407, 417, 376, 194, 490, 344, 118, 27, 472, 103, 182, 42, 198, 36, 386, 236, 87, 200, 289, 52, 413, 456, 336, 400, 144, 83, 389, 237, 502, 412, 181, 162, 134, 191, 430, 219, 9, 331, 292, 173, 438, 243, 446, 125, 188, 252, 262, 58, 205, 175, 477, 301, 250, 497, 345, 132, 291, 277, 257, 379, 218, 166], [225, 189, 245, 418, 295, 135, 463, 487, 37, 207, 332, 434, 210, 283, 391, 21, 297, 59, 17, 238, 193, 387, 241, 275, 448, 217, 62, 458, 298, 452, 146, 150, 22, 470, 45, 503, 11, 426, 363, 467, 128, 498, 32, 154, 461, 56, 423, 160, 402, 251, 3, 131, 199, 464, 495, 353, 254, 64, 234, 96, 263, 284, 442, 372, 399, 313, 365, 500, 80, 454, 122, 5, 367, 113, 20, 223, 315, 29, 384, 72, 272, 499, 421, 394, 286, 174, 261, 453, 450, 112, 366, 269, 274, 93, 13, 185, 492, 148, 354, 278, 2, 305, 259, 239, 124, 335, 392, 75, 142, 108, 170, 140, 149, 350, 180, 460, 192, 340, 290, 451, 264, 431, 395, 485, 351, 381, 271, 145, 178, 55, 496, 411, 493, 370, 390, 107, 403, 65, 31, 222, 221, 299, 355, 483, 30, 433, 84, 242, 368, 147, 48, 253])
Let us try to implement a neural network regressor using Flux.jl. MLJFlux.jl provides an MLJ interface to the Flux.jl deep learning framework. The package provides four essential models: NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, NeuralNetworkClassifier and ImageClassifier.
At the heart of these models is a neural network, specified using the builder parameter. Creating a builder object consists of two steps.

Step 1: Create a new struct that subtypes MLJFlux.Builder, an abstract type used for dispatch. Suppose we define a new struct called MyNetworkBuilder. It can hold any attribute needed to build the network later, in Step 2. Let's use a dense neural network with two hidden layers.
mutable struct MyNetworkBuilder <: MLJFlux.Builder
n1::Int #Number of cells in the first hidden layer
n2::Int #Number of cells in the second hidden layer
end
Step 2: Build the neural network from this object by extending the MLJFlux.build function. It takes four arguments: the MyNetworkBuilder instance, a random number generator or seed rng, the input dimension n_in and the output dimension n_out.
function MLJFlux.build(model::MyNetworkBuilder, rng, n_in, n_out)
init = Flux.glorot_uniform(rng)
layer1 = Flux.Dense(n_in, model.n1, init=init)
layer2 = Flux.Dense(model.n1, model.n2, init=init)
layer3 = Flux.Dense(model.n2, n_out, init=init)
return Flux.Chain(layer1, layer2, layer3)
end
Alternatively, there is a macro shortcut that takes care of both steps at once. For details, do ?MLJFlux.@builder.
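As a minimal sketch (assuming, as the macro's docstring describes, that n_in, n_out and rng are made available inside the supplied expression), the builder above could also be written as:

# Sketch only: equivalent builder defined via the @builder macro shortcut.
builder2 = MLJFlux.@builder begin
    init = Flux.glorot_uniform(rng)
    Flux.Chain(Flux.Dense(n_in, 20, init=init),
               Flux.Dense(20, 10, init=init),
               Flux.Dense(10, n_out, init=init))
end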
With all definitions ready, let us create an instance of our builder:
myregressor = MyNetworkBuilder(20, 10)
MyNetworkBuilder(n1 = 20, …)
Since the Boston dataset poses a regression problem, we'll be using NeuralNetworkRegressor here. One thing to remember is that a NeuralNetworkRegressor object works seamlessly like any other MLJ model: you can wrap it in an MLJ machine and do anything you'd do otherwise.

Let's start by defining our NeuralNetworkRegressor object, which takes myregressor as its builder.
nnregressor = MLJFlux.NeuralNetworkRegressor(builder=myregressor, epochs=10)
NeuralNetworkRegressor(
builder = MyNetworkBuilder(
n1 = 20,
n2 = 10),
optimiser = Flux.Optimise.Adam(0.001, (0.9, 0.999), 1.0e-8, IdDict{Any, Any}()),
loss = Flux.Losses.mse,
epochs = 10,
batch_size = 1,
lambda = 0.0,
alpha = 0.0,
rng = Random._GLOBAL_RNG(),
optimiser_changes_trigger_retraining = false,
acceleration = ComputationalResources.CPU1{Nothing}(nothing))
Other parameters that NeuralNetworkRegressor takes can be found here: https://github.com/alan-turing-institute/MLJFlux.jl#model-hyperparameters
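For instance, here is a hypothetical variant (illustration only; the keyword names are those shown in the printout above) that swaps in a larger learning rate and a mean-absolute-error loss:

# Illustrative sketch: same builder, different optimiser and loss.
nn_mae = MLJFlux.NeuralNetworkRegressor(builder=myregressor,
                                        optimiser=Flux.Adam(0.01),
                                        loss=Flux.Losses.mae,
                                        epochs=10,
                                        batch_size=2)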
nnregressor now acts like any other MLJ model. Let's try wrapping it in an MLJ machine and calling fit! and predict.
mach = MLJ.machine(nnregressor, features, targets)
untrained Machine; caches model-specific representations of data
model: NeuralNetworkRegressor(builder = MyNetworkBuilder(n1 = 20, …), …)
args:
1: Source @382 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
2: Source @299 ⏎ AbstractVector{ScientificTypesBase.Continuous}
Let's fit this on the train set:
MLJ.fit!(mach, rows=train, verbosity=3)
trained Machine; caches model-specific representations of data
model: NeuralNetworkRegressor(builder = MyNetworkBuilder(n1 = 20, …), …)
args:
1: Source @382 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
2: Source @299 ⏎ AbstractVector{ScientificTypesBase.Continuous}
As we can see, the training loss decreases at each epoch, showing that the neural network is gradually learning from the training set.
preds = MLJ.predict(mach, features[test, :])
print(preds[1:5])
Float32[31.564112, 29.851883, 24.773237, -9.3287525, 22.552029]
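To summarise the out-of-sample performance in a single number, we can compute, say, the root-mean-squared error on the test rows (a quick sketch using MLJ's rms measure):

# RMS error of the predictions against the held-out targets.
@show MLJ.rms(preds, targets[test])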
Now let's retrain our model. One thing to remember is that retraining may or may not re-initialize the neural network's parameters. For example, changing the number of epochs to 15 will not cause the model to train for 15 epochs from scratch, but only for 5 additional epochs.
nnregressor.epochs = 15
MLJ.fit!(mach, rows=train, verbosity=3)
trained Machine; caches model-specific representations of data
model: NeuralNetworkRegressor(builder = MyNetworkBuilder(n1 = 20, …), …)
args:
1: Source @382 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
2: Source @299 ⏎ AbstractVector{ScientificTypesBase.Continuous}
You can always specify that you want to retrain the model from scratch using the force=true parameter (see the documentation for fit! for more).
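For example:

# Retrain from scratch, re-initializing the network's weights:
MLJ.fit!(mach, rows=train, force=true, verbosity=2)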
However, changing parameters such as batch_size will necessarily cause re-training from scratch.
nnregressor.batch_size = 2
MLJ.fit!(mach, rows=train, verbosity=3)
trained Machine; caches model-specific representations of data
model: NeuralNetworkRegressor(builder = MyNetworkBuilder(n1 = 20, …), …)
args:
1: Source @382 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
2: Source @299 ⏎ AbstractVector{ScientificTypesBase.Continuous}
Another thing to remember is that changing the optimiser doesn't cause retraining from scratch by default; the optimiser_changes_trigger_retraining flag in NeuralNetworkRegressor can be toggled if that behaviour is wanted. The default allows one to modify the learning rate, for example, after an initial burn-in period, without discarding the weights learned so far.
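As a sketch of that workflow (mutating the hyper-parameter in place, as we did with epochs above), one might lower the learning rate and keep training without a cold restart:

# With optimiser_changes_trigger_retraining == false (the default), this is
# expected to continue from the current weights rather than re-initialize:
nnregressor.optimiser = Flux.Adam(0.0001)
MLJ.fit!(mach, rows=train, verbosity=2)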
# Inspecting out-of-sample loss as a function of epochs
r = MLJ.range(nnregressor, :epochs, lower=1, upper=30, scale=:log10)
curve = MLJ.learning_curve(nnregressor, features, targets,
range=r,
resampling=MLJ.Holdout(fraction_train=0.7),
measure=MLJ.l2)
using Plots
plot(curve.parameter_values, curve.measurements, yaxis=:log, legend=false)
xlabel!(curve.parameter_name)
ylabel!("l2-log")
As mentioned above, nnregressor can act like any other MLJ model. Let's try to tune the batch_size parameter.
bs = MLJ.range(nnregressor, :batch_size, lower=1, upper=5)
tm = MLJ.TunedModel(model=nnregressor, ranges=[bs, ], measure=MLJ.l2)
DeterministicTunedModel(
model = NeuralNetworkRegressor(
builder = MyNetworkBuilder(n1 = 20, …),
optimiser = Flux.Optimise.Adam(0.001, (0.9, 0.999), 1.0e-8, IdDict{Any, Any}()),
loss = Flux.Losses.mse,
epochs = 15,
batch_size = 2,
lambda = 0.0,
alpha = 0.0,
rng = Random._GLOBAL_RNG(),
optimiser_changes_trigger_retraining = false,
acceleration = ComputationalResources.CPU1{Nothing}(nothing)),
tuning = RandomSearch(
bounded = Distributions.Uniform,
positive_unbounded = Distributions.Gamma,
other = Distributions.Normal,
rng = Random._GLOBAL_RNG()),
resampling = Holdout(
fraction_train = 0.7,
shuffle = false,
rng = Random._GLOBAL_RNG()),
measure = LPLoss(p = 2),
weights = nothing,
class_weights = nothing,
operation = nothing,
range = MLJBase.NumericRange{Int64, MLJBase.Bounded, Symbol}[NumericRange(1 ≤ batch_size ≤ 5; origin=3.0, unit=2.0)],
selection_heuristic = MLJTuning.NaiveSelection(nothing),
train_best = true,
repeats = 1,
n = nothing,
acceleration = ComputationalResources.CPU1{Nothing}(nothing),
acceleration_resampling = ComputationalResources.CPU1{Nothing}(nothing),
check_measure = true,
cache = true,
compact_history = true,
logger = nothing)
For more on tuning, refer to the model-tuning tutorial.
m = MLJ.machine(tm, features, targets)
MLJ.fit!(m)
trained Machine; does not cache data
model: DeterministicTunedModel(model = NeuralNetworkRegressor(builder = MyNetworkBuilder(n1 = 20, …), …), …)
args:
1: Source @983 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
2: Source @456 ⏎ AbstractVector{ScientificTypesBase.Continuous}
This evaluated the model at each value of our range. The best value is:
MLJ.fitted_params(m).best_model.batch_size
5
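Because train_best=true (see the printout above), fitting the tuned machine has also retrained the best model on the supplied data, so m can be used directly for prediction, for example:

# Predict with the best model found during tuning:
tuned_preds = MLJ.predict(m, features[test, :])
print(tuned_preds[1:3])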