Machines
Under the hood, calling `fit!` on a machine calls either `MLJBase.fit` or `MLJBase.update`, depending on the machine's internal state (as recorded in the private fields `old_model` and `old_rows`). These lower-level `fit` and `update` methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional `rows` keyword argument of `fit!` (all rows by default). In this way, if an `update` method has been implemented for the model, calls to `fit!` can avoid redundant calculations for certain kinds of model mutations (e.g., increasing the number of epochs in a neural network).
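For example, the ensemble model below supports such updates, as the logging in the subsequent calls to `fit!` demonstrates: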
```julia
forest = EnsembleModel(atom=(@load DecisionTreeClassifier), n=10);
X, y = @load_iris;
mach = machine(forest, X, y)
fit!(mach, verbosity=2);
```

```
Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172 trained 1 time.
  args:
    1:  Source @156 ⏎ `Table{AbstractArray{Continuous,1}}`
    2:  Source @092 ⏎ `AbstractArray{Multiclass{3},1}`
```
Generally, changing a hyperparameter triggers retraining on subsequent calls to `fit!`:
```julia
julia> forest.bagging_fraction = 0.5
0.5

julia> fit!(mach, verbosity=2);
[ Info: Updating Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172.
[ Info: Truncating existing ensemble.
```
However, for this iterative model, increasing the iteration parameter only adds models to the existing ensemble:
```julia
julia> forest.n = 15
15

julia> fit!(mach, verbosity=2);
[ Info: Updating Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172.
[ Info: Building on existing ensemble of length 10
[ Info: One hash per new atom trained:
#####
```
Call `fit!` again without making a change and no retraining occurs:
```julia
julia> fit!(mach);
[ Info: Not retraining Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172. Use `force=true` to force.
```
However, retraining can be forced:
```julia
julia> fit!(mach, force=true);
[ Info: Training Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172.
```
Retraining is also triggered if the view of the data changes:
```julia
julia> fit!(mach, rows=1:100);
[ Info: Training Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172.

julia> fit!(mach, rows=1:100);
[ Info: Not retraining Machine{ProbabilisticEnsembleModel{DecisionTreeClassifier}} @172. Use `force=true` to force.
```
Inspecting machines
There are two methods for inspecting the outcomes of training in MLJ. To obtain a named tuple describing the learned parameters (in a user-friendly way where possible) use `fitted_params(mach)`. All other training-related outcomes are inspected with `report(mach)`.
```julia
X, y = @load_iris
pca = @load PCA
mach = machine(pca, X)
fit!(mach)
```

```
Machine{PCA} @511 trained 1 time.
  args:
    1:  Source @717 ⏎ `Table{AbstractArray{Continuous,1}}`
```
```julia
julia> fitted_params(mach)
(projection = [-0.3615896773814497 0.656539883285832 0.5809972798276167; 0.08226888989221419 0.7297123713264958 -0.596418087938103; -0.8565721052905279 -0.17576740342865457 -0.07252407548696274; -0.3588439262482157 -0.07470647013503341 -0.549060910726604],)
```
```julia
julia> report(mach)
(indim = 4,
 outdim = 3,
 tprincipalvar = 4.545608248041779,
 tresidualvar = 0.02368302712600201,
 tvar = 4.569291275167781,
 mean = [5.8433333333333355, 3.054000000000001, 3.7586666666666697, 1.1986666666666674],
 principalvars = [4.224840768320109, 0.24224357162751542, 0.07852390809415459],)
```
MLJModelInterface.fitted_params — Function

```
fitted_params(mach)
```

Return the learned parameters for a machine `mach` that has been `fit!`, for example the coefficients in a linear model. This is a named tuple, and human-readable if possible.

If `mach` is a machine for a composite model, such as a model constructed using `@pipeline`, then the returned named tuple has the composite type's field names as keys. The corresponding value is the fitted parameters for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)
```julia
using MLJ
@load LogisticClassifier pkg=MLJLinearModels
X, y = @load_crabs;
pipe = @pipeline Standardizer LogisticClassifier
mach = machine(pipe, X, y) |> fit!

julia> fitted_params(mach).logistic_classifier
(classes = CategoricalArrays.CategoricalValue{String,UInt32}["B", "O"],
 coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471],
 intercept = 0.0883301599726305,)
```
Additional keys, `machines` and `fitted_params_given_machine`, give a list of all machines in the underlying network, and a dictionary of fitted parameters keyed on those machines.
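Continuing the pipeline example above, a minimal sketch of accessing these keys:

```julia
fp = fitted_params(mach)
fp.machines                       # all machines in the underlying network
fp.fitted_params_given_machine    # dictionary of fitted parameters, keyed on those machines
```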
MLJBase.report — Function

```
report(mach)
```

Return the report for a machine `mach` that has been `fit!`, for example the coefficients in a linear model. This is a named tuple, and human-readable if possible.

If `mach` is a machine for a composite model, such as a model constructed using `@pipeline`, then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)
```julia
using MLJ
@load LinearBinaryClassifier pkg=GLM
X, y = @load_crabs;
pipe = @pipeline Standardizer LinearBinaryClassifier
mach = machine(pipe, X, y) |> fit!

julia> report(mach).linear_binary_classifier
(deviance = 3.8893386087844543e-7,
 dof_residual = 195.0,
 stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],
 vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)
```
Additional keys, `machines` and `report_given_machine`, give a list of all machines in the underlying network, and a dictionary of reports keyed on those machines.
Constructing machines
A machine is constructed with the syntax `machine(model, args...)`, where the possibilities for `args` (called training arguments) are summarized in the table below. Here `X` and `y` represent inputs and target, respectively, and `Xout` the output of a `transform` call. Machines for supervised models may have additional training arguments, such as a vector of per-observation weights (in which case `supports_weights(model) == true`); see the sketch following the table.
| model supertype | machine constructor calls | operation calls (first compulsory) |
|---|---|---|
| `Deterministic <: Supervised` | `machine(model, X, y, extras...)` | `predict(mach, Xnew)`, `transform(mach, Xnew)`, `inverse_transform(mach, Xout)` |
| `Probabilistic <: Supervised` | `machine(model, X, y, extras...)` | `predict(mach, Xnew)`, `predict_mean(mach, Xnew)`, `predict_median(mach, Xnew)`, `predict_mode(mach, Xnew)`, `transform(mach, Xnew)`, `inverse_transform(mach, Xout)` |
| `Unsupervised` (except `Static`) | `machine(model, X)` | `transform(mach, Xnew)`, `inverse_transform(mach, Xout)`, `predict(mach, Xnew)` |
| `Static` | `machine(model)` | `transform(mach, Xnews...)`, `inverse_transform(mach, Xout)` |
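A hedged sketch of the weighted case, with `DecisionTreeClassifier` used purely as a placeholder (whether weights are accepted depends on the model's `supports_weights` trait):

```julia
using MLJ

X, y = @load_iris
model = @load DecisionTreeClassifier
w = abs.(randn(length(y)))            # hypothetical per-observation weights

if supports_weights(model)            # weights only allowed when this trait holds
    mach = machine(model, X, y, w)    # weights supplied as an extra training argument
    fit!(mach)
end
```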
All operations on machines (`predict`, `transform`, etc.) have exactly one argument (`Xnew` or `Xout` above) after `mach`, the machine instance. An exception is a machine bound to a `Static` model, which can have any number of arguments after `mach`, as the sketch below illustrates. For more on `Static` transformers (which have no training arguments) see Static transformers.
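Here is a minimal sketch of the `Static` case, adapted from the Static transformers section of the manual (the `Averager` type is introduced purely for illustration):

```julia
using MLJ

# A static transformer: no training arguments, and `transform`
# accepts two arguments after the machine.
mutable struct Averager <: Static
    mix::Float64
end

MLJ.transform(a::Averager, _, y1, y2) = (1 - a.mix)*y1 + a.mix*y2

mach = machine(Averager(0.5)) |> fit!
transform(mach, [0.0, 1.0], [2.0, 4.0])   # expected: [1.0, 2.5]
```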
A machine is reconstructed from a file using the syntax `machine("my_machine.jlso")`, or `machine("my_machine.jlso", args...)` if retraining using new data. See Saving machines below.
Constructing machines in learning networks
Instead of data `X`, `y`, etc., the `machine` constructor is provided `Node` or `Source` objects ("dynamic data") when building a learning network, as sketched below. See Composing Models for more on this advanced feature. One also uses `machine` to wrap a machine around a whole learning network; see Learning network machines.
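A minimal sketch, assuming a supervised `model` and concrete data `X`, `y` are already defined:

```julia
Xs = source(X)                 # wrap the data in source nodes ("dynamic data")
ys = source(y)
mach = machine(model, Xs, ys)  # a machine bound to nodes instead of concrete data
fit!(mach)
```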
Saving machines
To save a machine to file, use the `MLJ.save` command:
```julia
tree = @load DecisionTreeClassifier
mach = fit!(machine(tree, X, y))
MLJ.save("my_machine.jlso", mach)
```
To de-serialize, one uses the `machine` constructor:
```julia
mach2 = machine("my_machine.jlso")
predict(mach2, Xnew);
```
The machine `mach2` cannot be retrained; however, by providing data to the constructor, one can enable retraining using the saved model hyperparameters (which overwrites the saved learned parameters):
```julia
mach3 = machine("my_machine.jlso", Xnew, ynew)
fit!(mach3)
```
Internals
For a supervised machine, the `predict` method calls a lower-level `MLJBase.predict` method, dispatched on the underlying model and the `fitresult` (see below). To see `predict` in action, as well as its unsupervised cousins `transform` and `inverse_transform`, see Getting Started.
The fields of a `Machine` instance (which should not generally be accessed by the user) are:

- `model` - the struct containing the hyperparameters to be used in calls to `fit!`
- `fitresult` - the learned parameters in a raw form, initially undefined
- `args` - a tuple of the data, each element wrapped in a source node; see Learning Networks (in the supervised learning example above, `args = (source(X), source(y))`)
- `report` - outputs of training not encoded in `fitresult` (eg, feature rankings)
- `old_model` - a deep copy of the model used in the last call to `fit!`
- `old_rows` - a copy of the row indices used in the last call to `fit!`
- `cache`
The interested reader can learn more about machine internals by examining the simplified code excerpt in Internals.
API Reference
MLJBase.machine — Function

```
machine(model, args...)
```

Construct a `Machine` object binding a `model` (which stores the hyper-parameters of some machine learning algorithm) to some data, `args`. When building a learning network, `Node` objects can be substituted for concrete data.
```
machine(Xs; oper1=node1, oper2=node2)
machine(Xs, ys; oper1=node1, oper2=node2)
machine(Xs, ys, extras...; oper1=node1, oper2=node2, ...)
```
Construct a special machine called a learning network machine, that "wraps" a learning network, usually in preparation to export the network as a stand-alone composite model type. The keyword arguments declare which nodes are called when operations, such as `predict` and `transform`, are called on the machine.

In addition to the operations named in the constructor, the methods `fit!`, `report`, and `fitted_params` can be applied as usual to the machine constructed.
```
machine(Probabilistic(), args...; kwargs...)
machine(Deterministic(), args...; kwargs...)
machine(Unsupervised(), args...; kwargs...)
machine(Static(), args...; kwargs...)
```
Same as above, but specifying explicitly the kind of model the learning network is meant to represent.
Learning network machines are not to be confused with an ordinary machine that happens to be bound to a stand-alone composite model (i.e., an exported learning network).
Examples
Supposing a supervised learning network's final predictions are obtained by calling a node `yhat`, then the code
```julia
mach = machine(Deterministic(), Xs, ys; predict=yhat)
fit!(mach; rows=train)
predictions = predict(mach, Xnew) # `Xnew` concrete data
```
is equivalent to
```julia
fit!(yhat, rows=train)
predictions = yhat(Xnew)
```
Here `Xs` and `ys` are the source nodes receiving, respectively, the input and target data.
In an unsupervised learning network for clustering, with a single source node `Xs` for inputs, and in which the node `Xout` delivers the output of dimension reduction, and `yhat` the class labels, one can write
```julia
mach = machine(Unsupervised(), Xs; transform=Xout, predict=yhat)
fit!(mach)
transformed = transform(mach, Xnew) # `Xnew` concrete data
predictions = predict(mach, Xnew)
```
which is equivalent to
```julia
fit!(Xout)
fit!(yhat)
transformed = Xout(Xnew)
predictions = yhat(Xnew)
```
StatsBase.fit! — Function

```
fit!(mach::Machine; rows=nothing, verbosity=1, force=false)
```

Fit the machine `mach`. In the case that `mach` has `Node` arguments, first train all other machines on which `mach` depends.

To attempt to fit a machine without touching any other machine, use `fit_only!`. For more on the internal logic of fitting, see `fit_only!`.
```
fit!(N::Node;
     rows=nothing,
     verbosity=1,
     force=false,
     acceleration=CPU1())
```
Train all machines required to call the node `N`, in an appropriate order. These machines are those returned by `machines(N)`.
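For example, a hedged sketch, assuming `yhat` is the terminal prediction node of some learning network built earlier:

```julia
machines(yhat)          # the machines on which `yhat` depends
fit!(yhat, rows=1:100)  # trains each of them, in an appropriate order
```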
```
fit!(mach::Machine{<:Surrogate};
     rows=nothing,
     acceleration=CPU1(),
     verbosity=1,
     force=false)
```
Train the complete learning network wrapped by the machine `mach`.

More precisely, if `s` is the learning network signature used to construct `mach`, then call `fit!(N)`, where `N = glb(values(s)...)` is a greatest lower bound on the nodes appearing in the signature. For example, if `s = (predict=yhat, transform=W)`, then call `fit!(glb(yhat, W))`. Here `glb` is `tuple` overloaded for nodes.
See also `machine`.
MLJBase.fit_only! — Function

```
MLJBase.fit_only!(mach::Machine; rows=nothing, verbosity=1, force=false)
```

Without mutating any other machine on which it may depend, perform one of the following actions to the machine `mach`, using the data and model bound to it, and restricting the data to `rows` if specified:
- Ab initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment `mach.state`.
- Training update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements `MLJBase.update`). Increment `mach.state`.
- No-operation. Leave existing learned parameters untouched. Do not increment `mach.state`.
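A minimal sketch of how `mach.state` tracks these actions, reusing the iris data and tree classifier from earlier examples (the exact action taken on the hyperparameter change depends on whether the model implements `MLJBase.update`):

```julia
using MLJ

X, y = @load_iris
tree = @load DecisionTreeClassifier
mach = machine(tree, X, y)

mach.state           # 0: never trained
fit!(mach)           # ab initio training
mach.state           # 1
fit!(mach)           # no-operation; nothing has changed
mach.state           # still 1

tree.max_depth = 2   # mutate a hyperparameter
fit!(mach)           # training update (or retraining, if no update is implemented)
mach.state           # 2
```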
Training action logic

For the action to be a no-operation, either `mach.frozen == true` or none of the following apply:

- (i) `mach` has never been trained (`mach.state == 0`).
- (ii) `force == true`.
- (iii) The `state` of some other machine on which `mach` depends has changed since the last time `mach` was trained (i.e., the last time `mach.state` was incremented).
- (iv) The specified `rows` have changed since the last retraining.
- (v) `mach.model` has changed since the last retraining.
In cases (i) - (iv), `mach` is trained ab initio. In case (v) a training update is applied.
To freeze or unfreeze `mach`, use `freeze!(mach)` or `thaw!(mach)`.
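For example (a sketch, assuming `mach` is the trained machine from the sketch above):

```julia
freeze!(mach)
fit!(mach)    # no-operation: the machine is frozen
thaw!(mach)
fit!(mach)    # training actions proceed as usual
```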
Implementation detail
The data to which a machine is bound is stored in `mach.args`. Each element of `args` is either a `Node` object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a `Source` node. In all cases, to obtain concrete data for actual training, each argument `N` is called, as in `N()` or `N(rows=rows)`, and either `MLJBase.fit` (ab initio training) or `MLJBase.update` (training update) is dispatched on `mach.model` and this data. See the "Adding models for general use" section of the MLJ documentation for more on these lower-level training methods.
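A minimal sketch of this calling convention for a `Source` node:

```julia
using MLJ

Xs = source((x1 = [10.0, 20.0, 30.0],))  # concrete data wrapped in a Source node
Xs()                                     # recovers the full table
Xs(rows=1:2)                             # the view restricted to rows 1 and 2
```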
MLJModelInterface.save — Function

```
MLJ.save(filename, mach::Machine; kwargs...)
MLJ.save(io, mach::Machine; kwargs...)
MLJBase.save(filename, mach::Machine; kwargs...)
MLJBase.save(io, mach::Machine; kwargs...)
```
Serialize the machine `mach` to a file with path `filename`, or to an input/output stream `io` (at least `IOBuffer` instances are supported).
The format is JLSO (a wrapper for Julia-native or BSON serialization). For some model types, a custom serialization is additionally performed.
Keyword arguments

These keyword arguments are passed to the JLSO serializer:

| keyword | values | default |
|---|---|---|
| `format` | `:julia_serialize`, `:BSON` | `:julia_serialize` |
| `compression` | `:gzip`, `:none` | `:none` |
See https://github.com/invenia/JLSO.jl for details.
Any additional keyword arguments are passed to model-specific serializers.

Machines are de-serialized using the `machine` constructor as shown in the example below. Data (or nodes) may be optionally passed to the constructor for retraining on new data using the saved model.
Example
```julia
using MLJ
tree = @load DecisionTreeClassifier
X, y = @load_iris
mach = fit!(machine(tree, X, y))

MLJ.save("tree.jlso", mach, compression=:none)
mach_predict_only = machine("tree.jlso")
predict(mach_predict_only, X)

mach2 = machine("tree.jlso", selectrows(X, 1:100), y[1:100])
predict(mach2, X) # same as above

fit!(mach2) # saved learned parameters are over-written
predict(mach2, X) # not same as above

# using a buffer:
io = IOBuffer()
MLJ.save(io, mach)
seekstart(io)
predict_only_mach = machine(io)
predict(predict_only_mach, X)
```
Maliciously constructed JLSO files, like pickles and most other general-purpose serialization formats, can allow arbitrary code execution during loading. This means it is possible for someone to use a JLSO file that looks like a serialized MLJ machine as a Trojan horse.