Composition
Composites
MLJBase.anonymize! — Method

anonymize!(sources...)

Returns a named tuple (sources=..., data=...) whose values are the provided source nodes and their contents, respectively, and clears the contents of those source nodes.
MLJBase.@from_network — Macro

@from_network NewCompositeModel(fld1=model1, fld2=model2, ...) <= N
@from_network NewCompositeModel(fld1=model1, fld2=model2, ...) <= N is_probabilistic=false

Create a new stand-alone model type called NewCompositeModel, using a learning network as a blueprint. Here N refers to the terminal node of the learning network (from which final predictions or transformations are fetched).

Important. If the learning network is supervised (has a source with kind=:target) and makes probabilistic predictions, then one must declare is_probabilistic=true. In the deterministic case the keyword argument can be omitted.
The model type NewCompositeModel is equipped with fields named :fld1, :fld2, ..., which correspond to component models model1, model2, ..., appearing in the network (which must therefore be elements of models(N)). Deep copies of the specified component models are used as default values in an automatically generated keyword constructor for NewCompositeModel.
Return value

A new NewCompositeModel instance, with default field values.

For details and examples refer to the "Learning Networks" section of the documentation.
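As a minimal sketch of export (assuming MLJBase's learning-network API, with hypothetical component models Standardizer and KNNRegressor loaded beforehand):

```julia
using MLJBase

# learning network blueprint:
Xs = source()              # input source (kind=:input)
ys = source(kind=:target)  # target source

stand = Standardizer()                 # hypothetical transformer
W = transform(machine(stand, Xs), Xs)  # standardized input

knn = KNNRegressor()                   # hypothetical supervised model
yhat = predict(machine(knn, W, ys), W) # terminal node of the network

# export as a stand-alone (deterministic) model type:
@from_network MyComposite(standardizer=stand, regressor=knn) <= yhat

composite = MyComposite()  # instance with deep-copied default field values
```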
Base.replace — Method

replace(W::Node, a1=>b1, a2=>b2, ...; empty_unspecified_sources=false)

Create a deep copy of a node W, and thereby replicate the learning network terminating at W, but with any specified sources and models a1, a2, ... of the original network replaced by b1, b2, ....

If empty_unspecified_sources=true, then any source nodes not specified are replaced with empty versions of the same kind.
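For example (a sketch, where W is the terminal node of an existing network, knn one of its models, and Xs one of its sources; all names are placeholders):

```julia
# replicate the network terminating at W, swapping a model and a source:
W2 = replace(W, knn => KNNRegressor(K=7), Xs => source(Xnew))

# replicate, additionally replacing all unspecified sources with empty
# versions of the same kind (useful prior to exporting the network):
W3 = replace(W, knn => KNNRegressor(K=7); empty_unspecified_sources=true)
```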
Networks
MLJBase.node — Type

N = node(f::Function, args...)

Defines a Node object N wrapping a static operation f and arguments args. Each of the n elements of args must be a Node or Source object. The node N has the following calling behaviour:

N() = f(args[1](), args[2](), ..., args[n]())
N(rows=r) = f(args[1](rows=r), args[2](rows=r), ..., args[n](rows=r))
N(X) = f(args[1](X), args[2](X), ..., args[n](X))
J = node(f, mach::NodalMachine, args...)

Defines a dynamic Node object J wrapping a dynamic operation f (predict, predict_mean, transform, etc.), a nodal machine mach and arguments args. Its calling behaviour, which depends on the outcome of training mach (and, implicitly, on training outcomes affecting its arguments), is this:

J() = f(mach, args[1](), args[2](), ..., args[n]())
J(rows=r) = f(mach, args[1](rows=r), args[2](rows=r), ..., args[n](rows=r))
J(X) = f(mach, args[1](X), args[2](X), ..., args[n](X))

Generally n=1 or n=2 in this latter case.
predict(mach, X::AbstractNode, y::AbstractNode)
predict_mean(mach, X::AbstractNode, y::AbstractNode)
predict_median(mach, X::AbstractNode, y::AbstractNode)
predict_mode(mach, X::AbstractNode, y::AbstractNode)
transform(mach, X::AbstractNode)
inverse_transform(mach, X::AbstractNode)

Shortcuts for J = node(predict, mach, X, y), etc.

Calling a node is a recursive operation which terminates in calls to one or more source nodes. Calling a node on new data X fails unless the number of such source nodes is one.
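The static case can be illustrated with a toy example (a sketch, assuming only source and node from MLJBase):

```julia
using MLJBase

X = (x1 = [1.0, 2.0, 3.0],)  # a column table
Xs = source(X)               # source node wrapping X

# static nodes wrapping ordinary functions:
N1 = node(X -> X.x1 .^ 2, Xs)        # one argument (n = 1)
N2 = node((a, b) -> a .+ b, N1, N1)  # two arguments (n = 2)

N2()           # computes (X.x1 .^ 2) .+ (X.x1 .^ 2)
N2(rows=1:2)   # the same computation, restricted to rows 1 and 2
```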
MLJBase.freeze! — Method

freeze!(mach)

Freeze the machine mach so that it will never be retrained (unless thawed).

See also thaw!.
MLJBase.machines — Method

machines(N)

List all machines in the learning network terminating at node N.
MLJBase.models — Method

models(N::AbstractNode)

A vector of all models referenced by a node N, each model appearing exactly once.
MLJBase.nodes — Method

nodes(N)

Return all nodes upstream of a node N, including N itself, in an order consistent with the extended directed acyclic graph of the network. Here "extended" means edges corresponding to training arguments are included.
MLJBase.origins — Method

origins(N)

Return a list of all origins of a node N accessed by a call N(). These are the source nodes of the directed acyclic graph (DAG) associated with the learning network terminating at N, if edges corresponding to training arguments are excluded. A Node object cannot be called on new data unless it has a unique origin.

Not to be confused with sources(N), which refers to the same graph but without the training edge deletions.
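The distinction shows up in any supervised network (a sketch; SomeRegressor is a placeholder model): the target source participates in training but is not reached by a call to the prediction node, so it is a source but not an origin:

```julia
Xs = source(X)               # kind=:input
ys = source(y, kind=:target)

mach = machine(SomeRegressor(), Xs, ys)
yhat = predict(mach, Xs)

origins(yhat)  # [Xs]     -- only sources reached by the call yhat()
sources(yhat)  # [Xs, ys] -- training edges included
```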
MLJBase.rebind! — Method

rebind!(s, X)

Attach new data X to an existing source node s.
MLJBase.source — Method

Xs = source(X)
ys = source(y, kind=:target)
ws = source(w, kind=:weight)

Defines, respectively, learning network Source objects for wrapping some input data X (kind=:input), some target data y, or some sample weights w. The values of each variable X, y, w can be anything, even nothing, if the network is for exporting as a stand-alone model only. For training and testing the unexported network, appropriate vectors, tables, or other data containers are expected.

Xs = source()
ys = source(kind=:target)
ws = source(kind=:weight)

Define source nodes wrapping nothing instead of concrete data. Such definitions suffice if a learning network is to be exported without testing.

The calling behaviour of a Source object is this:

Xs() = X
Xs(rows=r) = selectrows(X, r) # e.g., X[r, :] for a DataFrame
Xs(Xnew) = Xnew
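For instance (a sketch):

```julia
X = (age = [23, 45, 34], height = [1.80, 1.70, 1.64])
Xs = source(X)

Xs()             # returns X itself
Xs(rows=1:2)     # selectrows(X, 1:2), i.e. the first two rows
Xs((age=[50],))  # returns the new data unchanged
```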
MLJBase.sources — Method

sources(N::AbstractNode; kind=:any)

A vector of all sources referenced by calls N() and fit!(N). These are the sources of the directed acyclic graph associated with the learning network terminating at N, including training edges. The return value can be restricted further by specifying kind=:input, kind=:target, kind=:weight, etc.

Not to be confused with origins(N), which refers to the same graph with edges corresponding to training arguments deleted.
MLJBase.thaw! — Method

thaw!(mach)

Unfreeze the machine mach so that it can be retrained.

See also freeze!.

MLJModelInterface.selectcols — Method

selectcols(X::AbstractNode, c)

Returns a Node object N such that N() = selectcols(X(), c).
MLJModelInterface.selectrows — Method

selectrows(X::AbstractNode, r)

Returns a Node object N such that N() = selectrows(X(), r) (and N(rows=s) = selectrows(X(rows=s), r)).
StatsBase.fit! — Method

fit!(N::Node; rows=nothing, verbosity::Int=1, force::Bool=false)

Train all machines in the learning network terminating at node N, in an appropriate order. These machines are those returned by machines(N).
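Typical usage (a sketch, where N is the terminal prediction node of some learning network):

```julia
fit!(N, rows=1:100, verbosity=0)  # train all machines on the first 100 rows
N(rows=101:150)                   # call the trained network on held-out rows

fit!(N)              # machines not stale, so nothing is retrained
fit!(N, force=true)  # retrain every machine from scratch
```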
MLJBase.args — Method

args(tree; train=false)

Return a vector of the top-level args of the tree associated with a node. If train=true, return the train_args.
MLJBase.is_stale — Method

is_stale(N)

Check if a node N is stale.
MLJBase.reset! — Method

reset!(N::Node)

Place the learning network terminating at node N into a state in which fit!(N) will retrain from scratch all machines in its dependency tape. Does not actually train any machine or alter fit-results. (The method simply resets m.state to zero, for every machine m in the network.)
MLJBase.state — Method

state(mach)

Return the state of a machine, mach.
MLJBase.state — Method

state(N)

Return the state of a node N.
MLJBase.tree — Method

tree(N)

Return a description of the tree defined by the learning network terminating at a given node N.
Pipelines
MLJBase.@pipeline — Macro

@pipeline NewPipeType(fld1=model1, fld2=model2, ...)
@pipeline NewPipeType(fld1=model1, fld2=model2, ...) prediction_type=:probabilistic

Create a new pipeline model type NewPipeType that composes the types of the specified models model1, model2, ... . The models are composed in the specified order, meaning the input(s) of the pipeline goes to model1, whose output is sent to model2, and so forth.

At most one of the models may be a supervised model, in which case NewPipeType is supervised. Otherwise it is unsupervised.

The new model type NewPipeType has hyperparameters (fields) named :fld1, :fld2, ..., whose default values for an automatically generated keyword constructor are deep copies of model1, model2, ... .

Important. If the overall pipeline is supervised and makes probabilistic predictions, then one must declare prediction_type=:probabilistic. In the deterministic case no declaration is necessary.
Static (unlearned) transformations, that is, ordinary functions, may also be inserted in the pipeline, as shown in the following example (the classifier is probabilistic but the pipeline itself is deterministic):

@pipeline MyPipe(X -> coerce(X, :age=>Continuous),
                 hot=OneHotEncoder(),
                 cnst=ConstantClassifier(),
                 yhat -> mode.(yhat))
Return value
An instance of the new type, with default hyperparameters (see above), is returned.
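Continuing the example above (a sketch; the drop_last hyperparameter of OneHotEncoder is an assumption), the returned instance is used like any other model:

```julia
pipe = MyPipe()  # default hyperparameters: deep copies of the models above

# fields are mutable, like any other model hyperparameters:
pipe.hot = OneHotEncoder(drop_last=true)

# the pipeline is supervised (it contains ConstantClassifier), so:
mach = machine(pipe, X, y)
fit!(mach)
yhat = predict(mach, X)  # deterministic predictions (modes), per the final step
```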
Target transformation and inverse transformation

A learned target transformation (such as standardization) can also be specified, using the keyword target, provided the transformer provides an inverse_transform method:

@load KNNRegressor
@pipeline MyPipe(hot=OneHotEncoder(),
                 knn=KNNRegressor(),
                 target=UnivariateStandardizer())
A static transformation can be specified instead, but then an inverse must also be given:

@load KNNRegressor
@pipeline MyPipe(hot=OneHotEncoder(),
                 knn=KNNRegressor(),
                 target = v -> log.(v),
                 inverse = v -> exp.(v))

Important. While the supervised model in a pipeline providing a target transformation can appear anywhere in the pipeline (as in the ConstantClassifier example above), the inverse operation is always performed on the output of the final model or static transformation in the pipeline.
See also: @from_network
MLJBase.StaticTransformer — Type

Applies a given data transformation f (either a function or callable).

Field

f=identity: function or callable object to use for the data transformation.
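A sketch of use within a learning network (the machine-construction details here are an assumption and may vary with MLJBase version):

```julia
logger = StaticTransformer(f = v -> log.(v))  # wrap an ordinary function

Xs = source([1.0, 10.0, 100.0])
mach = machine(logger)   # static: no training arguments required (assumed)
W = transform(mach, Xs)  # node applying f lazily to the data at Xs
```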