Homogeneous Ensembles
MLJ.EnsembleModel — Function

    EnsembleModel(atom=nothing,
                  weights=Float64[],
                  bagging_fraction=0.8,
                  n=100,
                  rng=GLOBAL_RNG,
                  parallel=true,
                  out_of_bag_measure=[])
Create a model for training an ensemble of n learners, with optional bagging, each with associated model atom. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (i.e., the atomic model is stochastic, such as a decision tree with randomized node-selection criteria), or if bagging_fraction is set to a value less than 1.0, or both. The constructor fails if no atom is specified.
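
As a minimal usage sketch (the atomic model, data, and hyperparameter values here are illustrative assumptions, not part of the signature above):

    using MLJ

    # Load an atomic model type (assumes DecisionTree.jl is installed):
    Tree = @load DecisionTreeRegressor pkg=DecisionTree

    X, y = make_regression(100, 3)   # synthetic regression data

    # Bagged ensemble of 50 trees, each seeing 70% of the training rows:
    ensemble = EnsembleModel(atom=Tree(), n=50, bagging_fraction=0.7)

    mach = machine(ensemble, X, y)
    fit!(mach)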
Only atomic models supporting targets with scitype AbstractVector{<:Finite} (univariate classifiers) or AbstractVector{<:Continuous} (univariate regressors) are supported.
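
One can verify that a candidate atom satisfies this constraint before ensembling; a quick check, under the assumptions of the sketch above:

    # Confirm the atom is a supported univariate regressor:
    target_scitype(Tree()) <: AbstractVector{<:Continuous}   # true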
If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.
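
For example, the following two constructions are equivalent (continuing the sketch above):

    using Random

    # An integer seed is wrapped internally:
    EnsembleModel(atom=Tree(), rng=123)

    # ... which is the same as passing the AbstractRNG explicitly:
    EnsembleModel(atom=Tree(), rng=MersenneTwister(123))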
The atomic predictions are weighted according to the vector weights (to allow for external optimization), except in the case that atom is a Deterministic classifier. Uniform weights are used if weights has zero length.
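
A sketch of external weighting (the weight values are arbitrary, chosen only for illustration):

    # Four ensemble members with hand-chosen, illustrative weights:
    EnsembleModel(atom=Tree(), n=4, weights=[0.4, 0.3, 0.2, 0.1])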
The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.
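
Continuing the deterministic sketch above, predictions are plain averages; the comment indicates what a probabilistic atom would return instead:

    yhat = predict(mach, X)   # deterministic atom: averaged point predictions

    # With a probabilistic atomic regressor, the same call would return a
    # vector of Distributions.MixtureModel objects; point predictions could
    # then be recovered with mean.(yhat).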
If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are reported.
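
For instance, an out-of-bag root-mean-square error estimate might be requested as follows (rms is exported by MLJ; the report field name below is an assumption of this sketch):

    ensemble = EnsembleModel(atom=Tree(),
                             n=50,
                             bagging_fraction=0.7,
                             out_of_bag_measure=[rms])

    mach = machine(ensemble, X, y)
    fit!(mach)

    # Out-of-bag estimates appear in the training report; the exact
    # field name here is assumed for illustration:
    report(mach).oob_measurements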