MultitargetGaussianMixtureRegressor

mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor obtained by fitting the data to a probabilistic model (a Gaussian Mixture Model). Relatively fast, but generally not very precise unless the data has a structure matching the chosen underlying mixture.

This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the [?GMM](@ref GMM) module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply as a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type (see the sketch after this list). Note that mixing different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set differently from minimum_variance (see notes).

  • initialisation_strategy::String: The strategy used to compute the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixtures provided in the fully qualified mixtures parameter
    • "kmeans": first run kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in GMM-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]
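
For example (an illustrative sketch, not a verified session; the mixture means and variances below are arbitrary demonstration values), mixtures can be passed either as a type, which is automatically expanded to n_classes mixtures, or as a fully qualified vector together with initialisation_strategy="given":

julia> using MLJ, BetaML

julia> modelType = @load MultitargetGaussianMixtureRegressor pkg = "BetaML" verbosity=0;

julia> m1 = modelType(n_classes=2, mixtures=BetaML.GMM.SphericalGaussian);  # expanded to 2 spherical Gaussians

julia> m2 = modelType(n_classes=2,
                      mixtures=[BetaML.GMM.DiagonalGaussian([1.0,1.0],[0.5,0.5]),   # illustrative means/variances
                                BetaML.GMM.DiagonalGaussian([3.0,3.0],[0.5,0.5])],
                      initialisation_strategy="given");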

Example:

julia> using MLJ

julia> X, y        = @load_boston;

julia> ydouble     = hcat(y, y .* 2 .+ 5);

julia> modelType   = @load MultitargetGaussianMixtureRegressor pkg = "BetaML" verbosity=0
BetaML.GMM.MultitargetGaussianMixtureRegressor

julia> model       = modelType()
MultitargetGaussianMixtureRegressor(
  n_classes = 3, 
  initial_probmixtures = Float64[], 
  mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)], 
  tol = 1.0e-6, 
  minimum_variance = 0.05, 
  minimum_covariance = 0.0, 
  initialisation_strategy = "kmeans", 
  maximum_iterations = 9223372036854775807, 
  rng = Random._GLOBAL_RNG())

julia> mach        = machine(model, X, ydouble);

julia> fit!(mach);
[ Info: Training machine(MultitargetGaussianMixtureRegressor(n_classes = 3, …), …).
Iter. 1:        Var. of the post  20.46947926187522       Log-likelihood -23662.72770575145

julia> ŷdouble    = predict(mach, X)
506×2 Matrix{Float64}:
 23.3358  51.6717
 23.3358  51.6717
  ⋮       
 16.6843  38.3686
 16.6843  38.3686
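
As a follow-up not part of the original session (a minimal sketch, assuming the rms measure re-exported by MLJ is in scope), one could gauge the in-sample fit per target column; actual values depend on the run:

julia> rms(ŷdouble[:,1], ydouble[:,1]);   # in-sample RMSE on the first target

julia> rms(ŷdouble[:,2], ydouble[:,2]);   # in-sample RMSE on the second target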