GaussianMixtureClusterer
```julia
mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised
```
An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
- `n_classes::Int64`: Number of mixtures (latent classes) to consider [def: 3]
- `initial_probmixtures::AbstractVector{Float64}`: Initial probabilities of the categorical distribution (n_classes x 1) [default: `[]`]
- `mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}`: An array (of length `n_classes`) of the mixtures to employ (see the `?GMM` module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the `initialisation_strategy` parameter is set to "given". This parameter can also be given simply as a type; in that case it is automatically extended to a vector of `n_classes` mixtures of the specified type. Note that mixing different mixture types is not currently supported. [def: `[DiagonalGaussian() for i in 1:n_classes]`]
- `tol::Float64`: Tolerance to stop the algorithm [default: 10^(-6)]
- `minimum_variance::Float64`: Minimum variance for the mixtures [default: 0.05]
- `minimum_covariance::Float64`: Minimum covariance for mixtures with a full covariance matrix [default: 0]. This should be set to a value different from `minimum_variance` (see notes).
- `initialisation_strategy::String`: The method used to compute the vector of initial mixtures. One of the following:
  - "grid": use a grid approach
  - "given": use the mixtures provided in the fully qualified `mixtures` parameter
  - "kmeans": first run kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

  Note that the "random" and "shuffle" initialisations are not currently supported in GMM-based algorithms.
- `maximum_iterations::Int64`: Maximum number of iterations [def: `typemax(Int64)`, i.e. ∞]
- `rng::Random.AbstractRNG`: Random Number Generator [default: `Random.GLOBAL_RNG`]
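To illustrate how these hyperparameters interact, here is a minimal construction sketch (not part of the package's own example below): the mixture family is passed as a bare type, so it is automatically extended to `n_classes` mixtures, and `minimum_covariance` is set, which is only meaningful for full-covariance mixtures. The availability of `FullGaussian` in `BetaML.GMM` and the specific values shown are assumptions for illustration, not documented defaults.

```julia
# Minimal sketch (assumption: BetaML.GMM provides a `FullGaussian`
# mixture type; all values below are illustrative, not defaults).
using MLJ
import BetaML, Random

modelType = @load GaussianMixtureClusterer pkg = "BetaML" verbosity = 0
model = modelType(
    n_classes          = 2,
    mixtures           = BetaML.GMM.FullGaussian,  # a type: auto-extended to n_classes mixtures
    minimum_covariance = 0.01,                     # relevant only for full-covariance mixtures
    tol                = 1e-8,
    rng                = Random.MersenneTwister(123),  # fixed seed for reproducible runs
)
```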
Example:
```julia
julia> using MLJ
julia> X, y = @load_iris;
julia> modelType = @load GaussianMixtureClusterer pkg = "BetaML" verbosity=0
BetaML.GMM.GaussianMixtureClusterer
julia> model = modelType()
GaussianMixtureClusterer(
n_classes = 3,
initial_probmixtures = Float64[],
mixtures = BetaML.GMM.DiagonalGaussian{Float64}[BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing), BetaML.GMM.DiagonalGaussian{Float64}(nothing, nothing)],
tol = 1.0e-6,
minimum_variance = 0.05,
minimum_covariance = 0.0,
initialisation_strategy = "kmeans",
maximum_iterations = 9223372036854775807,
rng = Random._GLOBAL_RNG())
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(GaussianMixtureClusterer(n_classes = 3, …), …).
Iter. 1: Var. of the post 10.800150114964184 Log-likelihood -650.0186451891216
julia> classes_est = predict(mach, X)
150-element CategoricalDistributions.UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.17e-15, 3=>2.1900000000000003e-31)
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>1.25e-13, 3=>5.87e-31)
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>4.5e-15, 3=>1.55e-32)
UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>6.93e-14, 3=>3.37e-31)
⋮
UnivariateFinite{Multiclass{3}}(1=>5.39e-25, 2=>0.0167, 3=>0.983)
UnivariateFinite{Multiclass{3}}(1=>7.5e-29, 2=>0.000106, 3=>1.0)
UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)
```
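Since `predict` returns probabilistic (`UnivariateFinite`) predictions, a common follow-up is to extract hard cluster assignments. A short sketch using standard MLJ idioms:

```julia
# Hard cluster labels from the probabilistic predictions above:
hard_labels = mode.(classes_est)     # element-wise most probable class
# or, equivalently, in a single call:
hard_labels = predict_mode(mach, X)
```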