FactorAnalysis
A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.
From MLJ, the type can be imported using
FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats
Do model = FactorAnalysis() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).
Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variable given the latent variable is diagonal rather than isotropic.
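The contrast with probabilistic PCA can be written out in the usual textbook form. The symbols below (x for an observation, z for the latent factors, W for the loading matrix, μ for the mean, Ψ and σ² for the noise covariance) are notation assumed here for illustration and do not come from the package itself:

```latex
\begin{aligned}
z &\sim \mathcal{N}(0, I), \\
x \mid z &\sim \mathcal{N}(W z + \mu,\ \Psi), \\
\text{factor analysis:}\quad \Psi &= \operatorname{diag}(\psi_1, \dots, \psi_d), \\
\text{probabilistic PCA:}\quad \Psi &= \sigma^2 I, \\
\operatorname{Cov}(x) &= W W^{\top} + \Psi .
\end{aligned}
```

Under this notation, factor analysis allows each observed variable its own noise variance, whereas probabilistic PCA forces a single shared variance.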
Training data
In MLJ or MLJBase, bind an instance model to data with
mach = machine(model, X)
Here:
X is any table of input features (e.g., a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
Train the machine using fit!(mach, rows=...).
Hyper-parameters
method::Symbol=:cm: Method used to fit the model, one of :em (expectation-maximization) or :cm (conditional maximization).
maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
maxiter::Int=1000: Maximum number of iterations.
tol::Real=1e-6: Convergence tolerance.
eta::Real=tol: Variance lower bound.
mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0, no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.
Operations
transform(mach, Xnew): Return a lower-dimensional projection of the input Xnew, which should have the same scitype as X above.
inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(mach, Xsmall) is only an approximation to Xnew.
Fitted parameters
The fields of fitted_params(mach) are:
projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a factor.
Report
The fields of report(mach) are:
indim: Dimension (number of columns) of the training data and new data to be transformed.
outdim: Dimension of transformed data (number of factors).
variance: The variance of the factors.
covariance_matrix: The estimated covariance matrix.
mean: The mean of the untransformed training data, of length indim.
loadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.
Examples
using MLJ
FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats
X, y = @load_iris ## a table and a vector
model = FactorAnalysis(maxoutdim=2)
mach = machine(model, X) |> fit!
Xproj = transform(mach, X)
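A possible continuation of the example, sketched on the assumption that the operations, fitted parameters and report fields behave as described in the sections above. The variable names Xapprox, fp and r are illustrative only, and verbosity=0 is added just to silence loading messages:

```julia
using MLJ

FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats verbosity=0
X, y = @load_iris  # a table and a vector

model = FactorAnalysis(maxoutdim=2)
mach = machine(model, X) |> fit!

# Project to the factor space, then map back to the original feature space.
Xproj = transform(mach, X)
Xapprox = inverse_transform(mach, Xproj)  # only an approximation to X

# Inspect what was learned.
fp = fitted_params(mach)
size(fp.projection)   # documented as (indim, outdim)

r = report(mach)
r.indim, r.outdim     # input dimension and number of factors
```

Comparing Xapprox with X column by column gives a rough sense of how much information the chosen number of factors retains.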
See also KernelPCA, ICA, PPCA, PCA