SimpleImputer
mutable struct SimpleImputer <: MLJModelInterface.Unsupervised
Impute missing values using a descriptive statistic of each feature (column), the mean by default, with optional record normalisation (using l-norm norms, where norm is the hyperparameter below), from the Beta Machine Learning Toolkit (BetaML).
Hyperparameters:
statistic::Function
: The descriptive statistic of the column (feature) to use as the imputed value [default: mean]
norm::Union{Nothing, Int64}
: Normalise the feature mean by the l-norm of the records, i.e. the l-p norm with p = norm [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity exports of different countries).
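To make the role of statistic concrete, here is a minimal sketch in plain Julia of column-wise imputation without record normalisation. The helper name impute_columns and the toy matrix are made up for illustration; this is not BetaML's code, only an instance of the idea that each missing entry takes a statistic of the observed values in its column.

```julia
using Statistics

# Hypothetical helper (not part of BetaML): replace each missing entry with
# a statistic of the non-missing values found in the same column.
# Assumes every column has at least one observed value.
function impute_columns(X::AbstractMatrix; statistic::Function = mean)
    col_stats = [statistic(collect(skipmissing(col))) for col in eachcol(X)]
    return [ismissing(X[r, c]) ? col_stats[c] : X[r, c]
            for r in axes(X, 1), c in axes(X, 2)]
end

# Toy data invented for this sketch.
X = [1.0 10.0; missing 20.0; 3.0 missing; 8.0 60.0]

X_mean   = impute_columns(X)                      # fills with column means 4.0 and 30.0
X_median = impute_columns(X; statistic = median)  # fills with column medians 3.0 and 20.0
```

Passing a different function, such as median, changes only the value used to fill the gaps; observed entries are left untouched.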
Example:
julia> using MLJ
julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
julia> modelType = @load SimpleImputer pkg = "BetaML" verbosity=0
BetaML.Imputation.SimpleImputer
julia> model = modelType(norm=1)
SimpleImputer(
  statistic = Statistics.mean,
  norm = 1)
julia> mach = machine(model, X);
julia> fit!(mach);
[ Info: Training machine(SimpleImputer(statistic = mean, …), …).
julia> X_full = transform(mach) |> MLJ.matrix
9×2 Matrix{Float64}:
 1.0        10.5
 1.5         0.295466
 1.8         8.0
 1.7        15.0
 3.2        40.0
 0.280952    1.69524
 3.3        38.0
 0.0750839  -2.3
 5.2        -2.4
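The imputed values in the example above are consistent with the following plain-Julia sketch of the norm=1 case. It is an illustration under stated assumptions, not BetaML's implementation: the name impute_normalised, the keyword p, and the exact scaling rule (column mean multiplied by the record's share of the summed per-record norms) are assumptions chosen because they reproduce the printed numbers.

```julia
using Statistics

# Illustrative sketch only; `impute_normalised` and `p` are hypothetical names.
function impute_normalised(X::AbstractMatrix; p::Int = 1)
    col_means = [mean(skipmissing(col)) for col in eachcol(X)]

    # Per-record size: l-p norm of the observed values divided by how many
    # values were observed. Records with no observed value stay `missing` here.
    row_size = map(eachrow(X)) do row
        observed = collect(skipmissing(row))
        isempty(observed) ? missing : sum(abs.(observed) .^ p)^(1 / p) / length(observed)
    end

    # Assumption: fully missing records get the average size of the other records.
    row_size = coalesce.(row_size, mean(skipmissing(row_size)))

    # Each record's share of the total size scales the column mean.
    share = row_size ./ sum(row_size)
    return [ismissing(X[r, c]) ? col_means[c] * share[r] : X[r, c]
            for r in axes(X, 1), c in axes(X, 2)]
end

# Same data as in the example above.
X = [1 10.5; 1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4]
X_full = impute_normalised(X; p = 1)
```

Under these assumptions a record with small observed values, such as the second row, receives a small imputed value (about 0.295 rather than the plain column mean of about 15.26), which is the intended effect when records differ widely in scale.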