Feature Selection
For more on feature selection tools, refer to the FeatureSelection.jl documentation.
Reference
FeatureSelection.FeatureSelector (Type)
FeatureSelector
A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.
From MLJ, the type can be imported using
FeatureSelector = @load FeatureSelector pkg=FeatureSelection
Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).
Use this model to select features (columns) of a table, usually as part of a model Pipeline.
Training data
In MLJ or MLJBase, bind an instance model to data with
mach = machine(model, X)
where
X: any table of input features, where "table" is in the sense of Tables.jl
Train the machine using fit!(mach, rows=...).
Hyper-parameters
features: one of the following, with the behavior indicated:
  [] (empty, the default): filter out all features (columns) which were not encountered in training
  non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)
  function or other callable: keep a feature if the callable returns true on its name (ignore=false) or false on its name (ignore=true). For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.
ignore: whether to ignore or keep specified features, as explained above
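The interaction between features and ignore can be sketched in plain Julia. This is only an illustration of the rules listed above, not the package's implementation: the helper name features_to_keep_sketch and the variable seen are hypothetical, and the handling of an empty features vector (keep everything seen in training, regardless of ignore) is an assumption.

```julia
# Hypothetical helper (not part of FeatureSelection.jl): given the feature
# names seen in training, return the names a selector would keep.
function features_to_keep_sketch(features, ignore::Bool, seen::Vector{Symbol})
    if features isa AbstractVector
        # Assumption: an empty vector keeps every feature seen in training.
        isempty(features) && return seen
        # Keep listed names (ignore=false) or unlisted names (ignore=true).
        return filter(name -> (name in features) != ignore, seen)
    else
        # Callable: keep names for which it returns true; inverted when ignore=true.
        return filter(name -> features(name) != ignore, seen)
    end
end

seen = [:x1, :x2, :x3, :x4]
by_vector   = features_to_keep_sketch([:x1, :x3], true, seen)
by_callable = features_to_keep_sketch(name -> name in [:x1, :x3], true, seen)
keep_listed = features_to_keep_sketch([:x1, :x3], false, seen)
keep_all    = features_to_keep_sketch(Symbol[], false, seen)
```

Here by_vector and by_callable select the same columns, which mirrors the equivalence stated for the two FeatureSelector calls in the example above.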
Operations
transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant
Fitted parameters
The fields of fitted_params(mach) are:
features_to_keep: the features that will be selected
Example
using MLJ
X = (ordinal1 = [1, 2, 3],
     ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
     ordinal3 = [10.0, 20.0, 30.0],
     ordinal4 = [-20.0, -30.0, -40.0],
     nominal = coerce(["Your father", "he", "is"], Multiclass));
selector = FeatureSelector(features=[:ordinal3], ignore=true);
julia> transform(fit!(machine(selector, X)), X)
(ordinal1 = [1, 2, 3],
 ordinal2 = CategoricalValue{String,UInt32}["x", "y", "x"],
 ordinal4 = [-20.0, -30.0, -40.0],
 nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)
FeatureSelection.RecursiveFeatureElimination (Function)
RecursiveFeatureElimination(model; n_features=0, step=1)
This model implements a recursive feature elimination algorithm for feature selection. It recursively removes features, training a base model on the remaining features and evaluating their importance until the desired number of features is selected.
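The elimination loop described above can be sketched in plain Julia. This is a minimal illustration under stated assumptions, not the package's code: rfe_sketch is a hypothetical name, and importance is a hypothetical stand-in for "train the base model on these features and return one importance score per feature".

```julia
# Hypothetical sketch of recursive feature elimination (not FeatureSelection.jl code).
# `importance(names)` must return one score per name, higher meaning more important.
function rfe_sketch(importance, names::Vector{Symbol}, n_keep::Int; step::Int=1)
    remaining = copy(names)
    while length(remaining) > n_keep
        scores = importance(remaining)                  # re-evaluate on what is left
        n_drop = min(step, length(remaining) - n_keep)  # never drop below n_keep
        order = sortperm(scores)                        # least important first
        drop = Set(remaining[order[1:n_drop]])
        remaining = filter(name -> !(name in drop), remaining)
    end
    return remaining
end

# Toy importance with fixed weights, so the outcome is easy to follow.
toy_weights = Dict(:a => 0.9, :b => 0.1, :c => 0.5, :d => 0.05, :e => 0.7)
toy_importance(names) = [toy_weights[n] for n in names]

selected = rfe_sketch(toy_importance, [:a, :b, :c, :d, :e], 2; step=2)
```

With these toy weights, the first pass drops :d and :b, the second pass drops :c, and :a and :e remain. A real base model would generally produce different scores after each refit, which is why the scores are recomputed inside the loop.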
Training data
In MLJ or MLJBase, bind an instance rfe_model to data with
mach = machine(rfe_model, X, y)
OR, if the base model supports weights, as
mach = machine(rfe_model, X, y, w)
Here:
X is any table of input features (e.g., a DataFrame) whose columns have the scitype required by the base model; check column scitypes with schema(X) and the column scitypes required by the base model with input_scitype(basemodel).
y is the target, which can be any table of responses whose element scitype is Continuous or Finite, depending on the target_scitype required by the base model; check the scitype with scitype(y).
w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. These are distinct from any weights hyper-parameter belonging to the base model.
Train the machine using fit!(mach, rows=...).
Hyper-parameters
model: A base model with a fit method that provides information on feature importance (i.e., reports_feature_importances(model) == true)
n_features::Real = 0: The number of features to select. If 0, half of the features are selected. If a positive integer, the parameter is the absolute number of features to select. If a real number between 0 and 1, it is the fraction of features to select.
step::Real = 1: If the value of step is at least 1, it is the number of features to eliminate in each iteration. If step falls strictly between 0.0 and 1.0, it is the proportion (rounded down) of features to remove in each iteration.
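To make the two numeric hyper-parameters concrete, the sketch below turns n_features and step into integer counts for a table with a given number of columns. The helper names are hypothetical and this is not the package's code. Several details are assumptions: rounding half of an odd column count down, rounding a fractional n_features to the nearest integer, taking the step proportion relative to the total number of features, and enforcing a minimum of one feature.

```julia
# Hypothetical helpers (not FeatureSelection.jl code) mirroring the documented rules.
function n_features_to_select(n_features::Real, total::Int)
    # 0 means "half of the features"; rounding down is an assumption.
    n_features == 0 && return div(total, 2)
    # A value strictly between 0 and 1 is a fraction; the rounding is an assumption.
    n_features < 1 && return max(1, round(Int, n_features * total))
    # Otherwise an absolute count (assumed to be integer-valued).
    return Int(n_features)
end

function n_features_per_step(step::Real, total::Int)
    # At least 1: the number of features removed per iteration.
    step >= 1 && return Int(step)
    # Strictly between 0 and 1: a proportion, rounded down (minimum of 1 assumed).
    return max(1, floor(Int, step * total))
end

half     = n_features_to_select(0, 10)
absolute = n_features_to_select(3, 10)
fraction = n_features_to_select(0.5, 10)
per_iter = n_features_per_step(0.25, 10)
```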
Operations
transform(mach, X): transform the input table X into a new table containing only columns corresponding to features accepted by the RFE algorithm.
predict(mach, X): transform the input table X into a new table, as in transform(mach, X) above, and predict using the fitted base model on the transformed table.
Fitted parameters
The fields of fitted_params(mach) are:
features_left: names of features remaining after recursive feature elimination.
model_fitresult: fitted parameters of the base model.
Report
The fields of report(mach) are:
scores: dictionary of scores for each feature in the training dataset. The model deems highly scored variables more significant.
model_report: report for the fitted base model.
Examples
The following example assumes you have MLJDecisionTreeInterface in the active package environment.
using MLJ
RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
# Create a dataset where the target depends only on the first 5 columns of the input table.
A = rand(50, 10);
y = 10 .* sin.(pi .* A[:, 1] .* A[:, 2]) .+ 20 .* (A[:, 3] .- 0.5).^2 .+ 10 .* A[:, 4] .+ 5 .* A[:, 5];
X = MLJ.table(A);
# fit an RFE model:
rf = RandomForestRegressor()
selector = RecursiveFeatureElimination(rf, n_features=2)
mach = machine(selector, X, y)
fit!(mach)
# view the feature importances
feature_importances(mach)
# predict using the base model trained on the reduced feature set:
Xnew = MLJ.table(rand(50, 10));
predict(mach, Xnew)
# transform data with all features to the reduced feature set:
transform(mach, Xnew)