Feature Selection

For more on feature selection tools, refer to the FeatureSelection.jl documentation.

Reference

FeatureSelection.FeatureSelector (Type)
FeatureSelector

A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=FeatureSelection

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training

    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)

    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.

  • ignore: whether to ignore or keep specified features, as explained above
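The rules above can be illustrated with a small sketch in plain Julia. The helper select_names and its arguments are hypothetical and are not part of FeatureSelection.jl; the sketch only assumes that ignore=true inverts whichever selection features describes, which is what the equivalence stated above suggests.

```julia
# Hypothetical helper; a sketch of the documented rules, not the library's code.
# `seen` holds the feature names encountered in training.
function select_names(seen::Vector{Symbol}, features, ignore::Bool)
    # Empty vector (the default): keep every feature seen in training.
    features isa AbstractVector && isempty(features) && return seen
    # Turn a vector of names into a predicate on a single name.
    matches = features isa AbstractVector ? (name -> name in features) : features
    # Assumption: `ignore=true` inverts the selection in both cases.
    return [name for name in seen if matches(name) != ignore]
end

seen = [:x1, :x2, :x3]
select_names(seen, [:x1, :x3], true)                  # vector form
select_names(seen, name -> name in [:x1, :x3], true)  # callable form
```

Under these assumptions both calls keep only :x2, matching the equivalence described for the callable form.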

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ

X = (ordinal1 = [1, 2, 3],
     ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
     ordinal3 = [10.0, 20.0, 30.0],
     ordinal4 = [-20.0, -30.0, -40.0],
     nominal = coerce(["Your father", "he", "is"], Multiclass));

selector = FeatureSelector(features=[:ordinal3], ignore=true);

julia> transform(fit!(machine(selector, X)), X)
(ordinal1 = [1, 2, 3],
 ordinal2 = CategoricalValue{String,UInt32}["x", "y", "x"],
 ordinal4 = [-20.0, -30.0, -40.0],
 nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

FeatureSelection.RecursiveFeatureElimination (Function)
RecursiveFeatureElimination(model; n_features=0, step=1)

This model implements a recursive feature elimination algorithm for feature selection. It recursively removes features, training a base model on the remaining features and evaluating their importance until the desired number of features is selected.
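The loop described above can be sketched in plain Julia without any MLJ machinery. This is only an illustration of the general idea, not the package's implementation: rfe_sketch and importance are hypothetical names, and the absolute Pearson correlation with the target stands in for the importances that a real base model would report after being retrained on the remaining features.

```julia
using Statistics

# Hypothetical stand-in for a base model's feature importances:
# absolute Pearson correlation of each remaining column with the target.
importance(A, y, cols) = [abs(cor(A[:, j], y)) for j in cols]

# Repeatedly drop the least important columns until `n_features` remain.
function rfe_sketch(A::AbstractMatrix, y::AbstractVector; n_features::Int, step::Int=1)
    remaining = collect(1:size(A, 2))
    while length(remaining) > n_features
        scores = importance(A, y, remaining)
        # Never remove more features than needed to reach the target count.
        n_drop = min(step, length(remaining) - n_features)
        # Positions (within `remaining`) of the `n_drop` lowest scores.
        drop = partialsortperm(scores, 1:n_drop)
        deleteat!(remaining, sort(drop))
    end
    return remaining
end

# Toy data: the target depends on column 1 only.
t = collect(1.0:8.0)
A = hcat(t, repeat([1.0, -1.0], 4), t .^ 2)
y = 2 .* t
rfe_sketch(A, y; n_features=2)   # indices of the two columns that survive
```

The stand-in score does not change between rounds, whereas a real base model is refitted on the reduced table each time, so its importances can shift as features are removed.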

Training data

In MLJ or MLJBase, bind an instance rfe_model to data with

mach = machine(rfe_model, X, y)

OR, if the base model supports weights, as

mach = machine(rfe_model, X, y, w)

Here:

  • X is any table of input features (e.g., a DataFrame) whose columns have the scitype required by the base model; check column scitypes with schema(X) and the column scitypes required by the base model with input_scitype(basemodel).

  • y is the target, which can be any table of responses whose element scitype is Continuous or Finite depending on the target_scitype required by the base model; check the scitype with scitype(y).

  • w is the observation weights, which can be either nothing (the default) or an AbstractVector whose element scitype is Count or Continuous.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • model: A base model with a fit method that provides information on feature importance (i.e., reports_feature_importances(model) == true)

  • n_features::Real = 0: The number of features to select. If 0, half of the features are selected. If a positive integer, the parameter is the absolute number of features to select. If a real number between 0 and 1, it is the fraction of features to select.

  • step::Real = 1: If step is at least 1, it is the number of features to eliminate in each iteration. If step lies strictly between 0.0 and 1.0, it is the proportion (rounded down) of features to remove in each iteration.
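To make the interplay of n_features and step concrete, here is a rough sketch in plain Julia of how the two values might resolve to feature counts. All three function names are hypothetical. The sketch also makes assumptions the text above does not settle: that "half" and fractional n_features round down, that a fractional step is taken relative to the initial number of features, and that at least one feature is removed per iteration.

```julia
# Hypothetical helpers; a sketch under stated assumptions, not the library's code.

# Resolve `n_features` to an absolute count, given `total` features.
function resolve_n_features(n_features::Real, total::Integer)
    n_features == 0 && return div(total, 2)                      # assumption: half, rounded down
    0 < n_features < 1 && return floor(Int, n_features * total)  # assumption: rounded down
    return Int(n_features)                                       # positive integer: absolute count
end

# Resolve `step` to the number of features removed per iteration.
function resolve_step(step::Real, total::Integer)
    # Proportion, rounded down; the floor of one feature is an assumption.
    0 < step < 1 && return max(1, floor(Int, step * total))
    return floor(Int, step)
end

# Number of features remaining after each elimination round.
function elimination_schedule(total::Integer, n_features::Real, step::Real)
    target = resolve_n_features(n_features, total)
    k = resolve_step(step, total)
    counts = [total]
    while counts[end] > target
        push!(counts, max(target, counts[end] - k))
    end
    return counts
end

elimination_schedule(10, 0, 3)       # default n_features, three features per round
elimination_schedule(8, 0.5, 0.25)   # fractional n_features and step
```

Under these assumptions, the first call passes through 10, 7 and 5 features, and the second through 8, 6 and 4.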

Operations

  • transform(mach, X): transform the input table X into a new table containing only columns corresponding to features accepted by the RFE algorithm.

  • predict(mach, X): reduce the input table X to the selected features, as in transform(mach, X) above, and predict using the fitted base model on the reduced table.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_left: names of features remaining after recursive feature elimination.

  • model_fitresult: fitted parameters of the base model.

Report

The fields of report(mach) are:

  • scores: dictionary of scores for each feature in the training dataset. The model deems highly scored variables more significant.

  • model_report: report for the fitted base model.

Examples

The following example assumes you have MLJDecisionTreeInterface in the active package environment.

using MLJ

RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree

# Creates a dataset where the target only depends on the first 5 columns of the input table.
A = rand(50, 10);
y = 10 .* sin.(pi .* A[:, 1] .* A[:, 2]) .+ 20 .* (A[:, 3] .- 0.5) .^ 2 .+
    10 .* A[:, 4] .+ 5 .* A[:, 5];
X = MLJ.table(A);

# fit an RFE model:
rf = RandomForestRegressor()
selector = RecursiveFeatureElimination(rf, n_features=2)
mach = machine(selector, X, y)
fit!(mach)

# view the feature importances
feature_importances(mach)

# predict using the base model trained on the reduced feature set:
Xnew = MLJ.table(rand(50, 10));
predict(mach, Xnew)

# transform data with all features to the reduced feature set:
transform(mach, Xnew)