LGBMClassifier
A model type for constructing a LightGBM classifier, based on LightGBM.jl, and implementing the MLJ model interface.
From MLJ, the type can be imported using
LGBMClassifier = @load LGBMClassifier pkg=LightGBM
Do model = LGBMClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LGBMClassifier(boosting=...).
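For example, to override a few defaults at construction (illustrative values only; the full list of supported keyword arguments appears under "Hyper-parameters" below):

model = LGBMClassifier(num_iterations=200, learning_rate=0.05, num_leaves=64)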
LightGBM, short for light gradient-boosting machine, is a framework for gradient boosting based on decision-tree algorithms, used for classification and other machine-learning tasks, with a focus on performance and scalability. This model in particular is used for various types of classification tasks.
Training data
In MLJ or MLJBase, bind an instance model to data with
mach = machine(model, X, y)
Here:
- X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X). Alternatively, X is any AbstractMatrix with Continuous elements; check the scitype with scitype(X).
- y is a vector of targets whose items are of scitype Finite (eg, Multiclass or OrderedFactor); as this is a classifier, the target is not Continuous. Check the scitype with scitype(y).
Train the machine using fit!(mach, rows=...).
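If the raw data does not already have these scitypes, it can be coerced first. A minimal sketch, assuming a table X with integer-valued columns and a plain vector of string labels y:

using MLJ
X = coerce(X, Count => Continuous)   ## re-type integer columns as Continuous
y = coerce(y, Multiclass)            ## make the labels a categorical vector
mach = machine(model, X, y)
fit!(mach, rows=1:100)               ## train on the first 100 rows only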
Operations
predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.
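LGBMClassifier is a probabilistic classifier in MLJ, so predict returns a distribution over the classes for each observation; use predict_mode for point predictions. A minimal sketch:

yhat = predict(mach, Xnew)      ## a vector of UnivariateFinite distributions
predict_mode(mach, Xnew)        ## the most probable class for each observation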
Hyper-parameters
See https://lightgbm.readthedocs.io/en/v3.3.5/Parameters.html.
Currently, the following parameters and their defaults are supported:
boosting::String = "gbdt"
,num_iterations::Int = 100::(_ >= 0)
,learning_rate::Float64 = 0.1::(_ > 0.)
,num_leaves::Int = 31::(1 < _ <= 131072)
,max_depth::Int = -1
,tree_learner::String = "serial"
,histogram_pool_size::Float64 = -1.0
,min_data_in_leaf::Int = 20::(_ >= 0)
,min_sum_hessian_in_leaf::Float64 = 1e-3::(_ >= 0.0)
,max_delta_step::Float64 = 0.0
,lambda_l1::Float64 = 0.0::(_ >= 0.0)
,lambda_l2::Float64 = 0.0::(_ >= 0.0)
,min_gain_to_split::Float64 = 0.0::(_ >= 0.0)
,feature_fraction::Float64 = 1.0::(0.0 < _ <= 1.0)
,feature_fraction_bynode::Float64 = 1.0::(0.0 < _ <= 1.0)
,feature_fraction_seed::Int = 2
,bagging_fraction::Float64 = 1.0::(0.0 < _ <= 1.0)
,bagging_freq::Int = 0::(_ >= 0)
,bagging_seed::Int = 3
,early_stopping_round::Int = 0
,extra_trees::Bool = false
,extra_seed::Int = 6
,max_bin::Int = 255::(_ > 1)
,bin_construct_sample_cnt = 200000::(_ > 0)
,drop_rate::Float64 = 0.1::(0.0 <= _ <= 1.0)
,max_drop::Int = 50
,skip_drop:: Float64 = 0.5::(0.0 <= _ <= 1)
,xgboost_dart_mode::Bool = false
,uniform_drop::Bool = false
,drop_seed::Int = 4
,top_rate::Float64 = 0.2::(0.0 <= _ <= 1.0)
,other_rate::Float64 = 0.1::(0.0 <= _ <= 1.0)
,min_data_per_group::Int = 100::(_ > 0)
,max_cat_threshold::Int = 32::(_ > 0)
,cat_l2::Float64 = 10.0::(_ >= 0)
,cat_smooth::Float64 = 10.0::(_ >= 0)
,objective::String = "multiclass"
,categorical_feature::Vector{Int} = Vector{Int}()
,data_random_seed::Int = 1
,is_sparse::Bool = true
,is_unbalance::Bool = false
,boost_from_average::Bool = true
,use_missing::Bool = true
,linear_tree::Bool = false
,feature_pre_filter::Bool = true
,metric::Vector{String} = ["none"]
,metric_freq::Int = 1::(_ > 0)
,is_provide_training_metric::Bool = false
,eval_at::Vector{Int} = Vector{Int}([1, 2, 3, 4, 5])::(all(_ .> 0))
,num_machines::Int = 1::(_ > 0)
,num_threads::Int = 0::(_ >= 0)
,local_listen_port::Int = 12400::(_ > 0)
,time_out::Int = 120::(_ > 0)
,machine_list_file::String = ""
,save_binary::Bool = false
,device_type::String = "cpu"
,gpu_use_dp::Bool = false
,gpu_platform_id::Int = -1
,gpu_device_id::Int = -1
,num_gpu::Int = 1
,force_col_wise::Bool = false
,force_row_wise::Bool = false
,truncate_booster::Bool = true
.
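These hyper-parameters can be searched over with MLJ's tuning machinery like those of any other model. A minimal sketch tuning num_leaves with a grid search (the range bounds, resampling strategy, and measure are illustrative choices, not recommendations):

using MLJ
model = LGBMClassifier()
r = range(model, :num_leaves, lower=16, upper=128)
tuned_model = TunedModel(model=model, ranges=r, tuning=Grid(resolution=4),
                         resampling=CV(nfolds=3), measure=log_loss)
mach = machine(tuned_model, X, y) |> fit!
fitted_params(mach).best_model   ## the best num_leaves found by the search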
Fitted parameters
The fields of fitted_params(mach) are:
- fitresult: Fitted model information; contains an LGBMClassification object, a CategoricalArray of the input class names, and the classifier with all its parameters.
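For example, after training the machine built in the Examples section below:

fp = fitted_params(mach)
fp.fitresult   ## the LGBMClassification object, class names, and classifier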
Report
The fields of report(mach) are:
- training_metrics: A dictionary containing all training metrics.
- importance: A namedtuple containing:
  - gain: The total gain of each split used by the model.
  - split: The number of times each feature is used by the model.
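Note that training metrics are only recorded when metric reporting is enabled. A hedged sketch, assuming LightGBM's multi_logloss metric name for the default multiclass objective:

model = LGBMClassifier(metric=["multi_logloss"], is_provide_training_metric=true)
mach = machine(model, X, y) |> fit!
report(mach).training_metrics   ## dictionary of the recorded training metrics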
Examples
using DataFrames
using MLJ
## load the model
LGBMClassifier = @load LGBMClassifier pkg=LightGBM
X, y = @load_iris
X = DataFrame(X)
train, test = partition(collect(eachindex(y)), 0.70, shuffle=true)
first(X, 3)
lgb = LGBMClassifier() ## initialise a model with default params
mach = machine(lgb, X[train, :], y[train]) |> fit!
predict(mach, X[test, :])
## access feature importances
model_report = report(mach)
gain_importance = model_report.importance.gain
split_importance = model_report.importance.split
See also LightGBM.jl and the unwrapped model type LightGBM.LGBMClassification.