LinearSVC

A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearSVC = @load LinearSVC pkg=LIBSVM

Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).

Reference for the algorithm and core C library: Rong-En Fan et al. (2008): "LIBLINEAR: A Library for Large Linear Classification." Journal of Machine Learning Research 9, 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.

This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.Linear, but is optimized for the linear case.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
mach = machine(model, X, y, w)

where

  • X: any table of input features (e.g., a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).
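
The scitype requirements above can be checked before binding data to a machine. The following is a minimal sketch, assuming the iris data set shipped with MLJ; the table Xraw and its column names are hypothetical and serve only to illustrate coercing an integer column to Continuous.

```julia
using MLJ

X, y = @load_iris        # table of features, categorical target vector

schema(X)                # lists the scitype of each column of X
scitype(y)               # a subtype of AbstractVector{<:Multiclass} for this data

# Hypothetical table with an integer column, which has scitype Count and
# would not meet the Continuous requirement:
Xraw = (a = [1, 2, 3], b = [0.5, 1.5, 2.5])

# Coerce the offending column before constructing a machine:
Xfixed = coerce(Xraw, :a => Continuous)
schema(Xfixed)
```

If y is a plain vector of strings, coerce(y, Multiclass) performs the analogous conversion for the target.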

Hyper-parameters

  • solver=LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: linear solver, which must be one of the following from the LIBSVM.jl package:

    • LIBSVM.Linearsolver.L2R_LR: L2-regularized logistic regression (primal)
    • LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: L2-regularized L2-loss support vector classification (dual)
    • LIBSVM.Linearsolver.L2R_L2LOSS_SVC: L2-regularized L2-loss support vector classification (primal)
    • LIBSVM.Linearsolver.L2R_L1LOSS_SVC_DUAL: L2-regularized L1-loss support vector classification (dual)
    • LIBSVM.Linearsolver.MCSVM_CS: support vector classification by Crammer and Singer
    • LIBSVM.Linearsolver.L1R_L2LOSS_SVC: L1-regularized L2-loss support vector classification
    • LIBSVM.Linearsolver.L1R_LR: L1-regularized logistic regression
    • LIBSVM.Linearsolver.L2R_LR_DUAL: L2-regularized logistic regression (dual)
  • tolerance::Float64=Inf: tolerance for the stopping criterion

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • bias=-1.0: if bias >= 0, each instance x is augmented to [x; bias]; if bias < 0, no bias term is added
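
As an illustration of overriding the defaults listed above, here is a minimal sketch. The particular values chosen for cost and bias are arbitrary assumptions made for the example, not recommendations.

```julia
using MLJ
import LIBSVM

LinearSVC = @load LinearSVC pkg=LIBSVM verbosity=0

# Primal L2-loss solver, stronger regularization than the default
# (smaller cost), and an appended bias feature:
model = LinearSVC(
    solver = LIBSVM.Linearsolver.L2R_L2LOSS_SVC,
    cost = 0.1,
    bias = 1.0,
)

# Hyper-parameters can also be changed on an existing instance:
model.cost = 0.05
```

A machine already bound to model picks up such a change the next time fit! is called on it.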

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: the class encoding used internally by libsvm_model, given as a dictionary of class labels keyed on the internal integer representation
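
The two fields can be inspected after training. The following is a minimal sketch, assuming the iris data and default hyper-parameters; it only accesses the fields named above and makes no assumption about the internals of the LIBSVM.jl model object.

```julia
using MLJ
import LIBSVM

LinearSVC = @load LinearSVC pkg=LIBSVM verbosity=0

X, y = @load_iris
mach = machine(LinearSVC(), X, y)
fit!(mach, verbosity=0)

fp = fitted_params(mach)
fp.libsvm_model   # trained model object created by LIBSVM.jl
fp.encoding       # internal integer code => class label
```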

Examples

using MLJ
import LIBSVM

LinearSVC = @load LinearSVC pkg=LIBSVM               ## model type
model = LinearSVC(solver=LIBSVM.Linearsolver.L2R_LR) ## instance

X, y = @load_iris ## table, vector
mach = machine(model, X, y) |> fit!

Xnew = (sepal_length = [6.4, 7.2, 7.4],
        sepal_width = [2.8, 3.0, 2.8],
        petal_length = [5.6, 5.8, 6.1],
        petal_width = [2.1, 1.6, 1.9],)

julia> yhat = predict(mach, Xnew)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
 "virginica"
 "versicolor"
 "virginica"

Incorporating class weights

weights = Dict("virginica" => 1, "versicolor" => 20, "setosa" => 1)
mach = machine(model, X, y, weights) |> fit!

julia> yhat = predict(mach, Xnew)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
 "versicolor"
 "versicolor"
 "versicolor"

See also the SVC and NuSVC classifiers, LIBSVM.jl, and the documentation for the original C implementation.