SVC
A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.
From MLJ, the type can be imported using
SVC = @load SVC pkg=LIBSVM
Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).
This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.
Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.
Training data
In MLJ or MLJBase, bind an instance model to data with one of:
mach = machine(model, X, y)
mach = machine(model, X, y, w)
where
- X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
- y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
- w: a dictionary of class weights, keyed on levels(y).
Train the machine using fit!(mach, rows=...).
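For illustration only, here is a minimal sketch of binding and training on a row subset (the iris data and the arbitrary row range 1:100 are our choices, not part of the original documentation):

using MLJ
import LIBSVM

SVC = @load SVC pkg=LIBSVM
X, y = @load_iris
mach = machine(SVC(), X, y)
fit!(mach, rows=1:100)  ## train on the first 100 rows only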
Hyper-parameters
- kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).
  - LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
  - LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
  - LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
  - LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

  Here gamma, coef0 and degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.
- gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. The actual value used appears in the report (see below); a sketch reproducing this default appears after this list.
- coef0 = 0.0: kernel parameter (see above)
- degree::Int32 = Int32(3): degree in polynomial kernel (see above)
- cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost
- cachesize=200.0: cache memory size in MB
- tolerance=0.001: tolerance for the stopping criterion
- shrinking=true: whether to use shrinking heuristics
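As a sketch of the default gamma rule quoted above (the names Xmat, nfeatures and default_gamma are ours, introduced for illustration), the value used when gamma==0.0 can be reproduced directly on the iris data:

using MLJ, Statistics
import Tables

X, y = @load_iris
Xmat = Tables.matrix(X)                  ## features as a plain matrix
nfeatures = size(Xmat, 2)                ## number of columns of X
default_gamma = 1/(var(Xmat)*nfeatures)  ## value used in training when gamma == 0.0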
Operations
- predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.
Fitted parameters
The fields of fitted_params(mach) are:
- libsvm_model: the trained model object created by the LIBSVM.jl package
- encoding: class encoding used internally by libsvm_model, given as a dictionary of class labels keyed on the internal integer representation
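As a quick illustration (assuming a machine mach already trained, as in the training sketch above or the Examples below):

fp = fitted_params(mach)
fp.libsvm_model   ## the trained LIBSVM.jl model object
fp.encoding       ## dictionary of class labels keyed on internal integer codes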
Report
The fields of report(mach) are:
- gamma: actual value of the kernel parameter gamma used in training
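For example, continuing with a trained machine mach, the kernel parameter actually used (relevant when gamma=0.0 or gamma=-1.0 triggers automatic selection) can be retrieved with:

report(mach).gamma  ## gamma value actually used in training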
Examples
Using a built-in kernel
using MLJ
import LIBSVM
SVC = @load SVC pkg=LIBSVM                   ## model type
model = SVC(kernel=LIBSVM.Kernel.Polynomial) ## instance
X, y = @load_iris ## table, vector
mach = machine(model, X, y) |> fit!
Xnew = (sepal_length = [6.4, 7.2, 7.4],
        sepal_width = [2.8, 3.0, 2.8],
        petal_length = [5.6, 5.8, 6.1],
        petal_width = [2.1, 1.6, 1.9],)
julia> yhat = predict(mach, Xnew)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
 "virginica"
 "virginica"
 "virginica"User-defined kernels
k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`
model = SVC(kernel=k)
mach = machine(model, X, y) |> fit!
julia> yhat = predict(mach, Xnew)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
 "virginica"
 "virginica"
 "virginica"Incorporating class weights
In either scenario above, we can do:
weights = Dict("virginica" => 1, "versicolor" => 20, "setosa" => 1)
mach = machine(model, X, y, weights) |> fit!
julia> yhat = predict(mach, Xnew)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
 "versicolor"
 "versicolor"
 "versicolor"See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC. And see LIVSVM.jl and the original C implementation documentation.