# ProbabilisticSVC

A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

```julia
ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM
```

Do `model = ProbabilisticSVC()` to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in `ProbabilisticSVC(kernel=...)`.
This model is identical to `SVC` with the exception that it predicts probabilities instead of actual class labels. Probabilities are computed using Platt scaling, which adds to the total computation time.
Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.
# Training data

In MLJ or MLJBase, bind an instance `model` to data with one of:

```julia
mach = machine(model, X, y)
mach = machine(model, X, y, w)
```

where

- `X`: any table of input features (eg, a `DataFrame`) whose columns each have `Continuous` element scitype; check column scitypes with `schema(X)`

- `y`: the target, which can be any `AbstractVector` whose element scitype is `<:OrderedFactor` or `<:Multiclass`; check the scitype with `scitype(y)`

- `w`: a dictionary of class weights, keyed on `levels(y)`

Train the machine using `fit!(mach, rows=...)`.
# Hyper-parameters

- `kernel=LIBSVM.Kernel.RadialBasis`: either an object that can be called, as in `kernel(x1, x2)`, or one of the built-in kernels from the LIBSVM.jl package listed below. Here `x1` and `x2` are vectors whose lengths match the number of columns of the training data `X` (see "Examples" below).

  - `LIBSVM.Kernel.Linear`: `(x1, x2) -> x1'*x2`

  - `LIBSVM.Kernel.Polynomial`: `(x1, x2) -> (gamma*x1'*x2 + coef0)^degree`

  - `LIBSVM.Kernel.RadialBasis`: `(x1, x2) -> exp(-gamma*norm(x1 - x2)^2)`

  - `LIBSVM.Kernel.Sigmoid`: `(x1, x2) -> tanh(gamma*x1'*x2 + coef0)`

  Here `gamma`, `coef0` and `degree` are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

- `gamma = 0.0`: kernel parameter (see above); if `gamma==-1.0` then `gamma = 1/nfeatures` is used in training, where `nfeatures` is the number of features (columns of `X`). If `gamma==0.0` then `gamma = 1/(var(Tables.matrix(X))*nfeatures)` is used. The actual value used appears in the report (see below).

- `coef0 = 0.0`: kernel parameter (see above)

- `degree::Int32 = Int32(3)`: degree in polynomial kernel (see above)

- `cost=1.0` (range (0, `Inf`)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease `cost`

- `cachesize=200.0`: cache memory size in MB

- `tolerance=0.001`: tolerance for the stopping criterion

- `shrinking=true`: whether to use shrinking heuristics
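The defaulting rule for `gamma` described above can be sketched in plain Julia. The helper below is hypothetical (not part of the package's API) and simply restates the documented rule:

```julia
using Statistics  ## for `var`
import Tables

## Hypothetical helper reproducing the documented `gamma` defaulting rule:
function effective_gamma(gamma, X)
    Xmat = Tables.matrix(X)
    nfeatures = size(Xmat, 2)
    gamma == -1.0 && return 1/nfeatures
    gamma == 0.0 && return 1/(var(Xmat)*nfeatures)  ## variance over all entries
    return gamma
end
```

In practice there is no need to compute this yourself: the value actually used is exposed in the report (see below).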
# Operations

- `predict(mach, Xnew)`: return probabilistic predictions of the target given features `Xnew` having the same scitype as `X` above.
# Fitted parameters

The fields of `fitted_params(mach)` are:

- `libsvm_model`: the trained model object created by the LIBSVM.jl package

- `encoding`: class encoding used internally by `libsvm_model` - a dictionary of class labels keyed on the internal integer representation
# Report

The fields of `report(mach)` are:

- `gamma`: actual value of the kernel parameter `gamma` used in training
# Examples

## Using a built-in kernel

```julia
using MLJ
import LIBSVM

ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM ## model type
model = ProbabilisticSVC(kernel=LIBSVM.Kernel.Polynomial) ## instance

X, y = @load_iris ## table, vector
mach = machine(model, X, y) |> fit!

Xnew = (sepal_length = [6.4, 7.2, 7.4],
        sepal_width = [2.8, 3.0, 2.8],
        petal_length = [5.6, 5.8, 6.1],
        petal_width = [2.1, 1.6, 1.9],)
```

```julia-repl
julia> probs = predict(mach, Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
 UnivariateFinite{Multiclass{3}}(setosa=>0.00186, versicolor=>0.003, virginica=>0.995)
 UnivariateFinite{Multiclass{3}}(setosa=>0.000563, versicolor=>0.0554, virginica=>0.944)
 UnivariateFinite{Multiclass{3}}(setosa=>1.4e-6, versicolor=>1.68e-6, virginica=>1.0)

julia> labels = mode.(probs)
3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
 "virginica"
 "virginica"
 "virginica"
```
## User-defined kernels

```julia
k(x1, x2) = x1'*x2 ## equivalent to `LIBSVM.Kernel.Linear`
model = ProbabilisticSVC(kernel=k)
mach = machine(model, X, y) |> fit!
probs = predict(mach, Xnew)
```
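A non-linear user-defined kernel works the same way, provided it accepts two feature vectors and returns a scalar. For example, a hand-rolled radial basis kernel with a fixed bandwidth (the value `0.1` here is an arbitrary choice for illustration):

```julia
using LinearAlgebra ## for `norm`

rbf(x1, x2) = exp(-0.1*norm(x1 - x2)^2)
model = ProbabilisticSVC(kernel=rbf)
mach = machine(model, X, y) |> fit!
probs = predict(mach, Xnew)
```

Note the serialization restrictions for user-defined kernels mentioned under "Hyper-parameters" above.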
## Incorporating class weights

In either scenario above, we can do:

```julia
weights = Dict("virginica" => 1, "versicolor" => 20, "setosa" => 1)
mach = machine(model, X, y, weights) |> fit!
probs = predict(mach, Xnew)
```
See also the classifiers `SVC`, `NuSVC` and `LinearSVC`, and LIBSVM.jl and the original C implementation documentation.