ConstantClassifier
This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on the frequencies of the classes observed in the training target data. So, pdf(d, level) is the proportion of times the training target takes on the value level. Use predict_mode instead of predict to obtain the training target mode instead. For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.
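As a sketch of the idea (not the model's internal implementation), such a distribution can be built directly with CategoricalDistributions.jl; the labels and frequencies below are made up for illustration:

```julia
using CategoricalDistributions, CategoricalArrays

## Hypothetical training target with classes "a" (2/3) and "b" (1/3):
y = categorical(["a", "a", "b"])

## A UnivariateFinite distribution assigning each class its observed frequency:
d = UnivariateFinite(levels(y), [2/3, 1/3], pool=y)

pdf(d, "a")  ## the proportion of "a" in the training target, here 2/3
pdf(d, "b")  ## here 1/3
```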
Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and for establishing performance baselines.
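For instance, a sketch of using ConstantClassifier as a baseline under cross-validation (the dataset and measure here are chosen for illustration only):

```julia
using MLJ

X, y = @load_iris

## Baseline model: ignores X and predicts class frequencies
## observed in the training folds:
baseline = ConstantClassifier()

## Estimate baseline performance; any probabilistic measure will do:
evaluate(baseline, X, y, resampling=CV(nfolds=6), measure=log_loss)
```

A candidate model's evaluated log_loss can then be compared against this baseline figure.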
In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.
Training data
In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

- X is any table of input features (eg, a DataFrame)
- y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).
Hyper-parameters
None.
Operations
- predict(mach, Xnew): Return probabilistic predictions of the target given features Xnew (which for this model are ignored).
- predict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above.
Fitted parameters
The fields of fitted_params(mach) are:

- target_distribution: The distribution fitted to the supplied target data.
Examples
using MLJ
clf = ConstantClassifier()
X, y = @load_crabs ## a table and a categorical vector
mach = machine(clf, X, y) |> fit!
fitted_params(mach)
Xnew = (;FL = [8.1, 24.8, 7.2],
RW = [5.1, 25.7, 6.4],
CL = [15.9, 46.7, 14.3],
CW = [18.7, 59.7, 12.2],
BD = [6.2, 23.6, 8.4],)
## probabilistic predictions:
yhat = predict(mach, Xnew)
yhat[1]
## raw probabilities:
pdf.(yhat, "B")
## probability matrix:
L = levels(y)
pdf(yhat, L)
## point predictions:
predict_mode(mach, Xnew)
See also ConstantRegressor.