ConstantClassifier
This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on the frequency of classes observed in the training target data. So, pdf(d, level) is the proportion of times the training target takes on the value level. Use predict_mode instead of predict to obtain the training target mode. For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.
Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and establishing performance baselines.
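As a minimal sketch of the frequency-based behaviour (the tiny target below is illustrative, not part of the package's own examples):

using MLJ

## train on a four-element target with classes "a" (three times) and "b" (once)
y = coerce(["a", "a", "b", "a"], Multiclass)
X = (x = rand(4),)  ## features are ignored by this model
mach = machine(ConstantClassifier(), X, y) |> fit!

d = predict(mach, (x = rand(1),))[1]  ## the same distribution for any input
pdf(d, "a")  ## 0.75, the proportion of "a" observations
pdf(d, "b")  ## 0.25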
In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.
Training data
In MLJ or MLJBase, bind an instance model to data with
mach = machine(model, X, y)
Here:
- X is any table of input features (eg, a DataFrame)
- y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)
Train the machine using fit!(mach, rows=...).
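For example, here is a minimal sketch using the crabs dataset (the same data used in the Examples below):

using MLJ

X, y = @load_crabs
scitype(y)              ## AbstractVector{Multiclass{2}}, a subtype of AbstractVector{<:Finite}
mach = machine(ConstantClassifier(), X, y)
fit!(mach, rows=1:150)  ## train on the first 150 rows only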
Hyper-parameters
None.
Operations
- predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
- predict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above; see the sketch just after this list.
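The two operations are related as follows (a sketch, assuming a trained machine mach and a table Xnew):

## predict_mode is equivalent to taking the mode of each predicted distribution:
predict_mode(mach, Xnew) == mode.(predict(mach, Xnew))  ## true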
Fitted parameters
The fields of fitted_params(mach) are:
target_distribution: The distribution fit to the supplied target data.
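For instance, assuming the trained machine mach from above, the fitted distribution can be inspected directly:

fp = fitted_params(mach)
fp.target_distribution  ## the UnivariateFinite distribution returned by every predict call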
Examples
using MLJ
clf = ConstantClassifier()
X, y = @load_crabs ## a table and a categorical vector
mach = machine(clf, X, y) |> fit!
fitted_params(mach)
Xnew = (;FL = [8.1, 24.8, 7.2],
RW = [5.1, 25.7, 6.4],
CL = [15.9, 46.7, 14.3],
CW = [18.7, 59.7, 12.2],
BD = [6.2, 23.6, 8.4],)
## probabilistic predictions:
yhat = predict(mach, Xnew)
yhat[1]
## raw probabilities:
pdf.(yhat, "B")
## probability matrix:
L = levels(y)
pdf(yhat, L)
## point predictions:
predict_mode(mach, Xnew)
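Since the model serves mainly as a baseline, one might also score its probabilistic predictions for later comparison with a real classifier. A minimal sketch:

## cross-validated log loss of the baseline:
evaluate(clf, X, y, resampling=CV(nfolds=5, shuffle=true), measure=log_loss)

See also ConstantRegressor.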