OneRuleClassifier

A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneRuleClassifier = @load OneRuleClassifier pkg=OneRule

Do model = OneRuleClassifier() to construct an instance with default hyper-parameters.

OneRuleClassifier implements the OneRule method for classification by Robert Holte ("Very simple classification rules perform well on most commonly used datasets" in: Machine Learning 11.1 (1993), pp. 63-90).
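The idea behind the method can be sketched in a few lines of plain Julia. For each feature, map every feature value to the majority class among the rows that have that value, count the training errors of the resulting rule, and keep the feature with the fewest errors. This is only an illustrative sketch: the helper names `majority` and `one_rule` and the tie-breaking choices are assumptions made here and are not the OneRule.jl API or its implementation.

```julia
# Illustrative sketch of the one-rule idea; hypothetical helpers, not the OneRule.jl API.

# Most frequent element of `v`; ties go to the value seen first (an assumption).
function majority(v)
    counts = Dict{eltype(v),Int}()
    for x in v
        counts[x] = get(counts, x, 0) + 1
    end
    best, best_n = first(v), 0
    for x in v                      # scan in order so tie-breaking is deterministic
        if counts[x] > best_n
            best, best_n = x, counts[x]
        end
    end
    return best
end

# Build one rule per feature and keep the feature with the fewest training errors.
# `features` is a NamedTuple of equal-length vectors, `target` a vector of labels.
function one_rule(features::NamedTuple, target::AbstractVector)
    best = nothing
    for name in keys(features)
        column = features[name]
        # Map each feature value to the majority class among rows with that value.
        rule = Dict(v => majority(target[column .== v]) for v in unique(column))
        errors = count(i -> rule[column[i]] != target[i], eachindex(target))
        # Strict `<` keeps the first feature on ties (an assumption of this sketch).
        if best === nothing || errors < best.errors
            best = (feature = name, rule = rule, errors = errors)
        end
    end
    return best
end

outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
           "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"]
humidity = ["high", "high", "high", "high", "normal", "normal", "normal",
            "high", "normal", "normal", "normal", "high", "normal", "high"]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]

result = one_rule((outlook = outlook, humidity = humidity), play)
println(result.feature, ": ", result.errors, " training errors")
```

On this data both features misclassify 4 of the 14 rows, so the sketch keeps `outlook` only because it comes first. The actual package may break such ties differently.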

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (e.g., a DataFrame) whose columns each have one of the following element scitypes: Multiclass, OrderedFactor, or <:Finite; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).
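Raw string columns have element scitype Textual and do not meet the requirements above until they are coerced. A minimal sketch of the check, assuming MLJ is installed and using a small made-up table (the data and the names `Xc` and `yc` are illustrative only):

```julia
using MLJ

# Small illustrative table and target (made-up data).
X = (outlook = ["sunny", "rainy", "overcast"], windy = ["false", "true", "false"])
y = ["no", "yes", "yes"]

schema(X)                               # string columns show up as Textual

# Coerce every Textual column, and the target, to Multiclass.
Xc = coerce(X, Textual => Multiclass)
yc = coerce(y, Multiclass)

schema(Xc)                              # inspect the column scitypes after coercion
scitype(yc)                             # inspect the scitype of the target

# Programmatic version of the same checks.
inputs_ok = all(T -> T <: Finite, schema(Xc).scitypes)
target_ok = elscitype(yc) <: Finite
```

If either check is false, coerce the offending columns before calling machine(model, X, y).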

Hyper-parameters

This classifier has no hyper-parameters.

Operations

  • predict(mach, Xnew): return (deterministic) predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: the tree (a OneTree) returned by the core OneRule.jl algorithm
  • all_classes: all classes (i.e. levels) of the target (used also internally to transfer levels-information to predict)

Report

The fields of report(mach) are:

  • tree: the OneTree created from the training data
  • nrules: the number of rules the tree contains
  • error_rate: fraction of wrongly classified instances
  • error_count: number of wrongly classified instances
  • classes_seen: list of target classes actually observed in training
  • features: the names of the features encountered in training

Examples

using MLJ

ORClassifier = @load OneRuleClassifier pkg=OneRule

orc = ORClassifier()

outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast", "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"]
temperature = ["hot", "hot", "hot", "mild", "cool", "cool", "cool", "mild", "cool", "mild", "mild", "mild", "hot", "mild"]
humidity = ["high", "high", "high", "high", "normal", "normal", "normal", "high", "normal", "normal", "normal", "high", "normal", "high"]
windy = ["false", "true", "false", "false", "false", "true", "true", "false", "false", "false", "true", "true", "false", "true"]

weather_data = (outlook = outlook, temperature = temperature, humidity = humidity, windy = windy)
play_data = ["no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no"]

weather = coerce(weather_data, Textual => Multiclass)
play = coerce(play_data, Multiclass)

mach = machine(orc, weather, play)
fit!(mach)

yhat = MLJ.predict(mach, weather)       ## in a real context 'new' `weather` data would be used
one_tree = fitted_params(mach).tree
report(mach).error_rate

See also OneRule.jl.