Breast Cancer Wisconsin (Diagnostic)
To ensure code in this tutorial runs as shown, download the tutorial project folder and follow these instructions. If you have questions or suggestions about this tutorial, please open an issue here.
This tutorial covers programmatic model selection on the popular "Breast Cancer Wisconsin (Diagnostic) Data Set" from the UCI archives. The tutorial also covers basic data preprocessing and usage of MLJ Scientific Types.
using UrlDownload
using DataFrames
using MLJ
using StatsBase
using StableRNGs # for an RNG stable across julia versions
Using the UrlDownload.jl package, we can fetch the data from the URL below with the following commands.
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data";
feature_names = ["ID", "Class", "mean radius", "mean texture", "mean perimeter", "mean area", "mean smoothness", "mean compactness", "mean concavity", "mean concave points", "mean symmetry", "mean fractal dimension", "radius error", "texture error", "perimeter error", "area error", "smoothness error", "compactness error", "concavity error", "concave points error", "symmetry error", "fractal dimension error", "worst radius", "worst texture", "worst perimeter", "worst area", "worst smoothness", "worst compactness", "worst concavity", "worst concave points", "worst symmetry", "worst fractal dimension"]
data = urldownload(url, true, format = :CSV, header = feature_names);
using Plots
Plots.bar(countmap(data.Class), legend=false,)
xlabel!("Classes")
ylabel!("Number of samples")
df = DataFrame(data)[:, 2:end];
Printing the first 10 rows to get a visual idea of the kind of data we're dealing with:
first(df, 10)
10×31 DataFrame
 Row │ Class    mean radius  mean texture  mean perimeter  mean area  mean smoothness  mean compactness  mean concavity  mean concave points  mean symmetry  mean fractal dimension  radius error  texture error  perimeter error  area error  smoothness error  compactness error  concavity error  concave points error  symmetry error  fractal dimension error  worst radius  worst texture  worst perimeter  worst area  worst smoothness  worst compactness  worst concavity  worst concave points  worst symmetry  worst fractal dimension
     │ String1  Float64      Float64       Float64         Float64    Float64          Float64           Float64         Float64              Float64        Float64                 Float64       Float64        Float64          Float64     Float64           Float64            Float64          Float64               Float64         Float64                  Float64       Float64        Float64          Float64     Float64           Float64            Float64          Float64               Float64         Float64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ M              17.99         10.38          122.8      1001.0          0.1184            0.2776          0.3001               0.1471          0.2419                 0.07871        1.095          0.9053            8.589      153.4           0.006399            0.04904          0.05373               0.01587         0.03003                 0.006193         25.38          17.33           184.6       2019.0            0.1622             0.6656           0.7119                0.2654          0.4601                  0.1189
   2 │ M              20.57         17.77          132.9      1326.0          0.08474           0.07864         0.0869               0.07017         0.1812                 0.05667        0.5435         0.7339            3.398       74.08          0.005225            0.01308          0.0186                0.0134          0.01389                 0.003532         24.99          23.41           158.8       1956.0            0.1238             0.1866           0.2416                0.186           0.275                   0.08902
   3 │ M              19.69         21.25          130.0      1203.0          0.1096            0.1599          0.1974               0.1279          0.2069                 0.05999        0.7456         0.7869            4.585       94.03          0.00615             0.04006          0.03832               0.02058         0.0225                  0.004571         23.57          25.53           152.5       1709.0            0.1444             0.4245           0.4504                0.243           0.3613                  0.08758
   4 │ M              11.42         20.38           77.58      386.1          0.1425            0.2839          0.2414               0.1052          0.2597                 0.09744        0.4956         1.156             3.445       27.23          0.00911             0.07458          0.05661               0.01867         0.05963                 0.009208         14.91          26.5             98.87       567.7            0.2098             0.8663           0.6869                0.2575          0.6638                  0.173
   5 │ M              20.29         14.34          135.1      1297.0          0.1003            0.1328          0.198                0.1043          0.1809                 0.05883        0.7572         0.7813            5.438       94.44          0.01149             0.02461          0.05688               0.01885         0.01756                 0.005115         22.54          16.67           152.2       1575.0            0.1374             0.205            0.4                   0.1625          0.2364                  0.07678
   6 │ M              12.45         15.7            82.57      477.1          0.1278            0.17            0.1578               0.08089         0.2087                 0.07613        0.3345         0.8902            2.217       27.19          0.00751             0.03345          0.03672               0.01137         0.02165                 0.005082         15.47          23.75           103.4        741.6            0.1791             0.5249           0.5355                0.1741          0.3985                  0.1244
   7 │ M              18.25         19.98          119.6      1040.0          0.09463           0.109           0.1127               0.074           0.1794                 0.05742        0.4467         0.7732            3.18        53.91          0.004314            0.01382          0.02254               0.01039         0.01369                 0.002179         22.88          27.66           153.2       1606.0            0.1442             0.2576           0.3784                0.1932          0.3063                  0.08368
   8 │ M              13.71         20.83           90.2       577.9          0.1189            0.1645          0.09366              0.05985         0.2196                 0.07451        0.5835         1.377             3.856       50.96          0.008805            0.03029          0.02488               0.01448         0.01486                 0.005412         17.06          28.14           110.6        897.0            0.1654             0.3682           0.2678                0.1556          0.3196                  0.1151
   9 │ M              13.0          21.82           87.5       519.8          0.1273            0.1932          0.1859               0.09353         0.235                  0.07389        0.3063         1.002             2.406       24.32          0.005731            0.03502          0.03553               0.01226         0.02143                 0.003749         15.49          30.73           106.2        739.3            0.1703             0.5401           0.539                 0.206           0.4378                  0.1072
  10 │ M              12.46         24.04           83.97      475.9          0.1186            0.2396          0.2273               0.08543         0.203                  0.08243        0.2976         1.599             2.039       23.94          0.007149            0.07217          0.07743               0.01432         0.01789                 0.01008          15.09          40.68            97.65       711.4            0.1853             1.058            1.105                 0.221           0.4366                  0.2075
For checking the statistical attributes of each individual feature, we can use the describe() method:
describe(df)
31×7 DataFrame
 Row │ variable                 mean        min        median    max      nmissing  eltype
     │ Symbol                   Union…      Any        Union…    Any      Int64     DataType
─────┼───────────────────────────────────────────────────────────────────────────────────────
   1 │ Class                                B                    M               0  String1
   2 │ mean radius              14.1273     6.981      13.37     28.11           0  Float64
   3 │ mean texture             19.2896     9.71       18.84     39.28           0  Float64
   4 │ mean perimeter           91.969      43.79      86.24     188.5           0  Float64
   5 │ mean area                654.889     143.5      551.1     2501.0          0  Float64
   6 │ mean smoothness          0.0963603   0.05263    0.09587   0.1634          0  Float64
   7 │ mean compactness         0.104341    0.01938    0.09263   0.3454          0  Float64
   8 │ mean concavity           0.0887993   0.0        0.06154   0.4268          0  Float64
   9 │ mean concave points      0.0489191   0.0        0.0335    0.2012          0  Float64
  10 │ mean symmetry            0.181162    0.106      0.1792    0.304           0  Float64
  11 │ mean fractal dimension   0.0627976   0.04996    0.06154   0.09744         0  Float64
  12 │ radius error             0.405172    0.1115     0.3242    2.873           0  Float64
  13 │ texture error            1.21685     0.3602     1.108     4.885           0  Float64
  14 │ perimeter error          2.86606     0.757      2.287     21.98           0  Float64
  15 │ area error               40.3371     6.802      24.53     542.2           0  Float64
  16 │ smoothness error         0.00704098  0.001713   0.00638   0.03113         0  Float64
  17 │ compactness error        0.0254781   0.002252   0.02045   0.1354          0  Float64
  18 │ concavity error          0.0318937   0.0        0.02589   0.396           0  Float64
  19 │ concave points error     0.0117961   0.0        0.01093   0.05279         0  Float64
  20 │ symmetry error           0.0205423   0.007882   0.01873   0.07895         0  Float64
  21 │ fractal dimension error  0.0037949   0.0008948  0.003187  0.02984         0  Float64
  22 │ worst radius             16.2692     7.93       14.97     36.04           0  Float64
  23 │ worst texture            25.6772     12.02      25.41     49.54           0  Float64
  24 │ worst perimeter          107.261     50.41      97.66     251.2           0  Float64
  25 │ worst area               880.583     185.2      686.5     4254.0          0  Float64
  26 │ worst smoothness         0.132369    0.07117    0.1313    0.2226          0  Float64
  27 │ worst compactness        0.254265    0.02729    0.2119    1.058           0  Float64
  28 │ worst concavity          0.272188    0.0        0.2267    1.252           0  Float64
  29 │ worst concave points     0.114606    0.0        0.09993   0.291           0  Float64
  30 │ worst symmetry           0.290076    0.1565     0.2822    0.6638          0  Float64
  31 │ worst fractal dimension  0.0839458   0.05504    0.08004   0.2075          0  Float64
As we can see, the features vary widely in their ranges and quantiles. This can cause trouble for optimization techniques and may lead to convergence issues. We can handle this with a feature scaling transformer such as Standardizer().
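To make concrete what such scaling does, here is a minimal sketch (not part of the tutorial's pipeline) that z-scores a single column by hand; the column "mean area" is taken from the table above:
using Statistics
# Standardization subtracts the mean and divides by the standard deviation,
# so the scaled column has (approximately) zero mean and unit variance.
col = df[!, "mean area"]
scaled = (col .- mean(col)) ./ std(col)
(mean(scaled), std(scaled))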
But first, let's handle the scientific types of all the features. We can use the schema() method from the MLJ.jl package to do this:
schema(df)
┌─────────────────────────┬────────────┬─────────┐
│ names                   │ scitypes   │ types   │
├─────────────────────────┼────────────┼─────────┤
│ Class                   │ Textual    │ String1 │
│ mean radius             │ Continuous │ Float64 │
│ mean texture            │ Continuous │ Float64 │
│ mean perimeter          │ Continuous │ Float64 │
│ mean area               │ Continuous │ Float64 │
│ mean smoothness         │ Continuous │ Float64 │
│ mean compactness        │ Continuous │ Float64 │
│ mean concavity          │ Continuous │ Float64 │
│ mean concave points     │ Continuous │ Float64 │
│ mean symmetry           │ Continuous │ Float64 │
│ mean fractal dimension  │ Continuous │ Float64 │
│ radius error            │ Continuous │ Float64 │
│ texture error           │ Continuous │ Float64 │
│ perimeter error         │ Continuous │ Float64 │
│ area error              │ Continuous │ Float64 │
│ smoothness error        │ Continuous │ Float64 │
│ compactness error       │ Continuous │ Float64 │
│ concavity error         │ Continuous │ Float64 │
│ concave points error    │ Continuous │ Float64 │
│ symmetry error          │ Continuous │ Float64 │
│ fractal dimension error │ Continuous │ Float64 │
│ worst radius            │ Continuous │ Float64 │
│ worst texture           │ Continuous │ Float64 │
│ worst perimeter         │ Continuous │ Float64 │
│ worst area              │ Continuous │ Float64 │
│ worst smoothness        │ Continuous │ Float64 │
│ worst compactness       │ Continuous │ Float64 │
│ worst concavity         │ Continuous │ Float64 │
│ worst concave points    │ Continuous │ Float64 │
│ worst symmetry          │ Continuous │ Float64 │
│ worst fractal dimension │ Continuous │ Float64 │
└─────────────────────────┴────────────┴─────────┘
As Textual is a scitype reserved for text data "with sentiment", we need to coerce the scitype to the more appropriate OrderedFactor:
coerce!(df, :Class => OrderedFactor{2});
scitype(df.Class)
AbstractVector{OrderedFactor{2}} (alias for AbstractArray{ScientificTypesBase.OrderedFactor{2}, 1})
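Since Class is now an OrderedFactor, binary measures and roc_curve treat the second level as the "positive" class. As a quick aside (assuming levels, re-exported from CategoricalArrays, is in scope), we can confirm that "M" (malignant) is that second level:
levels(df.Class)   # ["B", "M"]: the last level, "M", is treated as positive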
Now that our data is fully processed, we can separate the target variable 'y' from the feature set 'X' using the unpack() method.
rng = StableRNG(123)
y, X = unpack(df, ==(:Class); rng);
We'll be using 80% of the data for training, and can perform a train-test split using the partition method:
train, test = partition(eachindex(y), 0.8; rng);
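Note that partition here shuffles and splits the row indices without stratifying on the class, so it is worth a quick check (an aside, using countmap from StatsBase loaded earlier) that both splits contain a reasonable mix of benign and malignant cases:
# Class counts in the training and test splits
(countmap(y[train]), countmap(y[test]))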
Now that our feature set is separated from the target variable, we can use the Standardizer() workflow to standardize our feature set X:
transformer_instance = Standardizer()
transformer_model = machine(transformer_instance, X[train,:])
fit!(transformer_model)
X = MLJ.transform(transformer_model, X);
With feature scaling complete, we are ready to compare the performance of various machine learning models for classification.
Now that we have separate training and testing sets, let's see the models compatible with our data!
models(matching(X, y))
55-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
 (name = AdaBoostClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
 (name = BaggingClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = BayesianLDA, package_name = MLJScikitLearnInterface, ... )
 (name = BayesianLDA, package_name = MultivariateStats, ... )
 (name = BayesianQDA, package_name = MLJScikitLearnInterface, ... )
 (name = BayesianSubspaceLDA, package_name = MultivariateStats, ... )
 (name = CatBoostClassifier, package_name = CatBoost, ... )
 (name = ConstantClassifier, package_name = MLJModels, ... )
 (name = DecisionTreeClassifier, package_name = BetaML, ... )
 (name = DecisionTreeClassifier, package_name = DecisionTree, ... )
 (name = DeterministicConstantClassifier, package_name = MLJModels, ... )
 (name = DummyClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = EvoTreeClassifier, package_name = EvoTrees, ... )
 (name = ExtraTreesClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = GaussianNBClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = GaussianNBClassifier, package_name = NaiveBayes, ... )
 (name = GaussianProcessClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = GradientBoostingClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = HistGradientBoostingClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = KNNClassifier, package_name = NearestNeighborModels, ... )
 (name = KNeighborsClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = KernelPerceptronClassifier, package_name = BetaML, ... )
 (name = LDA, package_name = MultivariateStats, ... )
 (name = LGBMClassifier, package_name = LightGBM, ... )
 (name = LinearBinaryClassifier, package_name = GLM, ... )
 (name = LinearSVC, package_name = LIBSVM, ... )
 (name = LogisticCVClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = LogisticClassifier, package_name = MLJLinearModels, ... )
 (name = LogisticClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = MultinomialClassifier, package_name = MLJLinearModels, ... )
 (name = NeuralNetworkClassifier, package_name = BetaML, ... )
 (name = NeuralNetworkClassifier, package_name = MLJFlux, ... )
 (name = NuSVC, package_name = LIBSVM, ... )
 (name = PassiveAggressiveClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = PegasosClassifier, package_name = BetaML, ... )
 (name = PerceptronClassifier, package_name = BetaML, ... )
 (name = PerceptronClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = ProbabilisticNuSVC, package_name = LIBSVM, ... )
 (name = ProbabilisticSGDClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = ProbabilisticSVC, package_name = LIBSVM, ... )
 (name = RandomForestClassifier, package_name = BetaML, ... )
 (name = RandomForestClassifier, package_name = DecisionTree, ... )
 (name = RandomForestClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = RidgeCVClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = RidgeClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = SGDClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = SVC, package_name = LIBSVM, ... )
 (name = SVMClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = SVMLinearClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = SVMNuClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = StableForestClassifier, package_name = SIRUS, ... )
 (name = StableRulesClassifier, package_name = SIRUS, ... )
 (name = SubspaceLDA, package_name = MultivariateStats, ... )
 (name = XGBoostClassifier, package_name = XGBoost, ... )
That's a lot of models for our data! To narrow it down, we'll analyze the performance of probabilistic predictors with pure Julia implementations. We'll record the following for each model:
model_names: names of the models being evaluated
accuracies: accuracy on the test set
log_losses: log loss (cross entropy) on the test set
f1_scores: F1 score on the test set
model_names=Vector{String}();
accuracies=[];
log_losses=[];
f1_scores=[];
models_to_evaluate = models(matching(X, y)) do m
    m.prediction_type==:probabilistic && m.is_pure_julia &&
        m.package_name != "SIRUS"
end
p = plot(legendfontsize=7, title="ROC Curve")
plot!([0, 1], [0, 1], linewidth=2, linestyle=:dash, color=:black)
for m in models_to_evaluate
    model = m.name
    pkg = m.package_name
    model_name = "$model ($pkg)"
    @info "Evaluating $model_name. "
    # Load the model type programmatically and bind an instance to a machine
    eval(:(clf = @load $model pkg=$pkg verbosity=0))
    clf_machine = machine(clf(), X, y)
    fit!(clf_machine, rows=train, verbosity=0)
    # Probabilistic predictions on the held-out rows
    y_pred = MLJ.predict(clf_machine, rows=test);
    # Add this model's ROC curve to the plot
    fprs, tprs, thresholds = roc_curve(y_pred, y[test])
    plot!(p, fprs, tprs, label=model_name)
    gui()
    # Record test-set metrics; mode.() converts probabilistic to point predictions
    push!(model_names, model_name)
    push!(accuracies, accuracy(mode.(y_pred), y[test]))
    push!(log_losses, log_loss(y_pred, y[test]))
    push!(f1_scores, f1score(mode.(y_pred), y[test]))
end
# Add axis labels to the ROC curve
xlabel!("False Positive Rate (positive=malignant)")
ylabel!("True Positive Rate")
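If a single number summarizing each ROC curve is preferred, MLJ also provides the auc measure (area under the ROC curve). A minimal, self-contained sketch, using LogisticClassifier from MLJLinearModels purely as an example:
# Fit one probabilistic model and report its AUC on the test rows
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0
mach = machine(LogisticClassifier(), X, y)
fit!(mach, rows=train, verbosity=0)
# auc works directly on the probabilistic predictions
auc(MLJ.predict(mach, rows=test), y[test])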
Let's collect the data in the form of a DataFrame for a more precise analysis:
model_comparison=DataFrame(
    ModelName=model_names,
    Accuracy=accuracies,
    LogLoss=log_losses,
    F1Score=f1_scores
);
Finally, let's sort the data on the basis of the log loss:
sort!(model_comparison, [:LogLoss])
21×4 DataFrame
 Row │ ModelName                          Accuracy  LogLoss    F1Score
     │ String                             Any       Any        Any
─────┼──────────────────────────────────────────────────────────────────
   1 │ NeuralNetworkClassifier (BetaML)   0.982456  0.0815565  0.97619
   2 │ NeuralNetworkClassifier (MLJFlux)  0.964912  0.0841788  0.953488
   3 │ RandomForestClassifier (Decision…  0.95614   0.108848   0.942529
   4 │ RandomForestClassifier (BetaML)    0.964912  0.111642   0.953488
   5 │ EvoTreeClassifier (EvoTrees)       0.95614   0.127662   0.941176
   6 │ BayesianLDA (MultivariateStats)    0.929825  0.166699   0.9
   7 │ BayesianSubspaceLDA (Multivariat…  0.929825  0.166709   0.9
   8 │ SubspaceLDA (MultivariateStats)    0.938596  0.209371   0.91358
   9 │ AdaBoostStumpClassifier (Decisio…  0.947368  0.275107   0.926829
  10 │ KernelPerceptronClassifier (Beta…  0.894737  0.418525   0.863636
  11 │ KNNClassifier (NearestNeighborMo…  0.95614   0.430947   0.942529
  12 │ PegasosClassifier (BetaML)         0.912281  0.498056   0.891304
  13 │ ConstantClassifier (MLJModels)     0.622807  0.662744   0.0
  14 │ LDA (MultivariateStats)            0.938596  0.677149   0.91358
  15 │ GaussianNBClassifier (NaiveBayes)  0.929825  0.898701   0.906977
  16 │ PerceptronClassifier (BetaML)      0.947368  1.36307    0.926829
  17 │ MultinomialClassifier (MLJLinear…  0.938596  1.57496    0.915663
  18 │ DecisionTreeClassifier (BetaML)    0.95614   1.58694    0.941176
  19 │ LogisticClassifier (MLJLinearMod…  0.938596  2.20713    0.915663
  20 │ LinearBinaryClassifier (GLM)       0.912281  2.88829    0.878049
  21 │ DecisionTreeClassifier (Decision…  0.903509  3.4779     0.873563
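As a follow-up sketch (not part of the original tutorial, and the choice of NeuralNetworkClassifier from BetaML is an assumption based on this run's table), the top-ranked model could be assessed more robustly with cross-validation via evaluate, using measures that accept probabilistic predictions:
# Cross-validated assessment of the winning model from the comparison above
NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=BetaML verbosity=0
evaluate(NeuralNetworkClassifier(), X, y;
         resampling=CV(nfolds=5, rng=StableRNG(123)),
         measures=[log_loss, auc])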