`fit`

```
fit(algorithm, data...; verbosity=1) -> model
fit(model, data...; verbosity=1) -> updated_model
```

## Typical workflow

```
# Train some supervised `algorithm`:
model = fit(algorithm, X, y)
# Predict probability distributions:
ŷ = predict(model, Distribution(), Xnew)
# Inspect some byproducts of training:
LearnAPI.feature_importances(model)
```

## Implementation guide

The `fit` method is not implemented directly. Instead, implement `obsfit`.

| method | fallback | compulsory? | requires |
|---|---|---|---|
| `obsfit(alg, ...)` | none | yes | `obs` in some cases |

## Reference

`LearnAPI.fit` (function)

`LearnAPI.fit(algorithm, data...; verbosity=1)`

Execute the algorithm with configuration `algorithm` using the provided training `data`, returning an object, `model`, on which other methods, such as `predict` or `transform`, can be dispatched. `LearnAPI.functions(algorithm)` returns a list of methods that can be applied to either `algorithm` or `model`.

**Arguments**

- `algorithm`: property-accessible object whose properties are the hyperparameters of some ML/statistical algorithm

- `data`: tuple of data objects with a common number of observations, for example, `data = (X, y, w)` where `X` is a table of features, `y` is a target vector with the same number of rows, and `w` a vector of per-observation weights.

- `verbosity=1`: logging level; set to `0` for warnings only, and `-1` for silent training

See also `obsfit`, `predict`, `transform`, `inverse_transform`, `LearnAPI.functions`, `obs`.

**Extended help**

**New implementations**

LearnAPI.jl provides the following definition of `fit`, which is never directly overloaded:

```
fit(algorithm, data...; verbosity=1) =
    obsfit(algorithm, Obs(), obs(fit, algorithm, data...); verbosity)
```

Rather, new algorithms should overload `obsfit`. See also `obs`.
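The delegation pattern above can be sketched without LearnAPI.jl installed. In the following, `Obs`, `obs`, `obsfit` and `fit` are local stand-ins written for illustration (they are not the package's own definitions), and `MeanRegressor` and `MeanModel` are hypothetical names. The `obsfit` method follows the call shape used in the definition of `fit` shown above.

```julia
# Stand-ins so the sketch runs on its own; NOT the real LearnAPI.jl functions.
struct Obs end          # marker type, mirroring `Obs()` in the definition above
function fit end
function obsfit end

# Fallback `obs`: with no algorithm-specific overload, return the data tuple as is.
obs(::typeof(fit), algorithm, data...) = data

# The generic `fit`, which algorithms never overload directly.
fit(algorithm, data...; verbosity=1) =
    obsfit(algorithm, Obs(), obs(fit, algorithm, data...); verbosity)

# Hypothetical algorithm: always predicts the training mean of the target.
struct MeanRegressor end

struct MeanModel
    mean::Float64
end

# The only method a new algorithm supplies in this sketch.
function obsfit(::MeanRegressor, ::Obs, obsdata; verbosity=1)
    X, y = obsdata      # `obsdata` is the tuple `(X, y)` under the fallback `obs`
    verbosity > 0 && @info "Training on $(length(y)) observations"
    return MeanModel(sum(y) / length(y))
end

X = [1.0 2.0; 3.0 4.0]
y = [10.0, 20.0]
model = fit(MeanRegressor(), X, y; verbosity=0)
```

Because `fit` is defined once and forwards everything to `obsfit`, an algorithm author writes a single method and users still get the convenient `fit(algorithm, X, y)` entry point.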

`LearnAPI.obsfit` (function)

`obsfit(algorithm, obsdata; verbosity=1)`

A lower-level alternative to `fit`, this method consumes a pre-processed form of user data. Specifically, the following two code snippets are equivalent:

`model = fit(algorithm, data...)`

and

```
obsdata = obs(fit, algorithm, data...)
model = obsfit(algorithm, obsdata)
```

Here `obsdata` is algorithm-specific, "observation-accessible" data, meaning it implements the MLUtils.jl `getobs`/`numobs` interface for observation resampling (even if `data` does not). Moreover, resampled versions of `obsdata` may be passed to `obsfit` in its place.

The use of `obsfit` may offer performance advantages. See more at `obs`.
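As a rough illustration of the equivalence and of resampling, the sketch below pre-processes the data once and then trains on a subset of observations. All functions are local stand-ins so that the code runs without LearnAPI.jl or MLUtils.jl; in particular `getobs` and `numobs` here only imitate the MLUtils.jl functions of the same name, and `MeanRegressor` and `TargetOnly` are hypothetical.

```julia
# Stand-ins so the sketch runs on its own; NOT the real LearnAPI.jl or MLUtils.jl functions.
function fit end

struct MeanRegressor end        # hypothetical algorithm

struct TargetOnly               # hypothetical algorithm-specific `obsdata`
    y::Vector{Float64}
end

# Pre-processing: this algorithm ignores features, so only the target is kept.
obs(::typeof(fit), ::MeanRegressor, X, y) = TargetOnly(float.(y))

# Observation access, imitating the `getobs`/`numobs` interface.
numobs(d::TargetOnly) = length(d.y)
getobs(d::TargetOnly, idx) = TargetOnly(d.y[idx])

# Training consumes pre-processed data; the "model" is just a number here.
obsfit(::MeanRegressor, d::TargetOnly; verbosity=1) = sum(d.y) / numobs(d)

# `fit` pre-processes and then delegates.
fit(alg, data...; verbosity=1) = obsfit(alg, obs(fit, alg, data...); verbosity)

X = [1.0 2.0; 3.0 4.0; 5.0 6.0]
y = [10, 20, 60]
alg = MeanRegressor()

model_a = fit(alg, X, y)            # high-level route
obsdata = obs(fit, alg, X, y)       # pre-process once...
model_b = obsfit(alg, obsdata)      # ...then train: same result
fold = getobs(obsdata, 1:2)         # resample without repeating the pre-processing
model_c = obsfit(alg, fold)
```

The possible performance gain comes from the last three lines: in a resampling loop such as cross-validation, `obs` runs once and each fold only pays for `getobs` and `obsfit`.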

**Extended help**

**New implementations**

Implementation of the following method signature is compulsory for all new algorithms:

`LearnAPI.obsfit(algorithm, obsdata, verbosity)`

Here `obsdata` has the form explained above. If `obs(fit, ...)` is not being overloaded, then a fallback gives `obsdata = data` (always a tuple!). Note that `verbosity` is a positional argument, not a keyword argument, in the overloaded signature.

New implementations must also implement `LearnAPI.algorithm`.

If overloaded, then the functions `LearnAPI.obsfit` and `LearnAPI.fit` must be included in the tuple returned by the `LearnAPI.functions(algorithm)` trait.
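A minimal sketch of what such an implementation might look like follows. The generic functions are declared locally as stand-ins so the code runs without LearnAPI.jl; a real implementation would instead extend `LearnAPI.obsfit`, `LearnAPI.algorithm` and `LearnAPI.functions`. The `Ridge` and `RidgeModel` types are hypothetical, and it is an assumption here that `algorithm(model)` recovers the algorithm a model was trained with.

```julia
using LinearAlgebra  # for the identity operator `I`

# Stand-ins so the sketch runs on its own; NOT the real LearnAPI.jl functions.
function fit end
function obsfit end
function algorithm end
function functions end

# Hypothetical algorithm: ridge regression without an intercept.
struct Ridge
    lambda::Float64
end

struct RidgeModel
    algorithm::Ridge
    coefficients::Vector{Float64}
end

# The compulsory method. `verbosity` is positional here, not a keyword argument.
function obsfit(alg::Ridge, obsdata, verbosity)
    X, y = obsdata   # with no `obs` overload, `obsdata` is the tuple of user data
    verbosity > 0 && @info "Fitting ridge regression with lambda = $(alg.lambda)"
    coefficients = (X' * X + alg.lambda * I) \ (X' * y)
    return RidgeModel(alg, coefficients)
end

# Assumed role of the accessor: recover the algorithm from a trained model.
algorithm(model::RidgeModel) = model.algorithm

# Trait listing the functions this algorithm supports.
functions(::Ridge) = (fit, obsfit)

X = [1.0 0.0; 0.0 1.0]
y = [2.0, 4.0]
model = obsfit(Ridge(1.0), (X, y), 0)
```

Note the tuple `(X, y)` in the final call: even a single data argument would arrive wrapped in a tuple, which is why the method destructures `obsdata` before using it.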

**Non-generalizing algorithms**

If the algorithm does not generalize to new data (e.g., DBSCAN clustering), then `data = ()` and `obsfit` carries out no computation, as this happens entirely in a `transform` and/or `predict` call. In such cases, `obsfit(algorithm, ...)` may return `algorithm`, but another possibility is allowed: to provide a mechanism for `transform`/`predict` to report byproducts of the computation (e.g., a list of boundary points in DBSCAN clustering), they are allowed to *mutate* the `model` object returned by `obsfit`, which is then arranged to be a mutable struct wrapping `algorithm` and fields to store the byproducts. In that case, `LearnAPI.predict_or_transform_mutates(algorithm)` must be overloaded to return `true`.
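The mutable-wrapper arrangement might look roughly like the sketch below. It uses a deliberately simple, hypothetical stand-in for DBSCAN (`GapClusterer`, which splits sorted one-dimensional points wherever neighbours are further apart than `gap`), and the generic functions are local stand-ins rather than the real LearnAPI.jl ones. The two-argument `predict` signature is a simplification made for this example.

```julia
# Stand-ins so the sketch runs on its own; NOT the real LearnAPI.jl functions.
function obsfit end
function predict end
function predict_or_transform_mutates end

# Hypothetical non-generalizing algorithm (a toy substitute for DBSCAN).
struct GapClusterer
    gap::Float64
end

# Mutable wrapper: the algorithm plus a field for byproducts of `predict`.
mutable struct GapClustererModel
    algorithm::GapClusterer
    boundaries::Vector{Float64}
end

# `obsfit` receives `data = ()` and does no computation.
obsfit(alg::GapClusterer, obsdata, verbosity) = GapClustererModel(alg, Float64[])

# All the work happens here, and the byproducts are stored by mutating `model`.
function predict(model::GapClustererModel, x::AbstractVector{<:Real})
    order = sortperm(x)
    labels = zeros(Int, length(x))
    boundaries = Float64[]
    label = 1
    for (k, i) in enumerate(order)
        if k > 1
            previous = x[order[k - 1]]
            if x[i] - previous > model.algorithm.gap
                label += 1                                # start a new cluster
                push!(boundaries, (x[i] + previous) / 2)  # midpoint of the gap
            end
        end
        labels[i] = label
    end
    model.boundaries = boundaries
    return labels
end

# Declare that `predict` mutates the model.
predict_or_transform_mutates(::GapClusterer) = true

model = obsfit(GapClusterer(1.0), (), 0)
labels = predict(model, [0.1, 5.0, 0.3, 5.2, 9.0])
```

After the `predict` call, `model.boundaries` holds the byproducts, which is only possible because the model is a mutable struct rather than the immutable `algorithm` itself.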