Document strings

To be registered, MLJ models must include a detailed document string for the model type, and this must conform to the standard outlined below. We recommend adapting an existing compliant document string, and consulting the requirements below if you are unsure or want a checklist. Here are examples of compliant doc-strings (go to the end of the linked files):

A utility function is available for generating a standardized header for your doc-strings (but you provide most detail by hand):

MLJModelInterface.doc_header (Function)
MLJModelInterface.doc_header(SomeModelType; augment=false)

Return a string suitable for interpolation in the document string of an MLJ model type. In the example given below, the header expands to something like this:

FooRegressor

A model type for constructing a foo regressor, based on FooRegressorPkg.jl.

From MLJ, the type can be imported using

FooRegressor = @load FooRegressor pkg=FooRegressorPkg

Construct an instance with default hyper-parameters using the syntax model = FooRegressor(). Provide keyword arguments to override hyper-parameter defaults, as in FooRegressor(a=...).

Ordinarily, doc_header is used in document strings defined after the model type definition, as doc_header assumes model traits (in particular, package_name and package_url) to be defined; see also MLJModelInterface.metadata_pkg.

Example

Suppose a model type and traits have been defined by:

mutable struct FooRegressor
    a::Int
    b::Float64
end

metadata_pkg(FooRegressor,
    name="FooRegressorPkg",
    uuid="10745b16-79ce-11e8-11f9-7d13ad32a3b2",
    url="http://existentialcomics.com/",
    )
metadata_model(FooRegressor,
    input=Table(Continuous),
    target=AbstractVector{Continuous})

Then the docstring is defined after these declarations with the following code:

"""
$(MLJModelInterface.doc_header(FooRegressor))

### Training data

In MLJ or MLJBase, bind an instance `model` ...

<rest of doc string goes here>

"""
FooRegressor
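The ordering requirement can be illustrated with a plain-Julia sketch that does not depend on MLJModelInterface. Interpolation in a docstring is evaluated when the docstring is defined, so a header built before the traits are declared only sees fallback values. Here `pkg_name` and `toy_header` are hypothetical stand-ins for a model trait and for `doc_header`; they are not part of any package.

```julia
# Hypothetical stand-ins, for illustration only:
pkg_name(::Type) = "unknown"   # fallback "trait" value
toy_header(T::Type) = "A model type provided by $(pkg_name(T)).jl."

struct Foo end

# Header computed before the trait is declared: only the fallback is seen.
early = toy_header(Foo)

# Declare the trait, as `metadata_pkg` would do for a real model type.
pkg_name(::Type{Foo}) = "FooPkg"

# Header computed after the trait is declared.
late = toy_header(Foo)

# The docstring is attached after the declarations, so the interpolated
# header reflects the declared trait.
"""
$(toy_header(Foo))

<rest of doc string goes here>
"""
Foo

println(early)
println(late)
```

Typing `?Foo` at the REPL should then show the header built from the declared trait.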

Variation to augment existing document string

For models that have a native API with separate documentation, one may want to call doc_header(FooRegressor, augment=true) instead. In that case, the output will look like this:

From MLJ, the FooRegressor type can be imported using

FooRegressor = @load FooRegressor pkg=FooRegressorPkg

Construct an instance with default hyper-parameters using the syntax model = FooRegressor(). Provide keyword arguments to override hyper-parameter defaults, as in FooRegressor(a=...).
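To compare the two forms before embedding either in a document string, one can print the header strings directly. The following is a minimal sketch that reuses the hypothetical `FooRegressor` from the example above; the `MMI` alias and the `Deterministic` supertype are assumptions added here for illustration, and the package name, UUID and URL are the placeholder values already used above.

```julia
import MLJModelInterface
const MMI = MLJModelInterface   # alias assumed here for brevity

# Hypothetical model type; the supertype is an assumption of this sketch.
mutable struct FooRegressor <: MMI.Deterministic
    a::Int
    b::Float64
end

# Traits must be declared before `doc_header` is called.
MMI.metadata_pkg(
    FooRegressor,
    name="FooRegressorPkg",
    uuid="10745b16-79ce-11e8-11f9-7d13ad32a3b2",
    url="http://existentialcomics.com/",
)

# Full header, for a stand-alone MLJ document string:
full_header = MMI.doc_header(FooRegressor)

# Shorter form, intended to follow existing native documentation:
augmenting_header = MMI.doc_header(FooRegressor, augment=true)

println(full_header)
println("-"^40)
println(augmenting_header)
```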


The document string standard

Your document string must include the following components, in order:

  • A header, closely matching the example given above.

  • A reference describing the algorithm or an actual description of the algorithm, if necessary. Detail any non-standard aspects of the implementation. Generally, defer details on the role of hyperparameters to the "Hyperparameters" section (see below).

  • Instructions on how to import the model type from MLJ (because a user can already inspect the doc-string in the Model Registry, without having loaded the code-providing package).

  • Instructions on how to instantiate with default hyperparameters or with keywords.

  • A Training data section: explains how to bind a model to data in a machine, with all possible signatures (e.g., machine(model, X, y) but also machine(model, X, y, w) if, say, weights are supported); the role and scitype requirements of each data argument should be itemized.

  • Instructions on how to fit the machine (in the same section).

  • A Hyperparameters section (unless there aren't any): an itemized list of the parameters, with defaults given.

  • An Operations section: each implemented operation (predict, predict_mode, transform, inverse_transform, etc.) is itemized and explained. This should include operations with no data arguments, such as training_losses and feature_importances.

  • A Fitted parameters section: explains what is returned by fitted_params(mach) (the same as MLJModelInterface.fitted_params(model, fitresult) - see later), with the fields of that named tuple itemized.

  • A Report section (if report is non-empty): explains what, if anything, is included in report(mach) (the same as the report return value of MLJModelInterface.fit), with the fields itemized.

  • An optional but highly recommended Examples section, which includes MLJ examples but may also include others if the model type implements a second "local" interface, i.e., one defined in the same module. (Note that each module referring to a type can declare separate doc-strings, which appear concatenated in doc-string queries.)

  • A closing "See also" sentence which includes a @ref link to the raw model type (if you are wrapping one).
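The checklist above can be sketched as a document string skeleton. This is only an illustration under stated assumptions, not a registered model's documentation: `FooRegressor` is the hypothetical type from the earlier example, the `Deterministic` supertype and the keyword constructor defaults are assumptions, and everything in angle brackets is a placeholder to be written by hand. The header generated by `doc_header` covers the header, import and instantiation items.

```julia
import MLJModelInterface
const MMI = MLJModelInterface   # alias assumed here for brevity

# Hypothetical model type; supertype and defaults are assumptions.
mutable struct FooRegressor <: MMI.Deterministic
    a::Int
    b::Float64
end
FooRegressor(; a=1, b=0.5) = FooRegressor(a, b)

MMI.metadata_pkg(
    FooRegressor,
    name="FooRegressorPkg",
    uuid="10745b16-79ce-11e8-11f9-7d13ad32a3b2",
    url="http://existentialcomics.com/",
)
MMI.metadata_model(
    FooRegressor,
    input=MMI.Table(MMI.Continuous),
    target=AbstractVector{MMI.Continuous},
)

"""
$(MMI.doc_header(FooRegressor))

<reference for, or description of, the algorithm goes here>

### Training data

In MLJ or MLJBase, bind an instance `model` to data with

    mach = machine(model, X, y)

where

- `X`: any table of input features whose columns have `Continuous` element scitype
- `y`: the target, an `AbstractVector` with `Continuous` element scitype

Train the machine using `fit!(mach)`.

### Hyperparameters

- `a=1`: <role of `a`>
- `b=0.5`: <role of `b`>

### Operations

- `predict(mach, Xnew)`: <what is returned, and requirements on `Xnew`>

### Fitted parameters

The fields of `fitted_params(mach)` are:

- <field name>: <description>

### Report

The fields of `report(mach)` are:

- <field name>: <description>

### Examples

<MLJ usage example goes here>

See also <`@ref` link to the raw model type, if wrapping one>.
"""
FooRegressor
```

The section heading level follows the `### Training data` heading used in the example above. Sections that do not apply, such as Hyperparameters for a model without any, can be dropped.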