Registry management tools

Overview

MLJModelRegistryTools (Module)
MLJModelRegistryTools

Module providing tools for managing the MLJ Model Registry. To modify the registry:

  • Create a local clone of MLJModels.jl, which hosts the registry. After making changes, you will open an MLJModels.jl pull request.

  • If needed, use Julia's package manager to add or remove items from the list of registered packages in the environment "/src/registry/", inside your MLJModels.jl clone. Follow the protocol below.

  • In a new Julia session with MLJModelRegistryTools.jl installed, run the statement using MLJModelRegistryTools, which makes the management tools available.

  • Point the MLJModelRegistryTools module to the location of the registry itself within your MLJModels.jl clone, using setpath(path_to_registry), as in setpath("MyPkgs/MLJModels.jl/src/registry").

  • To add or update the metadata associated with a package, run update(pkg).

  • If this succeeds, update the metadata for all packages in the registry by running update().

  • When satisfied, commit your changes to the clone and make a pull request to the MLJModels.jl repository that you cloned.
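The tool-related steps above can be sketched as a small helper. This is an illustration only, not part of the package: refresh_metadata is a hypothetical name, and the tools module is passed in as an argument (an assumption made here so the helper can be defined before MLJModelRegistryTools is loaded). Only setpath, update(pkg) and update() are taken from this page.

```julia
# Hypothetical helper (not part of MLJModelRegistryTools): runs the
# documented steps in order. `tools` is expected to be the
# MLJModelRegistryTools module itself.
function refresh_metadata(tools, registry_path::AbstractString, pkg::AbstractString)
    # Point the tools at the registry inside the MLJModels.jl clone.
    tools.setpath(registry_path)
    # Add or update metadata for the single package first. If this call
    # throws, the full update below is never reached.
    tools.update(pkg)
    # Then refresh the metadata for every package in the registry environment.
    return tools.update()
end

# Typical use, in a session where MLJModelRegistryTools.jl is installed:
#
#   using MLJModelRegistryTools
#   refresh_metadata(MLJModelRegistryTools,
#                    "MyPkgs/MLJModels.jl/src/registry",
#                    "MLJDecisionTreeInterface")
```

Running the single-package update first keeps a failure cheap: the longer full update only starts once the new package's metadata has been extracted without error.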

Important

In any MLJModels.jl pull request that updates the Model Registry, include the final output of Pkg.status(outdated=true), run with /src/registry activated.

Protocol for adding new packages to the registry environment

  1. In your local clone of MLJModels.jl, activate the environment at "/src/registry/".

  2. Update the environment.

  3. Note the output of Pkg.status(outdated=true).

  4. Add the new package.

  5. Repeat steps 2 and 3 above, and investigate any dependency downgrades for which your addition may be the cause.

If adding the new package downgrades existing dependencies because its compatibility bounds are out of date, then your pull request to register the new models may be rejected.
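The protocol above maps onto Julia's package manager roughly as follows. This is a minimal sketch under assumptions: add_to_registry_env is a hypothetical helper name, and clone_path stands for wherever your MLJModels.jl clone lives. Pkg.status(outdated=true) needs a Julia version that supports the outdated keyword.

```julia
using Pkg

# Hypothetical helper: follows the protocol for adding `new_pkg` to the
# registry environment of a local MLJModels.jl clone at `clone_path`.
function add_to_registry_env(clone_path::AbstractString, new_pkg::AbstractString)
    Pkg.activate(joinpath(clone_path, "src", "registry"))  # step 1
    Pkg.update()                                           # step 2
    Pkg.status(outdated=true)                              # step 3: note this output
    Pkg.add(new_pkg)                                       # step 4
    Pkg.update()                                           # step 5: repeat step 2 ...
    Pkg.status(outdated=true)                              # ... and step 3, then compare
    return nothing
end

# Example call (the package name is a placeholder):
#
#   add_to_registry_env("MyPkgs/MLJModels.jl", "SomeNewModelInterface")
```

Comparing the two status listings by eye shows whether any package that was current before the addition is now held back.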

Note

Removing a package from the registry environment does not remove its metadata. However, if you call update() to update all package metadata (or call MLJModelRegistryTools.gc()), the metadata for all orphaned packages is removed.

Methods

MLJModelRegistryTools.setpath (Function)
setpath(path)

Point MLJModelRegistryTools to the location of the registry to be modified. Ordinarily, this is the absolute path to the subdirectory /src/registry of a local clone of MLJModels.jl.

julia> pwd()
"/Users/anthony/GoogleDrive/Julia/MLJ/MLJModels.jl"

julia> setpath("~/GoogleDrive/Julia/MLJ/MLJModels.jl/src/registry")
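Julia does not expand a leading ~ in a path string on its own, and this page does not say whether setpath does so. A minimal sketch, assuming you prefer to pass an explicit absolute path: registry_dir is a hypothetical helper built only from Base functions.

```julia
# Hypothetical helper: absolute path of the registry directory inside a
# local MLJModels.jl clone. `expanduser` resolves a leading "~" and
# `abspath` resolves relative paths against the current directory.
registry_dir(clone_path::AbstractString) =
    joinpath(abspath(expanduser(clone_path)), "src", "registry")

path = registry_dir("~/GoogleDrive/Julia/MLJ/MLJModels.jl")

# `path` can then be handed to the tools:
#
#   setpath(path)
```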

MLJModelRegistryTools.update (Function)
update(pkg; check_traits=true, advanced_options...)

Extract the values of model traits for models in the package pkg, including document strings, and record them in the MLJ model registry (write them to /registry/Metadata.toml).

Assumes pkg is already a dependency in the Julia environment defined at /registry/. The version of pkg used is the one consistent with the current environment manifest, after MLJModelRegistryTools.jl has been developed into that environment (it is removed again after the update). See the documentation for details on the registration process.

julia> update("MLJDecisionTreeInterface")

Return value

The metadata dictionary, keyed on models (more precisely, on their constructors).

Advanced options

Warning

Advanced options are intended primarily for diagnostic purposes.

  • manifest=true: Set to false to ignore the registry environment manifest and instead add only the specified packages to a new temporary environment. Useful to temporarily force latest versions if these are being blocked by other packages.

  • debug=false: Calling update opens a temporary Julia process to extract the trait metadata (see MLJModelRegistryTools.GenericRegistry.run). By default, this process is shut down before rethrowing any exception that occurs there. Setting debug=true leaves the process open and also disables the default suppression of the remote worker's standard output.

update(; check_traits=true, skip=String[], advanced_options...)

Update the metadata for all packages in the registry environment that are not listed in skip.

julia> update(skip=["MLJBase", "MLJScikitlearnInterface"])

Return value

The set of names of all packages for which metadata was recorded.

Advanced options

  • nworkers=Base.Sys.CPU_THREADS-1-nworkers(): number of workers running package updates in parallel. Metadata is extracted in parallel, but written to file sequentially.

  • debug=false: Set to true to leave temporary processes open; see the update(pkg; ...) document string above.

  • manifest=true: See the update(pkg; ...) document string above.

MLJModelRegistryTools.gc (Function)
MLJModelRegistryTools.gc()

Remove the metadata associated with any packages that are no longer in the model registry.

This is performed automatically after update(), but not after update(pkg).
