Registry management tools
Overview
MLJModelRegistryTools (Module)
MLJModelRegistryTools
Module providing tools for managing the MLJ Model Registry. To modify the registry:
1. Create a local clone of MLJModels.jl, which hosts the registry. After making changes, you will be making an MLJModels.jl pull request.
2. If needed, use Julia's package manager to add or remove items from the list of registered packages in the environment "/src/registry/", inside your MLJModels.jl clone. Follow the protocol below.
3. In a new Julia session with MLJModelRegistryTools.jl installed, run using MLJModelRegistryTools to make the management tools available.
4. Point the MLJModelRegistryTools module to the location of the registry itself within your MLJModels.jl clone, using setpath(path_to_registry), as in setpath("MyPkgs/MLJModels.jl/src/registry").
5. To add or update the metadata associated with a package, run update(pkg).
6. Assuming this is successful, update the metadata for all packages in the registry by running update().
7. When satisfied, commit your changes to the clone and make a pull request to the MLJModels.jl repository that you cloned.
In any MLJModels.jl pull request to update the Model Registry you should note the final output of Pkg.status(outdated=true) when you have /src/registry activated.
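The session-level steps above (pointing the tools at the registry, updating one package, updating all packages, then reporting outdated dependencies) can be sketched as a single helper. This is only an illustration: the function name refresh_registry and its arguments are hypothetical, the code assumes MLJModelRegistryTools.jl is installed in the environment the session starts in, and it assumes a Julia version whose Pkg.status accepts the outdated keyword. Calls are module-qualified so the sketch does not depend on which names the package exports.

```julia
import Pkg
using MLJModelRegistryTools  # assumes the package is installed in the starting environment

# Hypothetical helper, not part of MLJModelRegistryTools.
# `registry` is the path to src/registry inside a local MLJModels.jl clone;
# `pkg` is the name of a package already present in that environment.
function refresh_registry(registry::AbstractString, pkg::AbstractString)
    # Point the tools at the registry to be modified.
    MLJModelRegistryTools.setpath(registry)

    # Add or update the metadata for the single package first.
    MLJModelRegistryTools.update(pkg)

    # If that succeeded, refresh the metadata for every registered package.
    MLJModelRegistryTools.update()

    # Activate the registry environment and print the status to include
    # in the pull request. Note this changes the session's active environment.
    Pkg.activate(registry)
    Pkg.status(outdated=true)
    return nothing
end
```

A call might look like refresh_registry("MyPkgs/MLJModels.jl/src/registry", "MLJDecisionTreeInterface"), after which the changes in the clone can be committed.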
Protocol for adding new packages to the registry environment
In your local clone of MLJModels.jl,
1. activate the environment at "/src/registry/".
2. update the environment
3. Note the output of Pkg.status(outdated=true)
4. add the new package
5. Repeat steps 2 and 3 above, and investigate any dependency downgrades for which your addition may be the cause.
If adding the new package downgrades existing dependencies because the package's compatibility bounds are not up to date, then your pull request to register the new models may be rejected.
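The protocol above maps directly onto calls in Julia's Pkg standard library. The following is a minimal sketch under stated assumptions: the function name add_to_registry_env is hypothetical, and capturing the status report as a string through the io keyword of Pkg.status is a convenience chosen here so the two reports can be compared, not something the protocol requires. The outdated keyword needs a reasonably recent Julia release.

```julia
import Pkg

# Hypothetical helper, not part of any package.
# `registry_env` is the path to src/registry in a local MLJModels.jl clone;
# `pkg` is the name of the package to be added.
function add_to_registry_env(registry_env::AbstractString, pkg::AbstractString)
    # Step 1: activate the registry environment.
    Pkg.activate(registry_env)

    # Step 2: update the environment.
    Pkg.update()

    # Step 3: record the outdated-status report before the addition.
    before = sprint(io -> Pkg.status(; outdated=true, io=io))

    # Step 4: add the new package.
    Pkg.add(pkg)

    # Step 5: repeat steps 2 and 3.
    Pkg.update()
    after = sprint(io -> Pkg.status(; outdated=true, io=io))

    return (before=before, after=after)
end
```

Printing the before and after fields side by side makes it easier to spot dependencies that are held back only after the new package was added.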
Removing a package from the registry environment does not remove its metadata. However, if you call update() to update all package metadata (or call MLJModelRegistryTools.gc()), the metadata for all orphaned packages is removed.
Methods
MLJModelRegistryTools.setpath (Function)
setpath(path)
Point MLJModelRegistryTools to the location of the registry to be modified. Ordinarily, this is the absolute path to the subdirectory /src/registry of a local clone of MLJModels.jl.
julia> pwd()
"/Users/anthony/GoogleDrive/Julia/MLJ/MLJModels.jl"
julia> setpath("~/GoogleDrive/Julia/MLJ/MLJModels.jl/src/registry")
MLJModelRegistryTools.update (Function)
update(pkg; check_traits=true, advanced_options...)
Extract the values of model traits for models in the package pkg, including document strings, and record this in the MLJ model registry (write it to /registry/Metadata.toml).
Assumes pkg is already a dependency in the Julia environment defined at /registry/ and uses the version of pkg consistent with the current environment manifest, after MLJModelRegistryTools.jl has been developed into that environment (it is removed again after the update). See documentation for details on the registration process.
julia> update("MLJDecisionTreeInterface")
Return value
The metadata dictionary, keyed on models (more precisely, on their constructors).
Advanced options
- manifest=true: Set to false to ignore the registry environment manifest and instead add only the specified packages to a new temporary environment. Useful to temporarily force latest versions if these are being blocked by other packages.
- debug=false: Calling update opens a temporary Julia process to extract the trait metadata (see MLJModelRegistryTools.GenericRegistry.run). By default, this process is shut down before rethrowing any exceptions that occur there. Setting debug=true will leave the process open and also disable the default suppression of the remote worker's standard output.
update(; check_traits=true, skip=String[], advanced_options...)
Update all packages in the Registry environment that are not specified in skip.
julia> update(skip=["MLJBase", "MLJScikitlearnInterface"])
Return value
The set of names of all packages for which metadata was recorded.
Advanced options
- nworkers=otherBase.Sys.CPU_THREADS-1-nworkers()): number of workers running package updates in parallel. Metadata is extracted in parallel, but written to file sequentially.
- debug=false: Set to true to leave temporary processes open; see the update(pkg; ...) document string above.
- manifest=true: See the update(pkg; ...) document string above.
MLJModelRegistryTools.gc (Function)
MLJModelRegistryTools.gc()
Remove the metadata associated with any packages that are no longer in the model registry.
This is performed automatically after update(), but not after update(pkg).
MLJModelRegistryTools.get (Function)
MLJModelRegistryTools.get(pkg)
Inspect the model trait metadata recorded in the Model Registry for those models in pkg. Returns a dictionary keyed on model constructor name. Data is in serialized form; see MLJModelRegistryTools.encode_dic.