Internals

This page documents the non-public API and is intended for maintainers of MLJModelRegistryTools.jl.

MLJModelRegistryTools.GenericRegistry - Module
GenericRegistry

Module providing basic tools to manage a package registry. Here a package registry means a package environment together with "package metadata": a dictionary of TOML-parsable values, keyed on the environment's package dependencies and stored in a TOML file. (This file is called Metadata.toml and is located in the same folder as the environment's Project.toml file.) Not to be confused with a package registry in the sense of the standard library Pkg.

Methods

  • GenericRegistry.dependencies(environment): Get a list of the environment's dependencies (vector of package name strings).

  • GenericRegistry.put: Insert an item in the metadata dictionary.

  • GenericRegistry.get: Inspect the metadata.

  • GenericRegistry.gc: Remove key-value pairs from the metadata for package keys that are no longer dependencies of the environment. (In any case, get will return nothing for any pkg not currently a dependency.)

  • GenericRegistry.run: In a new Julia process, load a package or packages and execute a Julia expression there; results are returned as Future objects, to allow asynchronous run calls. Useful for generating metadata about a package.

  • GenericRegistry.close(future): Shut down the process initiated by the run call that returned future (after calling fetch(future) to get the result of evaluation).

Example

using Pkg
env = "/Users/anthony/MyEnv"
Pkg.activate(env)
Pkg.status()
# Status `~/MyEnv/Project.toml`
#  [7876af07] Example v0.5.5
#  [bd369af6] Tables v1.12.1

Pkg.activate(temp=true)
Pkg.add("MLJModelRegistryTools")
using MLJModelRegistryTools.GenericRegistry
packages = GenericRegistry.dependencies(env)
# 2-element Vector{String}:
#  "Tables"
#  "Example"

future = GenericRegistry.run(["Tables",], :(names(Tables)))
value = fetch(future)
# 3-element Vector{Symbol}:
#  :Tables
#  :columntable
#  :rowtable

GenericRegistry.close(future)
GenericRegistry.put("Tables", string.(value), env)
read("/Users/anthony/MyEnv/Metadata.toml", String)
# "Tables = [\"Tables\", \"columntable\", \"rowtable\"]\n"

GenericRegistry.get("Tables", env)
# 3-element Vector{String}:
#  "Tables"
#  "columntable"
#  "rowtable"
MLJModelRegistryTools.GenericRegistry.gc - Method
GenericRegistry.gc(environment)

Remove every key-value pair from the metadata dictionary associated with the specified environment whose key is not a package dependency of that environment. This is an optional cleanup operation, to be run after removing a package from the environment's dependencies.

Does not change the behaviour of the other metadata methods: get already returns nothing for any package that is not currently a dependency.
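The relationship between gc and get can be illustrated with a minimal sketch that uses only the TOML standard library. The helpers deps_sketch, read_metadata, get_sketch and gc_sketch are hypothetical stand-ins, not the package's implementation, and the sketch assumes that the dependencies are exactly the keys of the [deps] table in Project.toml.

```julia
using TOML

# Hypothetical helpers for illustration only (not part of GenericRegistry).
metadata_path(env) = joinpath(env, "Metadata.toml")

# Assumption: the dependencies are the keys of the [deps] table.
function deps_sketch(env)
    project = TOML.parsefile(joinpath(env, "Project.toml"))
    return collect(keys(get(project, "deps", Dict{String,Any}())))
end

# Read the metadata dictionary, or an empty one if no file exists yet.
function read_metadata(env)
    path = metadata_path(env)
    return isfile(path) ? TOML.parsefile(path) : Dict{String,Any}()
end

# Look up metadata, but only for packages that are current dependencies.
function get_sketch(pkg, env)
    pkg in deps_sketch(env) || return nothing
    return get(read_metadata(env), pkg, nothing)
end

# Drop metadata entries whose key is no longer a dependency.
function gc_sketch(env)
    deps = deps_sketch(env)
    kept = filter(pair -> first(pair) in deps, read_metadata(env))
    open(io -> TOML.print(io, kept), metadata_path(env), "w")
    return nothing
end
```

Because get_sketch filters on the current dependencies, calling gc_sketch changes only the contents of Metadata.toml, never the value a lookup returns.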

MLJModelRegistryTools.GenericRegistry.get - Method
GenericRegistry.get(pkg, environment)

Return the metadata associated with the package pkg, if it is a dependency of environment and if pkg is a key in the associated metadata dictionary. Otherwise, return nothing.

MLJModelRegistryTools.GenericRegistry.run - Method
GenericRegistry.run([setup,] packages, program; environment=nothing)

Assuming a package environment path is specified, do the following in a new Julia process:

  1. Activate environment.

  2. Evaluate the setup expression, if specified.

  3. Instantiate the environment.

  4. import all packages specified in packages.

  5. Evaluate the program expression.

The returned value is a Future object which must be fetched to get the final evaluated expression. Shut the temporary process down by calling GenericRegistry.close on the Future.

The program expression in Step 5 might typically close by reversing any actions mutating the environment, but remember that only the value of the last evaluated expression is passed to the Future.

If environment is unspecified, then a fresh temporary environment is activated, and the packages listed in packages are manually added between Steps 2 and 3 above.
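The ordering of the steps above, including the extra Pkg.add when no environment is given, can be sketched as a single top-level expression that is shipped to a worker process. This is only one plausible arrangement using the Distributed standard library; steps_expr, run_sketch and close_sketch are hypothetical names, and nothing here is claimed to match the package's actual implementation.

```julia
using Distributed

# Hypothetical: assemble Steps 1-5 into one top-level expression.
function steps_expr(setup, packages, program; environment=nothing)
    temporary = environment === nothing
    activate = temporary ? :(Pkg.activate(temp=true)) : :(Pkg.activate($environment))
    # With no environment given, add the packages between Steps 2 and 3.
    add = temporary ? :(Pkg.add($packages)) : nothing
    # One `import Name` statement per package (Step 4).
    imports = [Expr(:import, Expr(:., Symbol(p))) for p in packages]
    return Expr(:toplevel,
        :(import Pkg),
        activate,              # Step 1
        setup,                 # Step 2 (may be `nothing`)
        add,
        :(Pkg.instantiate()),  # Step 3
        imports...,            # Step 4
        program)               # Step 5: its value is what the Future carries
end

# Hypothetical: evaluate the steps in a new Julia process, returning a Future.
function run_sketch(setup, packages, program; environment=nothing)
    pid = only(addprocs(1))
    ex = steps_expr(setup, packages, program; environment=environment)
    return remotecall(Core.eval, pid, Main, ex)
end

# Hypothetical: shut down the worker that evaluated the Future.
# Relies on the `where` field of `Future`, which holds the worker id.
close_sketch(future::Future) = rmprocs(future.where)
```

Using a :toplevel expression means each statement is evaluated in turn, so the imports of Step 4 are in effect before program runs.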

MLJModelRegistryTools.metadata - Function
metadata(pkg; registry="", check_traits=true)

Private method.

Extract the metadata for a package. Returns a Future object that must be fetched to get the metadata. See MLJModelRegistryTools.update, which calls this method, for more details.

If registry is non-empty, assumes that MLJModelRegistryTools has been added to that environment in development mode (Pkg.develop).

MLJModelRegistryTools.encode_dic - Function
encode_dic(d)

Convert an arbitrary nested dictionary d into a nested dictionary whose leaf values are all strings, suitable for writing to a TOML file (a poor man's serialization). The rules for converting leaves are:

  1. If it's a Symbol, preserve the colon, as in :x -> ":x".

  2. If it's an AbstractString, apply the string function (e.g., to convert SubStrings to Strings).

  3. In all other cases, except AbstractArrays, first wrap in backticks (single backquotes), as in sum -> "`sum`".

  4. Replace any # character in the result of applying Rule 3 with _ (to handle gensym names).

  5. For an AbstractVector, broadcast the preceding Rules over its elements.
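The rules above map naturally onto multiple dispatch, as the following minimal sketch suggests. The name encode_sketch is hypothetical and this is not the package's code. Two assumptions are made: the quote character of Rule 3 is a backtick (held in the constant WRAP so it is easy to change), and dictionary keys are encoded by the same rules as values.

```julia
# Assumption: Rule 3 wraps values in backticks.
const WRAP = "`"

# Rule 1: symbols keep their leading colon.
encode_sketch(s::Symbol) = string(":", s)

# Rule 2: any string type (e.g. SubString) becomes a plain String.
encode_sketch(s::AbstractString) = string(s)

# Rule 5: broadcast the rules over the elements of a vector.
encode_sketch(v::AbstractVector) = encode_sketch.(v)

# Nested dictionaries: recurse into the values.
# Assumption: keys are encoded with the same rules.
encode_sketch(d::AbstractDict) =
    Dict{String,Any}(encode_sketch(k) => encode_sketch(v) for (k, v) in d)

# Rules 3 and 4: wrap everything else, then replace '#' with '_'.
encode_sketch(x) = replace(string(WRAP, x, WRAP), '#' => '_')

nested = Dict("a" => :x, "b" => Dict("c" => [SubString("hello", 1, 2), sum]))
encoded = encode_sketch(nested)
```

Under these assumptions every leaf of encoded is a String or a vector of Strings, which the TOML standard library can write directly.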

MLJModelRegistryTools.modeltype_given_constructor - Method
modeltype_given_constructor(modeltypes)

Private method.

Return a dictionary of modeltypes, keyed on constructor. Where multiple types share a single constructor, there can only be one value (and which value appears is not predictable).

Typically a model type and its constructor have the same name, but for wrappers, such as TunedModel, several types share the same constructor (e.g., DeterministicTunedModel and ProbabilisticTunedModel are model types sharing the constructor TunedModel).
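The key collision described above can be seen in a small self-contained sketch. All names here (PlainModel, DeterministicWrapper, ProbabilisticWrapper, Wrapper, constructor_sketch) are hypothetical and stand in for real model types and for whatever mechanism reports a type's constructor.

```julia
# Hypothetical model types: two wrapper types share one constructor.
struct PlainModel end
struct DeterministicWrapper end
struct ProbabilisticWrapper end

# Hypothetical shared constructor for the two wrapper types.
Wrapper(; probabilistic=false) =
    probabilistic ? ProbabilisticWrapper() : DeterministicWrapper()

# Hypothetical constructor lookup: by default a type is its own constructor.
constructor_sketch(::Type{T}) where {T} = T
constructor_sketch(::Type{DeterministicWrapper}) = Wrapper
constructor_sketch(::Type{ProbabilisticWrapper}) = Wrapper

modeltypes = [PlainModel, DeterministicWrapper, ProbabilisticWrapper]

# Keyed on constructor: the two wrapper types collapse to a single entry.
lookup = Dict(constructor_sketch(T) => T for T in modeltypes)
```

Three model types yield only two entries, and which wrapper type ends up stored under Wrapper should not be relied upon.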

MLJModelRegistryTools.traits_given_constructor_name - Function
MLJModelRegistryTools.traits_given_constructor_name(pkg; check_traits=true)

Build and return a dictionary of model metadata as follows: The keys are the names of the constructors of every model type subtyping MLJModelInterface.Model for which the package providing the model implementation (assumed to be imported) is pkg. This is the package appearing as the root of MLJModelInterface.load_path(model). The values are the corresponding dictionaries of traits, keyed on trait name.

Poor man's serialization, as provided by MLJModelRegistryTools.encode_dic, is applied to the dictionary, to make it suitable for writing to TOML files.

If check_traits=true, also apply smoke tests to the associated trait definitions.
