# MLJ News

News for MLJ and its satellite packages: MLJBase, MLJModels, and ScientificTypes.

## Latest release notes

- MLJ (general users)
- MLJBase | MLJModels | ScientificTypes (mainly for developers)

## News

Note: New patch releases are no longer announced below. Refer to the links above for complete release notes.
#### 30 Oct 2019

- MLJModels 0.5.3 released.
- MLJBase 0.7.2 released.

#### 22 Oct 2019

- MLJ 0.5.1 released.

#### 21 Oct 2019

- MLJBase 0.7.1 released.
- ScientificTypes 0.2.2 released.
- MLJModels 0.5.2 released.

#### 17 Oct 2019

- MLJBase 0.7 released.

#### 11 Oct 2019

- MLJModels 0.5.1 released.

#### 30 Sep 2019

- MLJ 0.5 released.

#### 29 Sep 2019

- MLJModels 0.5 released.

#### 26 Sep 2019

- MLJBase 0.6 released.
## Older release notes
### MLJ 0.4.0

- (Enhancement) Update to MLJBase 0.5.0 and MLJModels 0.4.0. In particular, this considerably extends the list of wrapped scikit-learn models available to the MLJ user:
  - ScikitLearn.jl
    - SVM: `SVMClassifier`, `SVMRegressor`, `SVMNuClassifier`, `SVMNuRegressor`, `SVMLClassifier`, `SVMLRegressor`
    - Linear models (regressors): `ARDRegressor`, `BayesianRidgeRegressor`, `ElasticNetRegressor`, `ElasticNetCVRegressor`, `HuberRegressor`, `LarsRegressor`, `LarsCVRegressor`, `LassoRegressor`, `LassoCVRegressor`, `LassoLarsRegressor`, `LassoLarsCVRegressor`, `LassoLarsICRegressor`, `LinearRegressor`, `OrthogonalMatchingPursuitRegressor`, `OrthogonalMatchingPursuitCVRegressor`, `PassiveAggressiveRegressor`, `RidgeRegressor`, `RidgeCVRegressor`, `SGDRegressor`, `TheilSenRegressor`
- (Enhancement) The `@pipeline` macro allows one to construct linear (non-branching) pipeline composite models with one line of code. One may include static transformations (ordinary functions) in the pipeline, as well as target transformations for the supervised case (when one component model is supervised). (A sketch of the syntax appears after this list.)
- (Breaking) Source nodes (type `Source`) now have a `kind` field, which is either `:input`, `:target` or `:other`, with `:input` the default value in the `source` constructor. If building a learning network, and the network is to be exported as a standalone model, then it is now necessary to tag the source nodes accordingly, as in `Xs = source(X)` and `ys = source(y, kind=:target)`. (A sketch appears at the end of these notes.)
- (Breaking) By virtue of the preceding change, the syntax for exporting a learning network is simplified. Do `?@from_network` for details. Also, one now uses `fitresults(N)` instead of `fitresults(N, X, y)` and `fitresults(N, X)` when exporting a learning network `N` "by hand"; see the updated manual for details.
- (Breaking) One must explicitly state if a supervised learning network being exported with `@from_network` is probabilistic, by adding `is_probabilistic=true` to the macro expression. Before, this information was unreliably inferred from the network.
- (Enhancement) Add a macro-free method for loading model code into an arbitrary module. Do `?load` for details.
- (Enhancement) `@load` now returns a model instance with default hyperparameters (instead of `nothing`), as in `tree_model = @load DecisionTreeRegressor`.
- (Breaking) `info("PCA")` now returns a named tuple, instead of a dictionary, of the properties of the model named "PCA".
- (Breaking) The list returned by `models(conditional)` is now a list of complete metadata entries (named tuples, as returned by `info`). An entry `proxy` appears in the list exactly when `conditional(proxy) == true`. Model queries are simplified; for example, `models() do model model.is_supervised && model.is_pure_julia end` finds all pure-Julia supervised models.
- (Bug fix) Introduce new private methods to avoid relying on MLJBase type piracy (MLJBase #30).
- (Enhancement) If `composite` is a learning network exported as a model, and `m = machine(composite, args...)`, then `report(m)` returns the reports for each machine in the learning network, and similarly for `fitted_params(m)`.
- (Enhancement) `MLJ.table`, `vcat` and `hcat` are now overloaded for `AbstractNode`, so that they can immediately be used in defining learning networks. For example, if `X = source(rand(20, 3))` and `y = source(rand(20))`, then `MLJ.table(X)` and `vcat(y, y)` both make sense and define new nodes.
- (Enhancement) `pretty(X)` prints a pretty version of any table `X`, complete with types and scitype annotations. Do `?pretty` for options. A wrap of `pretty_table` from PrettyTables.jl.
- (Enhancement) `std` is re-exported from Statistics.
- (Enhancement) The manual and MLJ cheatsheet have been updated.
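As a rough illustration of the `@pipeline` syntax (a minimal sketch only: the composite name `MyPipe`, the component models and the hyperparameter values are placeholders; `?@pipeline` gives the authoritative syntax for the installed version):

```julia
using MLJ

# One line defines a new composite model type and returns an instance:
# a one-hot encoder feeding a K-nearest-neighbour regressor, with the
# target standardized for training and the transformation inverted at
# prediction time.
pipe = @pipeline MyPipe(hot = OneHotEncoder(),
                        knn = KNNRegressor(K=3),
                        target = UnivariateStandardizer())
```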
Performance measures have been migrated to MLJBase, while the model registry and model load/search facilities have migrated to MLJModels. As the relevant methods are re-exported to MLJ, this is unlikely to affect many users.
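And a minimal sketch of tagging source nodes in a learning network destined for export, per the breaking change above (the data and model here are placeholders):

```julia
using MLJ

X = MLJ.table(rand(20, 3))
y = rand(20)
model = @load DecisionTreeRegressor   # any deterministic supervised model

# Tag the sources so the network can later be exported with @from_network:
Xs = source(X)                  # kind=:input is the default
ys = source(y, kind=:target)

mach = machine(model, Xs, ys)
yhat = predict(mach, Xs)        # terminal node of the network
fit!(yhat)                      # train the network
```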
### MLJModels 0.4.0

- (Enhancement) Add a number of scikit-learn model wraps. See the MLJ 0.4.0 release notes above for a detailed list.
- The following have all been migrated to MLJModels from MLJ:
  - MLJ's built-in models (e.g., basic transformers such as `OneHotEncoder`)
  - The model registry metadata (src/registry/METADATA.toml)
  - The metadata `@update` facility for administrator registration of new models
  - The `@load` macro and `load` function for loading code for a registered model
  - The `models` and `localmodels` model-search functions
  - The `info` command for returning the metadata entry of a model
- (Breaking) MLJBase v0.5.0, which introduces some changes and additions to model traits, is a requirement, meaning the format of metadata has changed.
- (Breaking) The `model` method for retrieving model metadata has been renamed back to `info`, but continues to return a named tuple. (The `MLJBase.info` method, returning the dictionary form of the metadata, is now called `MLJBase.info_dic`.)
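A minimal sketch of the relocated search and loading facilities ("PCA" is just an example of a registered model; loading it assumes MultivariateStats.jl is installed):

```julia
using MLJModels

meta = info("PCA")    # metadata entry (a named tuple) for a registered model
meta.is_pure_julia    # individual traits can be queried directly

pca = @load PCA       # load the implementing code; returns a default instance
```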
### MLJBase 0.5.0

- Bump ScientificTypes requirement to v0.2.0.
- (Enhancement) The performance measures API (built-in measures + adaptor for external measures) has been migrated from MLJ to MLJBase.
- (Breaking) `info`, which returns a dictionary (needed for TOML serialization), is renamed to `info_dic`. In this way "info" is reserved for a method in MLJModels/MLJ that returns a more convenient named tuple.
- (Breaking) The `is_probabilistic` model trait is replaced with `prediction_type`, which can have the values `:deterministic`, `:probabilistic` or `:interval`, to allow for models predicting real intervals, and for consistency with the measures API.
- (Bug fix, mildly breaking) The `package_license` model trait is now included in `info_dic` in the case of unsupervised models.
- (Enhancement, mildly breaking) Add new model traits `hyperparameters`, `hyperparameter_types`, `docstring`, and `implemented_operations` (`fit`, `predict`, `inverse_transform`, etc.). (#36, #37, #38)
- (Enhancement) The `MLJBase.table` and `MLJBase.matrix` operations are now direct wraps of the corresponding Tables.jl operations, for improved performance. In particular, `MLJBase.matrix(MLJBase.table(A))` is essentially a non-operation, and one can pass `MLJBase.matrix` the keyword argument `transpose=...`.
- (Breaking) The built-in dataset methods `load_iris`, `load_boston`, `load_ames`, `load_reduced_ames`, `load_crabs` return a raw `DataFrame`, instead of an `MLJTask` object, and continue to require `import CSV` to become available. However, macro versions `@load_iris`, etc., are always available, automatically triggering `import CSV`; these macros return a tuple `(X, y)` of input `DataFrame` and target vector `y`, with scitypes appropriately coerced. (MLJ #224)
- (Enhancement) `selectrows` now works for matrices. Needed to allow matrices as a "node type" in MLJ learning networks; see MLJ #209.
- (Bug fix) Fix problem with `==` for `MLJType` objects. (#35)
- (Breaking) Update requirement on ScientificTypes.jl to v0.2.0, to mitigate a bug with coercion of column scitypes for tables that are also `AbstractVector`s, and to make `coerce` more convenient.
- (Enhancement) Add new method `unpack` for splitting tables, as in `y, X = unpack(df, ==(:target), !=(:dummy))`. See the doc-string for details.
- (Bug fix) Remove type piracy in `getproperty`/`setproperty!`. (#30)
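A minimal sketch of `unpack` (the `DataFrame` and its column names are placeholders):

```julia
using MLJBase, DataFrames

df = DataFrame(x1 = rand(5), dummy = rand(5), target = rand(5))

# The first selector singles out the target vector `y`; the remaining
# columns satisfying the second selector (everything except :dummy)
# form the feature table `X`:
y, X = unpack(df, ==(:target), !=(:dummy))
```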
### ScientificTypes 0.2.0

- (Breaking) The argument order is switched in `coerce` methods. So now use `coerce(v, T)` for a vector `v` and scientific type `T`, and `coerce(X, d)` for a table `X` and dictionary `d`.
- (Feature) You can now call `coerce` on tables without needing to wrap the specs in a dictionary, as in `coerce(X, :age => Continuous, :ncalls => Count)`.
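A minimal sketch of the two calling patterns (the column table `X` is a placeholder):

```julia
using ScientificTypes

v = [1, 2, 3, 4]
v_continuous = coerce(v, Continuous)   # vector form: coerce(v, T)

X = (age = [23, 45, 31], ncalls = [7, 2, 4])
Xc = coerce(X, :age => Continuous, :ncalls => Count)   # no dictionary wrapper
```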
### ScientificTypes 0.1.3
### MLJ 0.3.0

- Introduction of traits for measures (loss functions, etc.); see the top of /src/measures.jl for definitions. This:
  - allows the user to use loss functions from LossFunctions.jl;
  - enables improved measure checks and error-message reporting for measures;
  - allows `evaluate!` to report per-observation measures when available (for later use by Bayesian optimisers, for example);
  - allows support for sample-weighted measures, playing nicely with the rest of the API.
- Improvements to resampling (a sketch follows this list):
  - The `evaluate!` method now reports per-observation measures when available.
  - Sample weights can be passed to `evaluate!` for use by measures that support weights.
  - The user can pass a list of train/evaluation pairs of row indices directly to `evaluate!`, in place of a `ResamplingStrategy` object.
  - Implementing a new `ResamplingStrategy` is now straightforward (see the docs).
  - One can call `evaluate` (no exclamation mark) directly on model + data, without first constructing a machine, if desired.
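A minimal sketch of these conveniences (the model and data are placeholders, and the model code is assumed to be already loaded; see `?evaluate!` for the authoritative keyword list):

```julia
using MLJ

X = MLJ.table(rand(100, 3))
y = rand(100)
model = DecisionTreeRegressor()   # assumes this model's code is loaded

# Evaluate directly on model + data, without constructing a machine:
evaluate(model, X, y, resampling=CV(nfolds=5), measure=rms)

# Or pass explicit train/evaluation pairs of row indices in place
# of a ResamplingStrategy object:
evaluate(model, X, y, resampling=[(1:80, 81:100)], measure=rms)
```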
- Doc-strings and the manual have been revised and updated. The manual includes a new section "Tuning models", and extra material under "Learning networks" explaining how to export learning networks as stand-alone models using the `@from_network` macro.
- Improved checks and error-reporting for binding models to data in machines.
- (Breaking) CSV is now an optional dependency, which means you now need to import CSV before you can load tasks with `load_boston()`, `load_iris()`, `load_crabs()`, `load_ames()`, `load_reduced_ames()`.
- Added a `schema` method for tables (re-exported from ScientificTypes.jl). Returns a named tuple with keys `:names`, `:types`, `:scitypes` and `:nrows`.
- (Breaking) Eliminate the `scitypes` method. The scientific types of a table are now returned as part of the ScientificTypes `schema` method (see above).
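A minimal sketch of the new `schema` method (the column table is a placeholder):

```julia
using MLJ   # `schema` is re-exported from ScientificTypes

X = (age = [23, 45, 31], height = [1.8, 1.7, 1.6])
s = schema(X)
s.names      # (:age, :height)
s.scitypes   # (Count, Continuous)
s.nrows      # 3
```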
### MLJModels 0.3.0

### MLJBase v0.4.0

### ScientificTypes 0.1.2

- New package to which the scientific types API has been moved (from MLJBase).
### MLJBase v0.3.0

- Make CSV an optional dependency (breaking). To use `load_iris()`, `load_ames()`, etc., you need first to import CSV.
### MLJBase v0.2.4

- Add `ColorImage` and `GreyImage` scitypes.
- Overload the `in` method for subtypes of `Model` (apparently causing Julia crashes in an untagged commit, because of a method-signature ambiguity, now resolved).
### MLJ v0.2.5

- Add MLJ cheatsheet.
- Allow `models` to query specific traits, in addition to tasks. Query `?models` for details.
- Add `@from_network` macro for exporting learning networks as models (experimental).
### MLJModels v0.2.4

- Add compatibility requirement MLJBase="0.2.3".
### MLJBase v0.2.3

- Small changes to the definitions of `==` and `isequal` for `MLJType` objects. In particular, fields that are random number generators may change state without affecting an object's `==` equivalence class.
- Add `@set_defaults` macro for generating keyword constructors for `Model` subtypes.
- Add abstract type `UnsupervisedNetwork <: Unsupervised`.
### MLJ v0.2.3

- Fixed a bug in the `models(::MLJTask)` method which excluded some relevant models. (#153)
- Fixed some broken links to tour.ipynb.
### MLJ v0.2.2

- Resolved these issues:
  - Specifying new rows in calls to `fit!` on a `Node` not triggering retraining. (#147)
  - `fit!` of a `Node` sometimes calls `update` on the model when it should call `fit`. [(#146)](https://github.com/alan-turing-institute/MLJ.jl/issues/146)
  - Error running the tour.ipynb notebook. [(#140)](https://github.com/alan-turing-institute/MLJ.jl/issues/140)
  - For reproducibility, include a Manifest.toml file with all examples. [(#137)](https://github.com/alan-turing-institute/MLJ.jl/issues/137)
  - Activated Coveralls code coverage. (#131)
- Removed the local version of MultivariateStats (now in MLJModels; see below).
- Minor changes to OneHotEncoder, in line with the scitype philosophy.
### MLJBase v0.2.2

- Fix some minor bugs.
- Added compatibility requirement CSV v0.5 or higher, to allow removal of the `allowmissing` keyword in `CSV.read`, which is to be deprecated.
### Announcement: MLJ tutorial and development sprint

- Details here. Applications close May 29th, 5pm (GMT + 1 = London).
### MLJModels v0.2.3

- The following support vector machine models from LIBSVM.jl have been added: `EpsilonSVR`, `LinearSVC`, `NuSVR`, `NuSVC`, `SVC`, `OneClassSVM`.

### MLJModels v0.2.2

- MultivariateStats models `RidgeRegressor` and `PCA` migrated here from MLJ. Addresses: MLJ #125.

### MLJModels v0.2.1

- ScikitLearn wraps `ElasticNet` and `ElasticNetCV` now available (and registered at MLJRegistry). Resolves: MLJ #112.
### MLJ v0.2.1

- Fix a bug and a related problem in the "Getting Started" docs: [#126](https://github.com/alan-turing-institute/MLJ.jl/issues/126).
### MLJBase 0.2.0, MLJModels 0.2.0, MLJ 0.2.0

- Model API refactored to resolve #93 and #119, and hence simplify the model interface. This breaks all implementations of supervised models, and some scitype methods. However, for the regular user the effects are restricted to: (i) no more `target_type` hyperparameter for some models; (ii) `Deterministic{Node}` is now `DeterministicNetwork`, and `Probabilistic{Node}` is now `ProbabilisticNetwork`, when exporting learning networks as models.
- New feature: Task constructors now allow the user to explicitly specify scitypes of features/target. There is a `coerce` method for vectors and tables for the user who wants to do this manually. Resolves: #119
### Official registered versions of MLJBase 0.1.1, MLJModels 0.1.1, MLJ 0.1.1 released

- Minor revisions to the repos, doc updates, and a small breaking change around scitype method names and associated traits. Resolves: #119
### Unversioned commits, 12 April 2019 (around 00:10 GMT)

- Added out-of-bag estimates of performance for homogeneous ensembles. Resolves: #77

### Unversioned commits, 11 April 2019 (before noon, GMT)

- Removed dependency on the unregistered package TOML.jl (using Pkg.TOML instead). Resolves: #113

### Unversioned commits, 8 April 2019 (some time after 20:00 GMT)

- Addition of XGBoost models `XGBoostRegressor`, `XGBoostClassifier` and `XGBoostCount`. Resolves: #65
- Documentation reorganized as GitHub pages. Includes some additions, but still a work in progress.
### Unversioned commits, 1 March 2019 (some time after 03:50 GMT)

- Addition of a "scientific type" hierarchy, including `Continuous`, `Discrete`, `Multiclass`, and `Other` subtypes of `Found` (to complement `Missing`). See Getting Started for more on this. Resolves: #86
- Revamp of model traits to take advantage of scientific types, with `output_kind` replaced by `target_scitype_union` and `input_kind` replaced by `input_scitype`. Also, `output_quantity` is dropped, `input_quantity` is replaced by the `Bool`-valued `input_is_multivariate`, and `is_pure_julia` is made `Bool`-valued. Trait definitions in all model implementations and affected meta-algorithms have been updated. Related: #81
- Substantial update of the core guide Adding New Models to reflect the above changes, and in response to new model-implementer queries. Some design "decisions" regarding multivariate targets are now explicit there.
- The order of the `y` and `yhat` arguments of measures (aka loss functions) has been reversed. Progress on: #91
- Update of Standardizer and OneHotEncoder to mesh with the new scitypes.
- New improved task constructors infer task metadata from data scitypes. This brings us close to a simple implementation of basic task-model matching. Query the doc-strings for `SupervisedTask` and `UnsupervisedTask` for details. Machines can now dispatch on tasks instead of `X` and `y`. A task, `task`, is now callable: `task()` returns `(X, y)` for supervised models, and `X` for unsupervised models. Progress on: #86 (A sketch follows these notes.)
- The data in the `load_ames()` test task has been replaced by the full data set, and `load_reduced_ames()` now loads a reduced set.
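A minimal sketch of the callable-task behaviour described above (assuming the task-based API of this era, in which `load_boston()` returned a task):

```julia
using MLJ

task = load_boston()   # a SupervisedTask in this era
X, y = task()          # a supervised task is callable, returning (X, y)
```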