OpenML

MLJBase.OpenML.load - Method
OpenML.load(id)

Load the OpenML dataset with the specified id, from those listed on the OpenML site.

Returns a "row table", i.e., a Vector of identically typed NamedTuples. A row table is compatible with the Tables.jl interface and can therefore be readily converted to other compatible formats. For example:

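using MLJBase                          # assumed to provide OpenML, coerce and Multiclass
using DataFrames
rowtable = OpenML.load(61);            # row table for the dataset with id 61
df = DataFrame(rowtable);              # convert via the Tables.jl interface
df2 = coerce(df, :class=>Multiclass)   # interpret the :class column as Multiclass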
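
To make the "row table" shape concrete without a network call, the following minimal sketch builds one by hand using only Base Julia. The field names and values are made up for illustration and are not claimed to match the contents of any OpenML dataset.

```julia
# A hand-built row table: a Vector of NamedTuples that all share one type.
# Field names and values are illustrative assumptions only.
rowtable = [
    (sepal_length = 5.1, class = "setosa"),
    (sepal_length = 7.0, class = "versicolor"),
    (sepal_length = 6.3, class = "virginica"),
]

# Every row has the same NamedTuple type, so the element type is concrete.
@assert isconcretetype(eltype(rowtable))

# Rows are accessed by index, fields by name.
first_class = rowtable[1].class

# A column is recovered by collecting one field across all rows.
sepal_lengths = [row.sepal_length for row in rowtable]
```

Because every row carries its own field names, a structure like this can be handed to any consumer of the Tables.jl interface, which is what makes the `DataFrame(rowtable)` conversion above possible.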
source
MLJBase.OpenML.load_Data_Features - Method

Returns the features of a dataset.

  • 271 - Unknown dataset. Data set with the given data ID was not found (or is not shared with you).
  • 272 - No features found. The dataset did not contain any features, or we could not extract them.
  • 273 - Dataset not processed yet. The dataset was not processed yet, features are not yet available. Please wait for a few minutes.
  • 274 - Dataset processed with error. The feature extractor has run into an error while processing the dataset. Please check whether it is a valid supported file. If so, please contact the API admins.
source
MLJBase.OpenML.load_Data_Qualities - Method

Returns the qualities of a dataset.

  • 360 - Please provide data set ID
  • 361 - Unknown dataset. The data set with the given ID was not found in the database, or is not shared with you.
  • 362 - No qualities found. The registered dataset did not contain any calculated qualities.
  • 363 - Dataset not processed yet. The dataset was not processed yet, no qualities are available. Please wait for a few minutes.
  • 364 - Dataset processed with error. The quality calculator has run into an error while processing the dataset. Please check whether it is a valid supported file. If so, contact the support team.
  • 365 - Interval start or end illegal. There was a problem with the interval start or end.
source
MLJBase.OpenML.load_Data_Qualities_List - Method

Returns a list of all data qualities in the system.

  • 412 - Precondition failed. An error code and message are returned.
  • 370 - No data qualities available. There are no data qualities in the system.
source
MLJBase.OpenML.load_Dataset_Description - Method

Returns information about a dataset, including its name, information about the creator, the URL to download it, and more.

  • 110 - Please provide data_id.
  • 111 - Unknown dataset. Data set description with data_id was not found in the database.
  • 112 - No access granted. This dataset is not shared with you.
source
MLJBase.OpenML.load_List_And_Filter - Method

List datasets, possibly filtered by a range of properties. Any number of properties can be combined by listing them one after the other in the form '/data/list/{filter}/{value}/{filter}/{value}/...'. Returns an array with all datasets that match the constraints.

Any combination of these filters can be used:

  • /limit/{limit}/offset/{offset} - returns only {limit} results starting from result number {offset}. Useful for paginating results. With /limit/5/offset/10, results 11..15 will be returned. Both limit and offset need to be specified.
  • /status/{status} - returns only datasets with a given status, either 'active', 'deactivated', or 'in_preparation'.
  • /tag/{tag} - returns only datasets tagged with the given tag.
  • /{data_quality}/{range} - returns only datasets that have certain qualities. {data_quality} can be data_id, data_name, data_version, number_instances, number_features, number_classes, number_missing_values. {range} can be a specific value or a range in the form 'low..high'. Multiple qualities can be combined, as in 'number_instances/0..50/number_features/0..10'.
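
The filter path format and the pagination arithmetic described above can be sketched in a few lines of Julia. The helpers `filter_path` and `page_range` are hypothetical names introduced here for illustration; they are not part of MLJBase or the OpenML API, and the sketch only builds the path string without sending any request.

```julia
# Hypothetical helper (not part of MLJBase): joins filter => value pairs
# into a path of the form /data/list/{filter}/{value}/{filter}/{value}/...
function filter_path(filters::Pair...)
    given = [string(first(f)) for f in filters]
    # Mirror the documented constraint that an offset requires a limit.
    if "offset" in given && !("limit" in given)
        throw(ArgumentError("cannot specify an offset without a limit"))
    end
    segments = ["$(first(f))/$(last(f))" for f in filters]
    return join(["/data/list"; segments], "/")
end

# Result numbers (1-based) covered by a given limit and offset.
page_range(limit::Integer, offset::Integer) = (offset + 1):(offset + limit)

path = filter_path("status" => "active",
                   "number_features" => "0..10",
                   "limit" => 5,
                   "offset" => 10)

results = page_range(5, 10)   # the 11..15 case from the text
```

Taking `Pair`s as varargs keeps the filters in the order they were written, which matches the "one after the other" form of the path.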

  • 370 - Illegal filter specified.
  • 371 - Filter values/ranges not properly specified.
  • 372 - No results. There were no matches for the given constraints.
  • 373 - Cannot specify an offset without a limit.
source