features, target, and weights
Methods for extracting certain parts of data for all supported calls of the form fit(learner, data).
LearnAPI.features(learner, data) -> <training "features"; suitable input for `predict` or `transform`>
LearnAPI.target(learner, data) -> <target variable>
LearnAPI.weights(learner, data) -> <per-observation weights>
Here data is something supported in a call of the form fit(learner, data).
Typical workflow
These methods do not typically appear in a general user's workflow, but they are useful in meta-algorithms, such as cross-validation (see the example in obs and Data Interfaces).
Supposing learner is a supervised classifier predicting a vector target:
model = fit(learner, data)
X = LearnAPI.features(learner, data)
y = LearnAPI.target(learner, data)
ŷ = predict(model, Point(), X)
training_loss = sum(ŷ .!= y)
Implementation guide
| method | fallback return value | compulsory? |
|---|---|---|
| LearnAPI.features(learner, data) | no fallback | no |
| LearnAPI.target(learner, data) | no fallback | no |
| LearnAPI.weights(learner, data) | no fallback | no |
Reference
LearnAPI.features — Function
LearnAPI.features(learner, data)
Return, for each form of data supported by the call fit(learner, data), the features part X of data.
While "features" will typically have the commonly understood meaning ("covariates" or "predictors"), the only learner-generic guaranteed properties of X are:
- X can be passed to predict or transform when these are supported by learner, as in the call predict(model, X), where model = fit(learner, data).
- X has the same number of observations as data has and is guaranteed to implement the data interface specified by LearnAPI.data_interface(learner).
Extended help
New implementations
Implementation of this method allows for certain meta-functionality, such as cross-validation. It can only be implemented for LearnAPI.Descriminative learners.
If obs is being overloaded, then typically it suffices to overload LearnAPI.features(learner, observations) where observations = obs(learner, data) and data is any documented supported data in calls of the form fit(learner, data), and to add a declaration of the form
LearnAPI.features(learner, data) = LearnAPI.features(learner, obs(learner, data))
to catch all other forms of supported input data.
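The overloading pattern just described can be sketched as follows. This is only an illustration under stated assumptions: ToyLearner and ToyObs are hypothetical types that are not part of LearnAPI.jl, user-facing data is assumed to be a tuple (X, y) with one observation per column of X, and a LearnAPI.jl version matching this page (no fallback for LearnAPI.features) is assumed to be installed. A real implementation would also need fit, the LearnAPI.functions trait, and the other compulsory methods, all omitted here.

```julia
import LearnAPI

# Hypothetical learner, for illustration only; not part of LearnAPI.jl.
struct ToyLearner end

# Hypothetical learner-specific representation of data, as returned by `obs`.
struct ToyObs{M<:AbstractMatrix,V<:AbstractVector}
    X::M  # features, one observation per column (an assumption of this sketch)
    y::V  # target, one entry per observation
end

# `obs` converts user-facing data (assumed here to be a tuple `(X, y)`)
# and leaves already-converted observations untouched.
LearnAPI.obs(::ToyLearner, data::Tuple) = ToyObs(data[1], data[2])
LearnAPI.obs(::ToyLearner, observations::ToyObs) = observations

# The one substantive overload: extract features from the output of `obs`.
LearnAPI.features(::ToyLearner, observations::ToyObs) = observations.X

# Catch-all declaration: route every other supported form of data through `obs`.
LearnAPI.features(learner::ToyLearner, data) =
    LearnAPI.features(learner, LearnAPI.obs(learner, data))

learner = ToyLearner()
X = [1.0 2.0 3.0; 4.0 5.0 6.0]
y = [10.0, 20.0, 30.0]

# Both entry points return the same features object.
X_from_data = LearnAPI.features(learner, (X, y))
X_from_obs = LearnAPI.features(learner, LearnAPI.obs(learner, (X, y)))
```

Because the catch-all method is less specific than the ToyObs method, Julia's dispatch sends pre-converted observations straight to the substantive overload and everything else through obs first. The same two-method pattern would apply to LearnAPI.target and LearnAPI.weights.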
Ensure the returned object implements the data interface specified by LearnAPI.data_interface(learner).
:(LearnAPI.features) must be included in the return value of LearnAPI.functions(learner), unless the learner is static (fit consumes no data).
If implemented, you must include :(LearnAPI.features) in the tuple returned by the LearnAPI.functions trait.
LearnAPI.target — Function
LearnAPI.target(learner, data) -> target
Return, for each form of data supported by the call fit(learner, data), the target part of data, in a form suitable for pairing with predictions. The return value is only meaningful if learner is supervised, i.e., if :(LearnAPI.target) in LearnAPI.functions(learner).
The returned object has the same number of observations as data has and is guaranteed to implement the data interface specified by LearnAPI.data_interface(learner).
Extended help
What is a target variable?
Examples of target variables are house prices in real estate pricing estimates, the "spam"/"not spam" labels in an email spam filtering task, "outlier"/"inlier" labels in outlier detection, cluster labels in clustering problems, and censored survival times in survival analysis. For more on targets and target proxies, see the "Reference" section of the LearnAPI.jl documentation.
New implementations
The method should be overloaded if fit consumes data that includes a target variable (in the sense above). This will include both LearnAPI.Descriminative and LearnAPI.Generative learners, but never LearnAPI.Static learners. Implementation allows for certain meta-functionality, such as cross-validation in supervised learning, supervised anomaly detection, and density estimation.
If obs is being overloaded, then typically it suffices to overload LearnAPI.target(learner, observations) where observations = obs(learner, data) and data is any documented supported data in calls of the form fit(learner, data), and to then add a declaration of the form
LearnAPI.target(learner, data) = LearnAPI.target(learner, obs(learner, data))
to catch all other forms of supported input data.
Remember to ensure the return value of LearnAPI.target implements the data interface specified by LearnAPI.data_interface(learner).
If implemented, you must include :(LearnAPI.target) in the tuple returned by the LearnAPI.functions trait.
LearnAPI.weights — Function
LearnAPI.weights(learner, data) -> weights
Return, for each form of data supported by the call fit(learner, data), the per-observation weights part of data.
The returned object has the same number of observations as data has and is guaranteed to implement the data interface specified by LearnAPI.data_interface(learner).
Where nothing is returned, weighting is understood to be uniform.
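A meta-algorithm consuming the output of LearnAPI.weights therefore has to handle both cases. The following is a minimal sketch of one way to do that with dispatch; weighted_error_rate is a hypothetical helper, not part of LearnAPI.jl, and the weight vector here is written out by hand where real code would obtain it from LearnAPI.weights(learner, data).

```julia
# Hypothetical helper, not part of LearnAPI.jl: a misclassification rate
# that treats `nothing` as uniform weighting.

# No weights supplied: every observation counts equally.
weighted_error_rate(ŷ, y, ::Nothing) = sum(ŷ .!= y) / length(y)

# Per-observation weights supplied: weight each mistake and normalize.
weighted_error_rate(ŷ, y, w) = sum(w .* (ŷ .!= y)) / sum(w)

ŷ = ["a", "b", "a", "b"]  # predictions
y = ["a", "a", "a", "b"]  # ground truth; only the second observation is wrong

# In real code: w = LearnAPI.weights(learner, data), which may be `nothing`.
uniform_rate = weighted_error_rate(ŷ, y, nothing)
explicit_uniform_rate = weighted_error_rate(ŷ, y, [1, 1, 1, 1])
upweighted_rate = weighted_error_rate(ŷ, y, [1, 3, 1, 1])
```

With nothing and with explicit equal weights the two methods agree, which is the sense in which a nothing return value means uniform weighting. Upweighting the misclassified observation raises the rate.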
Extended help
New implementations
Implementing is optional.
If obs is being implemented, then typically it suffices to overload LearnAPI.weights(learner, observations) where observations = obs(learner, data) and data is any documented supported data in calls of the form fit(learner, data), and to add a declaration of the form
LearnAPI.weights(learner, data) = LearnAPI.weights(learner, obs(learner, data))
to catch all other forms of supported input data.
Ensure the returned object, unless nothing, implements the data interface specified by LearnAPI.data_interface(learner).
If implemented, you must include :(LearnAPI.weights) in the tuple returned by the LearnAPI.functions trait.