PartLS

A model type for fitting a partitioned least squares model to data. Both an MLJ interface and a native interface are provided.

MLJ Interface

From MLJ, the type can be imported using

PartLS = @load PartLS pkg=PartitionedLS

Construct an instance with default hyper-parameters using the syntax model = PartLS(). Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).
  • y: any vector with Continuous element scitype. Check scitype with scitype(y).

Train the machine using fit!(mach).
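The scitype requirements above can be checked, and if necessary fixed, before building the machine. The following is a minimal sketch, assuming MLJ is installed; the integer vector y_int is a hypothetical example and is not taken from the package.

```julia
using MLJ

# A Float64 matrix already has Continuous element scitype
X = [1.0 2.0 3.0;
     3.0 3.0 4.0;
     8.0 1.0 3.0;
     5.0 3.0 1.0]
scitype(X)          # AbstractMatrix{Continuous}

# Hypothetical integer targets are interpreted as Count, not Continuous
y_int = [1, 1, 2, 3]
scitype(y_int)      # AbstractVector{Count}

# coerce converts the elements to floating point, giving Continuous scitype
y = coerce(y_int, Continuous)
scitype(y)          # AbstractVector{Continuous}
```

After coercion, y can be passed to machine(model, X, y) as described above.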

Hyper-parameters

  • Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).

  • P: the partition matrix. It is a binary matrix where each row corresponds to a feature and each column corresponds to a partition. The element P_{i, k} = 1 if feature i belongs to partition k, and 0 otherwise.

  • η: the regularization parameter. It controls the strength of the regularization.

  • ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.

  • T: the maximum number of iterations. The Alt optimization algorithm stops after T iterations even if it has not converged. Only used by the Alt algorithm.

  • rng: the random number generator to use.

    • If nothing, the global random number generator is used.
    • If an integer, the global random number generator is used after seeding it with the given integer.
    • If an object of type AbstractRNG, the given random number generator is used.
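As an illustration of the P hyper-parameter, the sketch below builds a partition matrix from a vector that assigns each feature to a partition. It follows the layout used in the Examples section (one row per feature, one column per partition). The helper partition_matrix is hypothetical and is not part of PartitionedLS.jl.

```julia
# Hypothetical helper (not part of PartitionedLS.jl): build the binary
# partition matrix from a vector giving the partition index of each feature.
function partition_matrix(assignment::AbstractVector{<:Integer}, K::Integer)
    M = length(assignment)          # number of features
    P = zeros(Int, M, K)            # one row per feature, one column per partition
    for (i, k) in enumerate(assignment)
        P[i, k] = 1                 # feature i belongs to partition k
    end
    return P
end

# Features 1 and 2 go to partition 1, feature 3 goes to partition 2
P = partition_matrix([1, 1, 2], 2)
```

Each row of the resulting matrix contains exactly one 1, since every feature belongs to exactly one partition.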

Operations

  • predict(mach, Xnew): return the predictions of the model on new data Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • α: the values of the α variables. For each partition k, the α values of the features in that partition sum to one: $\sum_{i \in P_k} \alpha_i = 1$.
  • β: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.
  • t: the intercept term of the model.
  • P: the partition matrix. It is a binary matrix where each row corresponds to a feature and each column corresponds to a partition. The element P_{i, k} = 1 if feature i belongs to partition k, and 0 otherwise.
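To show how these parameters fit together, the sketch below combines them into predictions by hand. It assumes the usual partitioned least squares form, in which each feature is weighted by its α, the weighted features are summed within each partition, and each partition sum is multiplied by its β before the intercept t is added. The numeric values of α, β and t are made up for illustration and are not the result of any fit.

```julia
# Two samples, three features
X = [1.0 2.0 3.0;
     3.0 3.0 4.0]

# Features 1 and 2 in partition 1, feature 3 in partition 2
P = [1 0;
     1 0;
     0 1]

# Hypothetical parameter values, chosen only for illustration
α = [0.25, 0.75, 1.0]   # sums to 1 within each partition
β = [2.0, -1.0]         # one coefficient per partition
t = 0.5                 # intercept

# The α values of each partition sum to one
@assert P' * α == [1.0, 1.0]

# P .* α scales row i of P by α[i]; X * (P .* α) gives one column per partition
y_hat = X * (P .* α) * β .+ t   # [1.0, 2.5]
```

In practice predict(mach, Xnew) performs the prediction, so a manual computation like this is mainly useful for interpreting the fitted values.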

Examples

PartLS = @load PartLS pkg=PartitionedLS

X = [[1. 2. 3.];
     [3. 3. 4.];
     [8. 1. 3.];
     [5. 3. 1.]]

y = [1.;
     1.;
     2.;
     3.]

P = [[1 0];
     [1 0];
     [0 1]]


model = PartLS(P=P)
mach = machine(model, X, y) |> fit!

## predictions on the training set:
predict(mach, X)

Native Interface

using PartitionedLS

X = [[1. 2. 3.];
     [3. 3. 4.];
     [8. 1. 3.];
     [5. 3. 1.]]

y = [1.;
     1.;
     2.;
     3.]

P = [[1 0];
     [1 0];
     [0 1]]


## fit using the optimal algorithm
result = fit(Opt, X, y, P, η = 0.0)
y_hat = predict(result.model, X)

For other fit keyword options, refer to the "Hyper-parameters" section for the MLJ interface.
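As a sketch of what those keyword options might look like for another optimizer, the call below uses Alt with the ϵ and T hyper-parameters. It assumes that the native fit accepts these keywords under the same names as in the "Hyper-parameters" section and that the result exposes a model field as in the Opt example. The values 1e-6 and 100 are arbitrary illustrative choices.

```julia
using PartitionedLS

X = [1.0 2.0 3.0;
     3.0 3.0 4.0;
     8.0 1.0 3.0;
     5.0 3.0 1.0]

y = [1.0, 1.0, 2.0, 3.0]

P = [1 0;
     1 0;
     0 1]

# Assumption: Alt takes the tolerance ϵ and the iteration limit T as keywords
result = fit(Alt, X, y, P, η = 0.0, ϵ = 1e-6, T = 100)
y_hat = predict(result.model, X)
```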