Available models

Help extending these lists is welcome. Note the current limitations:

  • The models are built and tested assuming n > p; when this does not hold, tricks can be used to speed up computations, but these have not been implemented yet.
  • CV-aware code is not implemented yet (i.e. code that reuses computations when fitting over a grid of hyperparameters); "meta" functionalities such as One-vs-All or Cross-Validation are left to other packages such as MLJ.
  • No support yet for sparse matrices.
  • Stochastic solvers have not yet been implemented.
  • All computations are assumed to be done in Float64.

Regression models

| Regressors | Formulation¹ | Available solvers | Comments |
| :--- | :--- | :--- | :--- |
| OLS & Ridge | L2Loss + 0/L2 | Analytical² or CG³ | |
| Lasso & Elastic-Net | L2Loss + 0/L2 + L1 | (F)ISTA⁴ | |
| Robust 0/L2 | RobustLoss⁵ + 0/L2 | Newton, NewtonCG, LBFGS, IWLS-CG⁶ | no scale⁷ |
| Robust L1/EN | RobustLoss + 0/L2 + L1 | (F)ISTA | |
| Quantile⁸ + 0/L2 | RobustLoss + 0/L2 | LBFGS, IWLS-CG | |
| Quantile L1/EN | RobustLoss + 0/L2 + L1 | (F)ISTA | |
  1. "0" stands for no penalty
  2. Analytical means the solution is computed in "one shot" using the \ solver.
  3. CG = conjugate gradient
  4. (Accelerated) Proximal Gradient Descent
  5. Huber, Andrews, Bisquare, Logistic, Fair and Talwar weighting functions are available.
  6. Iteratively Re-weighted Least Squares, where each linear system is solved iteratively via CG.
  7. In other packages such as Scikit-Learn, a scale factor is estimated along with the parameters. This is somewhat ad hoc, corresponds more to a statistical perspective, and does not work well with penalties; we recommend using cross-validation to set the parameter of the Huber loss instead.
  8. Includes as special case the least absolute deviation (LAD) regression when δ=0.5.
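To make footnote 4 concrete, here is a minimal, non-accelerated ISTA iteration for the Lasso objective 0.5‖Xθ − y‖² + λ‖θ‖₁, written as a generic NumPy sketch (all names are illustrative; this is not the package's implementation):

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ‖.‖₁ (elementwise soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(X, y, lam, step=None, iters=500):
    # minimise 0.5‖Xθ - y‖² + lam‖θ‖₁ via proximal gradient descent
    n, p = X.shape
    if step is None:
        # fixed step 1/L, where L = ‖X‖₂² is the Lipschitz constant
        # of the gradient of the smooth part
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    theta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y)   # gradient of the smooth part
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta
```

The accelerated variant (FISTA) applies the same proximal step at an extrapolated point; the soft-thresholding function above is exactly the proximal operator of the L1 penalty.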

Classification models

| Classifiers | Formulation | Available solvers | Comments |
| :--- | :--- | :--- | :--- |
| Logistic 0/L2 | LogisticLoss + 0/L2 | Newton, Newton-CG, LBFGS | yᵢ ∈ {±1} |
| Logistic L1/EN | LogisticLoss + 0/L2 + L1 | (F)ISTA | yᵢ ∈ {±1} |
| Multinomial 0/L2 | MultinomialLoss + 0/L2 | Newton-CG, LBFGS | yᵢ ∈ {1, ..., c} |
| Multinomial L1/EN | MultinomialLoss + 0/L2 + L1 | ISTA, FISTA | yᵢ ∈ {1, ..., c} |
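As a sanity check on the conventions in the table above, the logistic case with yᵢ ∈ {±1} corresponds to the objective L(θ) = Σᵢ log(1 + exp(−yᵢ xᵢᵀθ)) + (λ/2)‖θ‖². A hedged NumPy sketch of this loss and its gradient (illustrative code, not the package's implementation):

```python
import numpy as np

def logistic_objective(theta, X, y, lam):
    # Σ log(1 + exp(-yᵢ xᵢ'θ)) + (λ/2)‖θ‖², with labels yᵢ ∈ {-1, +1}
    margins = y * (X @ theta)
    return np.logaddexp(0.0, -margins).sum() + 0.5 * lam * theta @ theta

def logistic_gradient(theta, X, y, lam):
    # ∇L(θ) = -Σ yᵢ σ(-yᵢ xᵢ'θ) xᵢ + λθ, where σ is the sigmoid
    s = 1.0 / (1.0 + np.exp(y * (X @ theta)))
    return -X.T @ (y * s) + lam * theta
```

A finite-difference check of the gradient against the objective is an easy way to validate such sign and label conventions.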

Unless otherwise specified:

  • Newton-like solvers use Hager-Zhang line search (default in Optim.jl)
  • ISTA, FISTA solvers use backtracking line search and a shrinkage factor of β=0.8
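The backtracking rule used with ISTA/FISTA can be sketched as follows: try a step, and shrink it by the factor β until the standard sufficient-decrease condition for a proximal-gradient step holds. This is a generic sketch under that assumption (names are illustrative; the package's exact condition may differ):

```python
import numpy as np

def backtracking_prox_step(theta, f, grad_f, prox, t0=1.0, beta=0.8):
    """One proximal-gradient step with backtracking line search.

    The step size t is shrunk by beta until
        f(z) <= f(θ) + ∇f(θ)·(z - θ) + ‖z - θ‖² / (2t),
    where z = prox(θ - t ∇f(θ), t).
    """
    g = grad_f(theta)
    f0 = f(theta)
    t = t0
    while True:
        z = prox(theta - t * g, t)
        d = z - theta
        if f(z) <= f0 + g @ d + (d @ d) / (2.0 * t):
            return z, t
        t *= beta   # shrinkage factor, β = 0.8 as above
```

With f the smooth part of the objective and prox the proximal operator of the penalty (the identity map when there is no L1 term), iterating this step yields ISTA; FISTA applies the same step at an extrapolated point.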

Note: these models were all tested for correctness whenever a direct comparison with another package was possible, usually by comparing the objective function values at the returned coefficients (cf. the tests):

  • (against scikit-learn): Lasso, Elastic-Net, Logistic (L1/L2/EN), Multinomial (L1/L2/EN)
  • (against quantreg): Quantile (0/L1)

Systematic timing benchmarks have not been run yet, but they are planned (see this issue).