public class LASSO extends java.lang.Object implements Regression<double[]>, java.io.Serializable
The Lasso typically yields a sparse solution, in which the parameter vector β has relatively few nonzero coefficients. In contrast, the solution of L2-regularized least squares (i.e. ridge regression) typically has all coefficients nonzero. Because it effectively reduces the number of variables in the model, the Lasso is useful when only a few of the candidate variables are expected to matter or when a simpler model is preferred.
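For reference, the two problems contrasted above are commonly written as follows. This is the textbook form, with X the matrix of explanatory variables, y the response vector, and λ the shrinkage parameter; constant scaling factors on the squared-error term vary between references, and the source does not state which convention this class uses.

```latex
\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1
\qquad
\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
```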
For over-determined systems (more instances than variables, as is common in machine learning), we normalize the variables to mean 0 and standard deviation 1. For under-determined systems (fewer instances than variables, e.g. compressed sensing), we assume white noise (i.e. no intercept in the linear model) and do not perform normalization. Note that the solution is not unique in this case.
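As a rough illustration of the normalization described above, the following sketch rescales each column to mean 0 and standard deviation 1. The class name `StandardizeSketch`, the method `standardize`, and the use of the sample (n - 1) standard deviation are assumptions made for this example; the source does not say which estimator the library uses.

```java
final class StandardizeSketch {
    /**
     * Returns a copy of x in which every column has mean 0 and
     * sample standard deviation 1. Constant columns are mapped to 0.
     * Hypothetical helper, not part of the LASSO class.
     */
    static double[][] standardize(double[][] x) {
        int n = x.length;
        int p = x[0].length;
        double[][] z = new double[n][p];
        for (int j = 0; j < p; j++) {
            // Column mean.
            double mean = 0.0;
            for (int i = 0; i < n; i++) mean += x[i][j];
            mean /= n;
            // Sample standard deviation (divisor n - 1 is an assumption).
            double ss = 0.0;
            for (int i = 0; i < n; i++) {
                double d = x[i][j] - mean;
                ss += d * d;
            }
            double sd = Math.sqrt(ss / (n - 1));
            for (int i = 0; i < n; i++) {
                z[i][j] = sd > 0.0 ? (x[i][j] - mean) / sd : 0.0;
            }
        }
        return z;
    }
}
```

Any new instance passed to a model fitted on standardized data would need the same column means and standard deviations applied to it.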
There is no analytic formula or expression for the optimal solution to the L1-regularized least squares problem, so its solution must be computed numerically. The objective function in L1-regularized least squares is convex but not differentiable, so solving it is more of a computational challenge than solving L2-regularized least squares. The Lasso may be solved using quadratic programming or more general convex optimization methods, as well as by specific algorithms such as the least angle regression algorithm.
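To make the numerical nature of the problem concrete, here is a minimal sketch of one well-known iterative approach, cyclic coordinate descent with soft-thresholding. It is not presented as the solver used by this class (the constructor documentation refers to IPM (Newton) iterations). The class `LassoSketch`, its method names, the fixed number of sweeps, and the objective scaling 0.5 * ||y - Xb||^2 + lambda * ||b||_1 with no intercept are all assumptions of the example.

```java
import java.util.Arrays;

final class LassoSketch {
    /** Soft-thresholding operator: shrinks z toward zero by t. */
    static double softThreshold(double z, double t) {
        if (z > t) return z - t;
        if (z < -t) return z + t;
        return 0.0;
    }

    /**
     * Approximately minimizes 0.5 * ||y - X b||^2 + lambda * ||b||_1
     * by cyclic coordinate descent (no intercept; data assumed centered).
     */
    static double[] fit(double[][] x, double[] y, double lambda, int sweeps) {
        int n = x.length;
        int p = x[0].length;
        double[] beta = new double[p];
        double[] residual = y.clone(); // equals y - X * beta while beta is 0
        for (int s = 0; s < sweeps; s++) {
            for (int j = 0; j < p; j++) {
                double rho = 0.0;   // correlation of column j with the partial residual
                double norm2 = 0.0; // squared norm of column j
                for (int i = 0; i < n; i++) {
                    rho += x[i][j] * (residual[i] + x[i][j] * beta[j]);
                    norm2 += x[i][j] * x[i][j];
                }
                if (norm2 == 0.0) continue; // skip all-zero columns
                double updated = softThreshold(rho, lambda) / norm2;
                double delta = updated - beta[j];
                // Keep the residual in sync with the new coefficient.
                for (int i = 0; i < n; i++) residual[i] -= x[i][j] * delta;
                beta[j] = updated;
            }
        }
        return beta;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {0, 1}, {-1, 0}, {0, -1}};
        double[] y = {2, 0.1, -2, -0.1};
        System.out.println(Arrays.toString(fit(x, y, 1.0, 50)));
    }
}
```

In the toy data of `main`, the second variable is only weakly related to the response, so the soft-thresholding step sets its coefficient exactly to zero. This is the sparsity behavior that distinguishes the Lasso from ridge regression.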
Modifier and Type | Class and Description |
---|---|
static class | LASSO.Trainer: Trainer for LASSO regression. |

Constructor and Description |
---|
LASSO(double[][] x, double[] y, double lambda): Constructor. |
LASSO(double[][] x, double[] y, double lambda, double tol, int maxIter): Constructor. |
LASSO(smile.math.matrix.Matrix x, double[] y, double lambda): Constructor. |
LASSO(smile.math.matrix.Matrix x, double[] y, double lambda, double tol, int maxIter): Constructor. |

Modifier and Type | Method and Description |
---|---|
double | adjustedRSquared(): Returns adjusted R2 statistic. |
double[] | coefficients(): Returns the linear coefficients. |
int | df(): Returns the degree-of-freedom of residual standard error. |
double | error(): Returns the residual standard error. |
double | ftest(): Returns the F-statistic of goodness-of-fit. |
double | intercept(): Returns the intercept. |
double | predict(double[] x): Predicts the dependent variable of an instance. |
double | pvalue(): Returns the p-value of goodness-of-fit test. |
double[] | residuals(): Returns the residuals, that is response minus fitted values. |
double | RSquared(): Returns R2 statistic. |
double | RSS(): Returns the residual sum of squares. |
double | shrinkage(): Returns the shrinkage parameter. |
java.lang.String | toString() |

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Regression: predict
public LASSO(double[][] x, double[] y, double lambda)
x - a matrix containing the explanatory variables. NO NEED to include a constant column of 1s for bias.
y - the response values.
lambda - the shrinkage/regularization parameter.
public LASSO(double[][] x, double[] y, double lambda, double tol, int maxIter)
x - a matrix containing the explanatory variables. NO NEED to include a constant column of 1s for bias.
y - the response values.
lambda - the shrinkage/regularization parameter.
tol - the tolerance for stopping iterations (relative target duality gap).
maxIter - the maximum number of IPM (Newton) iterations.
public LASSO(smile.math.matrix.Matrix x, double[] y, double lambda)
x - a matrix containing the explanatory variables. The variables should be centered and standardized. NO NEED to include a constant column of 1s for bias.
y - the response values.
lambda - the shrinkage/regularization parameter.
public LASSO(smile.math.matrix.Matrix x, double[] y, double lambda, double tol, int maxIter)
x - a matrix containing the explanatory variables. The variables should be centered and standardized. NO NEED to include a constant column of 1s for bias.
y - the response values.
lambda - the shrinkage/regularization parameter.
tol - the tolerance for stopping iterations (relative target duality gap).
maxIter - the maximum number of IPM (Newton) iterations.
public double[] coefficients()
public double intercept()
public double shrinkage()
public double predict(double[] x)
Specified by: predict in interface Regression<double[]>
x - the instance.
public double[] residuals()
public double RSS()
public double error()
public int df()
public double RSquared()
In the case of ordinary least-squares regression, R2 increases as we increase the number of variables in the model (R2 will not decrease). This illustrates a drawback to one possible use of R2, where one might try to include more variables in the model until "there is no more improvement". This leads to the alternative approach of looking at the adjusted R2.
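The relationship between the two statistics can be sketched with the standard textbook definitions, R2 = 1 - RSS/TSS and adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of instances and p the number of variables excluding the intercept. The class `RSquaredSketch` and its methods are hypothetical, and the source does not confirm that the library uses exactly these formulas.

```java
final class RSquaredSketch {
    /** R2 = 1 - RSS / TSS, computed from responses and fitted values. */
    static double rSquared(double[] y, double[] fitted) {
        double mean = 0.0;
        for (double v : y) mean += v;
        mean /= y.length;
        double rss = 0.0; // residual sum of squares
        double tss = 0.0; // total sum of squares around the mean
        for (int i = 0; i < y.length; i++) {
            double r = y[i] - fitted[i];
            double d = y[i] - mean;
            rss += r * r;
            tss += d * d;
        }
        return 1.0 - rss / tss;
    }

    /**
     * Adjusted R2 for n instances and p variables (intercept not counted).
     * Unlike R2, it can decrease when an uninformative variable is added.
     */
    static double adjustedRSquared(double r2, int n, int p) {
        return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1);
    }
}
```

Because the penalty factor (n - 1)/(n - p - 1) grows with p, adding a variable raises the adjusted value only if it reduces the residual sum of squares by enough to offset the lost degree of freedom.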
public double adjustedRSquared()
public double ftest()
public double pvalue()
public java.lang.String toString()
Overrides: toString in class java.lang.Object