LinearModel

java.lang.Object
- smile.regression.LinearModel

All Implemented Interfaces:

java.io.Serializable, java.util.function.ToDoubleFunction<double[]>, DataFrameRegression, OnlineRegression<double[]>, Regression<double[]>
```
public class LinearModel
extends java.lang.Object
implements OnlineRegression<double[]>, DataFrameRegression
```
Linear model. In linear regression, the model specification is that the dependent variable is a linear combination of the parameters (but it need not be linear in the independent variables). The residual is the difference between the observed value of the dependent variable and the value predicted by the model.
Once a regression model has been constructed, it may be important to confirm the goodness of fit of the model and the statistical significance of the estimated parameters. Commonly used checks of goodness of fit include the R-squared, analysis of the pattern of residuals and hypothesis testing. Statistical significance can be checked by an F-test of the overall fit, followed by t-tests of individual parameters.
The interpretation of these diagnostic tests rests heavily on the model assumptions. Although examination of the residuals can be used to invalidate a model, the results of a t-test or F-test are sometimes more difficult to interpret if the model's assumptions are violated. For example, if the error term does not have a normal distribution, then in small samples the estimated parameters will not follow normal distributions, which complicates inference. With relatively large samples, however, a central limit theorem can be invoked so that hypothesis testing may proceed using asymptotic approximations.
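To make the notions of fitted values and residuals concrete, the following is a minimal sketch of ordinary least squares for a single predictor, using the closed-form slope and intercept. The class and method names (`SimpleOlsSketch`, `fit`, `residuals`) are hypothetical and are not part of Smile; this is not how `LinearModel` is fitted internally.

```java
/** Hypothetical helper, not part of Smile: closed-form OLS with one predictor. */
final class SimpleOlsSketch {
    /** Returns {intercept, slope} minimizing the sum of squared residuals. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[] {meanY - slope * meanX, slope};
    }

    /** Residual = observed response minus fitted value. */
    static double[] residuals(double[] x, double[] y, double[] beta) {
        double[] r = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            r[i] = y[i] - (beta[0] + beta[1] * x[i]);
        }
        return r;
    }
}
```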

See Also:

Serialized Form

Method Summary

Modifier and Type	Method and Description
`double`	`adjustedRSquared()` Returns adjusted R² statistic.
`double[]`	`coefficients()` Returns the linear coefficients (without intercept).
`int`	`df()` Returns the degrees of freedom of the residual standard error.
`double`	`error()` Returns the residual standard error.
`double[]`	`fittedValues()` Returns the fitted values.
`smile.data.formula.Formula`	`formula()` Returns the formula associated with the model.
`double`	`ftest()` Returns the F-statistic of goodness-of-fit.
`double`	`intercept()` Returns the intercept.
`double[]`	`predict(smile.data.DataFrame df)` Predicts the dependent variables of a data frame.
`double`	`predict(double[] x)` Predicts the dependent variable of an instance.
`double`	`predict(smile.data.Tuple x)` Predicts the dependent variable of a tuple instance.
`double`	`pvalue()` Returns the p-value of goodness-of-fit test.
`double[]`	`residuals()` Returns the residuals, that is, the response minus the fitted values.
`double`	`RSquared()` Returns R² statistic.
`double`	`RSS()` Returns the residual sum of squares.
`smile.data.type.StructType`	`schema()` Returns the design matrix schema.
`java.lang.String`	`toString()`
`double[][]`	`ttest()` Returns the t-test of the coefficients (including intercept).
`void`	`update(smile.data.DataFrame data)` Updates the regression model online with a new data frame.
`void`	`update(double[] x, double y)` Updates the regression model online with a new training instance.
`void`	`update(double[] x, double y, double lambda)` Updates the model by recursive least squares with a forgetting factor.
`void`	`update(smile.data.Tuple data)` Updates the regression model online with a new training instance.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface smile.regression.OnlineRegression
update

Methods inherited from interface smile.regression.Regression
applyAsDouble, predict

- Method Detail
  - formula
```
public smile.data.formula.Formula formula()
```
    Description copied from interface: DataFrameRegression
    
    Returns the formula associated with the model.
    
    Specified by:
    
    formula in interface DataFrameRegression
  - schema
```
public smile.data.type.StructType schema()
```
    Description copied from interface: DataFrameRegression
    
    Returns the design matrix schema.
    
    Specified by:
    
    schema in interface DataFrameRegression
  - ttest
```
public double[][] ttest()
```
    Returns the t-test of the coefficients (including intercept). Each row corresponds to one coefficient, and the last row is the intercept. The first column is the coefficient estimate, the second column is its standard error, the third column is the t-score for the null hypothesis that the coefficient is zero, and the fourth column is the p-value of that test.
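    The column layout described above can be read as in the following sketch. The helper class `TTestTableSketch` and its members are hypothetical, and the array passed to it stands in for whatever `ttest()` returns; nothing here calls Smile itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Smile: reads a table laid out like ttest(). */
final class TTestTableSketch {
    // Column positions within each row, following the description of ttest().
    static final int ESTIMATE = 0;
    static final int STD_ERROR = 1;
    static final int T_SCORE = 2;
    static final int P_VALUE = 3;

    /** Returns the indices of rows whose p-value is below alpha. The last row is the intercept. */
    static List<Integer> significantRows(double[][] ttest, double alpha) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < ttest.length; i++) {
            if (ttest[i][P_VALUE] < alpha) {
                rows.add(i);
            }
        }
        return rows;
    }

    /** Recomputes the t-score as estimate / standard error (null hypothesis: coefficient is zero). */
    static double tScore(double[] row) {
        return row[ESTIMATE] / row[STD_ERROR];
    }
}
```

    Because the intercept occupies the last row, row `i` for `i < ttest.length - 1` lines up with element `i` of `coefficients()`.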
  - coefficients
```
public double[] coefficients()
```
    Returns the linear coefficients (without intercept).
  - intercept
```
public double intercept()
```
    Returns the intercept.
  - residuals
```
public double[] residuals()
```
    Returns the residuals, that is, the response minus the fitted values.
  - fittedValues
```
public double[] fittedValues()
```
    Returns the fitted values.
  - RSS
```
public double RSS()
```
    Returns the residual sum of squares.
  - error
```
public double error()
```
    Returns the residual standard error.
  - df
```
public int df()
```
    Returns the degrees of freedom of the residual standard error.
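    The quantities returned by `residuals()`, `RSS()`, `error()` and `df()` are related as in the sketch below. It assumes a model with `p` predictors plus an intercept, so that the degrees of freedom are `n - p - 1`; the class `ResidualErrorSketch` is hypothetical and does not reproduce Smile's code.

```java
/** Hypothetical helper, not part of Smile: residuals, RSS and residual standard error. */
final class ResidualErrorSketch {
    /** Residuals are the response minus the fitted values. */
    static double[] residuals(double[] y, double[] fitted) {
        double[] r = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            r[i] = y[i] - fitted[i];
        }
        return r;
    }

    /** Residual sum of squares. */
    static double rss(double[] residuals) {
        double sum = 0.0;
        for (double r : residuals) {
            sum += r * r;
        }
        return sum;
    }

    /** Residual standard error, assuming n observations, p predictors and an intercept. */
    static double standardError(double rss, int n, int p) {
        int df = n - p - 1; // assumed degrees of freedom
        return Math.sqrt(rss / df);
    }
}
```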
  - RSquared
```
public double RSquared()
```
    Returns R² statistic. In regression, the R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An R² of 1.0 indicates that the regression line perfectly fits the data.
    In ordinary least-squares regression, R² never decreases as variables are added to the model. This illustrates a drawback of one possible use of R², where one keeps adding variables until "there is no more improvement". The adjusted R² is an alternative that addresses this.
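    As an illustration, R² can be computed from the response and the fitted values as one minus the ratio of the residual sum of squares to the total sum of squares. This is a sketch of the standard definition; the class `RSquaredSketch` is hypothetical and not taken from Smile.

```java
/** Hypothetical helper, not part of Smile: R² = 1 - RSS / TSS. */
final class RSquaredSketch {
    static double rSquared(double[] y, double[] fitted) {
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;

        double rss = 0.0; // residual sum of squares
        double tss = 0.0; // total sum of squares around the mean
        for (int i = 0; i < y.length; i++) {
            double r = y[i] - fitted[i];
            double d = y[i] - mean;
            rss += r * r;
            tss += d * d;
        }
        return 1.0 - rss / tss;
    }
}
```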
  - adjustedRSquared
```
public double adjustedRSquared()
```
    Returns adjusted R² statistic. The adjusted R² has almost the same interpretation as R², but it penalizes the statistic as extra variables are included in the model.
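    The penalty can be sketched with the usual textbook formula, assuming `n` observations and `p` predictors in addition to an intercept. The class `AdjustedRSquaredSketch` is hypothetical and is not Smile's code.

```java
/** Hypothetical helper, not part of Smile: adjusted R² for p predictors plus an intercept. */
final class AdjustedRSquaredSketch {
    static double adjust(double rSquared, int n, int p) {
        // The factor (n - 1) / (n - p - 1) grows with p, so extra variables are penalized.
        return 1.0 - (1.0 - rSquared) * (n - 1) / (n - p - 1);
    }
}
```

    For a fixed R², adding predictors lowers the adjusted value, so a new variable only helps if it raises R² by enough to offset the penalty.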
  - ftest
```
public double ftest()
```
    Returns the F-statistic of goodness-of-fit.
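    For orientation, the overall F-statistic of a model with an intercept is commonly computed from the total and residual sums of squares as sketched below, assuming `n` observations and `p` predictors. The class `FStatisticSketch` is hypothetical, and Smile's own computation may be organized differently.

```java
/** Hypothetical helper, not part of Smile: overall F-statistic of a linear fit. */
final class FStatisticSketch {
    /**
     * @param tss total sum of squares around the mean of the response
     * @param rss residual sum of squares
     * @param n   number of observations
     * @param p   number of predictors (excluding the intercept)
     */
    static double fStatistic(double tss, double rss, int n, int p) {
        double explainedPerPredictor = (tss - rss) / p;
        double residualPerDf = rss / (n - p - 1);
        return explainedPerPredictor / residualPerDf;
    }
}
```

    The matching p-value is the upper-tail probability of an F distribution with `p` and `n - p - 1` degrees of freedom, which needs a distribution function that the Java standard library does not provide.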
  - pvalue
```
public double pvalue()
```
    Returns the p-value of goodness-of-fit test.
  - predict
```
public double predict(double[] x)
```
    Description copied from interface: Regression
    
    Predicts the dependent variable of an instance.
    
    Specified by:
    
    predict in interface Regression<double[]>
    
    Parameters:
    
    x - an instance.
    
    Returns:
    
    the predicted value of the dependent variable.
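    Conceptually, the prediction of a linear model is the intercept plus the dot product of the coefficients and the instance. The sketch below assumes that `x` holds only the predictor values, in the same order as `coefficients()`; the class `LinearPredictSketch` is hypothetical.

```java
/** Hypothetical helper, not part of Smile: prediction as intercept + coefficients . x. */
final class LinearPredictSketch {
    static double predict(double[] coefficients, double intercept, double[] x) {
        if (x.length != coefficients.length) {
            throw new IllegalArgumentException(
                "Expected " + coefficients.length + " values but got " + x.length);
        }
        double y = intercept;
        for (int i = 0; i < x.length; i++) {
            y += coefficients[i] * x[i];
        }
        return y;
    }
}
```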
  - predict
```
public double predict(smile.data.Tuple x)
```
    Description copied from interface: DataFrameRegression
    
    Predicts the dependent variable of a tuple instance.
    
    Specified by:
    
    predict in interface DataFrameRegression
    
    Parameters:
    
    x - a tuple instance.
    
    Returns:
    
    the predicted value of the dependent variable.
  - predict
```
public double[] predict(smile.data.DataFrame df)
```
    Description copied from interface: DataFrameRegression
    
    Predicts the dependent variables of a data frame.
    
    Specified by:
    
    predict in interface DataFrameRegression
    
    Parameters:
    
    df - the data frame.
    
    Returns:
    
    the predicted values.
  - update
```
public void update(smile.data.Tuple data)
```
    Updates the regression model online with a new training instance.
  - update
```
public void update(smile.data.DataFrame data)
```
    Updates the regression model online with a new data frame.
  - update
```
public void update(double[] x,
                   double y)
```
    Description copied from interface: OnlineRegression
    
    Updates the regression model online with a new training instance. In general, this method may not be thread-safe.
    
    Specified by:
    
    update in interface OnlineRegression<double[]>
    
    Parameters:
    
    x - training instance.
    
    y - response variable.
  - update
```
public void update(double[] x,
                   double y,
                   double lambda)
```
    Recursive least squares. RLS updates an ordinary least squares fit with samples that arrive sequentially. In some adaptive configurations it can be useful not to give equal importance to all the historical data but to assign higher weights to the most recent data (and so gradually forget the oldest). This may happen when the phenomenon underlying the data is non-stationary or when we want to approximate a nonlinear dependence with a linear model that is local in time. Both situations are common in adaptive control problems.
    References
    1. https://www.otexts.org/1582
    Parameters:
    
    x - training instance.
    
    y - response variable.
    
    lambda - The forgetting factor in (0, 1]. Values closer to 1 give longer memory and values closer to 0 give shorter memory.
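    The role of the forgetting factor can be seen in the following sketch of a textbook recursive least squares update. It is an illustration only and is not taken from Smile's implementation: the class `RlsSketch`, its fields, and the initialization constant `delta` are all hypothetical, and the caller is assumed to append a constant 1 to `x` when an intercept is wanted.

```java
/** Hypothetical textbook RLS with a forgetting factor; not Smile's implementation. */
final class RlsSketch {
    private final double[] w;   // current coefficient estimates
    private final double[][] P; // running inverse of the exponentially weighted Gram matrix

    /** Starts from zero coefficients and P = delta * I; a large delta means a weak prior. */
    RlsSketch(int p, double delta) {
        w = new double[p];
        P = new double[p][p];
        for (int i = 0; i < p; i++) {
            P[i][i] = delta;
        }
    }

    void update(double[] x, double y, double lambda) {
        if (lambda <= 0.0 || lambda > 1.0) {
            throw new IllegalArgumentException("lambda must be in (0, 1]: " + lambda);
        }
        int p = w.length;
        double[] Px = new double[p];
        double xPx = 0.0;
        double prediction = 0.0;
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                Px[i] += P[i][j] * x[j];
            }
            xPx += x[i] * Px[i];
            prediction += w[i] * x[i];
        }

        double denom = lambda + xPx;
        double error = y - prediction; // prediction error before the update

        // Gain vector is Px / denom; move the coefficients along it.
        for (int i = 0; i < p; i++) {
            w[i] += Px[i] / denom * error;
        }

        // P stays symmetric, so the rank-one correction is Px * Px^T / denom.
        // Dividing by lambda inflates P, which down-weights older samples.
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                P[i][j] = (P[i][j] - Px[i] * Px[j] / denom) / lambda;
            }
        }
    }

    double[] coefficients() {
        return w.clone();
    }
}
```

    With `lambda = 1` every sample keeps full weight and the recursion matches a (lightly regularized) batch least squares fit; with `lambda < 1` a sample that arrived `k` updates ago is weighted by `lambda^k`.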
  - toString
```
public java.lang.String toString()
```
    Overrides:
    
    toString in class java.lang.Object
