public class LDA extends java.lang.Object implements SoftClassifier<double[]>
LDA is closely related to ANOVA (analysis of variance) and linear regression analysis, which also attempt to express one dependent variable as a linear combination of other features or measurements. In the other two methods, however, the dependent variable is a numerical quantity, while for LDA it is a categorical variable (i.e. the class label). Logistic regression and probit regression are more similar to LDA, as they also explain a categorical variable. These other methods are preferable in applications where it is not reasonable to assume that the independent variables are normally distributed, which is a fundamental assumption of the LDA method.
One complication in applying LDA (and Fisher's discriminant) to real data occurs when the number of variables/features exceeds the number of samples. In this case, the covariance estimates do not have full rank and so cannot be inverted. This is known as the small sample size problem.
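The rank deficiency behind the small sample size problem can be checked with a few lines of plain Java. The sketch below is illustrative only and does not use Smile; the class name `SmallSampleSizeDemo` and its helper methods are hypothetical. It builds the sample covariance matrix of 3 samples with 4 features and computes its numerical rank by Gaussian elimination.

```java
public class SmallSampleSizeDemo {
    /** Sample covariance matrix (divides by n - 1) of the rows of x. */
    public static double[][] covariance(double[][] x) {
        int n = x.length, p = x[0].length;
        double[] mean = new double[p];
        for (double[] row : x)
            for (int j = 0; j < p; j++) mean[j] += row[j];
        for (int j = 0; j < p; j++) mean[j] /= n;

        double[][] cov = new double[p][p];
        for (double[] row : x)
            for (int j = 0; j < p; j++)
                for (int k = 0; k < p; k++)
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
        return cov;
    }

    /** Numerical rank by Gaussian elimination with partial pivoting. */
    public static int rank(double[][] a, double eps) {
        int rows = a.length, cols = a[0].length, rank = 0;
        double[][] m = new double[rows][];
        for (int i = 0; i < rows; i++) m[i] = a[i].clone(); // keep the input intact

        for (int c = 0; c < cols && rank < rows; c++) {
            // Pick the largest remaining entry in column c as the pivot.
            int pivot = rank;
            for (int r = rank + 1; r < rows; r++)
                if (Math.abs(m[r][c]) > Math.abs(m[pivot][c])) pivot = r;
            if (Math.abs(m[pivot][c]) <= eps) continue; // no usable pivot in this column

            double[] tmp = m[pivot]; m[pivot] = m[rank]; m[rank] = tmp;
            for (int r = rank + 1; r < rows; r++) {
                double f = m[r][c] / m[rank][c];
                for (int k = c; k < cols; k++) m[r][k] -= f * m[rank][k];
            }
            rank++;
        }
        return rank;
    }

    public static void main(String[] args) {
        // 3 samples, 4 features: more features than samples.
        double[][] x = {
            {1.0, 2.0, 0.5, 3.0},
            {2.0, 0.0, 1.5, 1.0},
            {0.0, 1.0, 2.5, 2.0}
        };
        // The 4 x 4 covariance estimate has rank 2 (at most n - 1), so it is singular.
        System.out.println(rank(covariance(x), 1e-9)); // prints 2
    }
}
```

With n samples the centered data span at most n - 1 dimensions, so a p x p covariance estimate with p > n - 1 cannot have full rank.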
See Also: FLD, QDA, RDA, NaiveBayes, Serialized Form
Constructor and Description |
---|
LDA(double[] priori, double[][] mu, double[] eigen, smile.math.matrix.DenseMatrix scaling): Constructor. |
LDA(double[] priori, double[][] mu, double[] eigen, smile.math.matrix.DenseMatrix scaling, smile.util.IntSet labels): Constructor. |
Modifier and Type | Method and Description |
---|---|
static LDA | fit(double[][] x, int[] y): Learns linear discriminant analysis. |
static LDA | fit(double[][] x, int[] y, double[] priori, double tol): Learns linear discriminant analysis. |
static LDA | fit(double[][] x, int[] y, java.util.Properties prop): Learns linear discriminant analysis. |
static LDA | fit(smile.data.formula.Formula formula, smile.data.DataFrame data): Learns linear discriminant analysis. |
static LDA | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop): Learns linear discriminant analysis. |
int | predict(double[] x): Predicts the class label of an instance. |
int | predict(double[] x, double[] posteriori): Predicts the class label of an instance and also calculates the a posteriori probabilities. |
double[] | priori(): Returns a priori probabilities. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Classifier: applyAsDouble, applyAsInt, f, predict
public LDA(double[] priori, double[][] mu, double[] eigen, smile.math.matrix.DenseMatrix scaling)
priori - a priori probabilities of each class.
mu - the mean vectors of each class.
eigen - the eigenvalues of the common covariance matrix.
scaling - the eigenvectors of the common covariance matrix.
public LDA(double[] priori, double[][] mu, double[] eigen, smile.math.matrix.DenseMatrix scaling, smile.util.IntSet labels)
priori - a priori probabilities of each class.
mu - the mean vectors of each class.
eigen - the eigenvalues of the common covariance matrix.
scaling - the eigenvectors of the common covariance matrix.
labels - the class labels.
public static LDA fit(smile.data.formula.Formula formula, smile.data.DataFrame data)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
public static LDA fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
public static LDA fit(double[][] x, int[] y)
x - training samples.
y - training labels in [0, k), where k is the number of classes.
public static LDA fit(double[][] x, int[] y, java.util.Properties prop)
x - training samples.
y - training labels.
public static LDA fit(double[][] x, int[] y, double[] priori, double tol)
x - training samples.
y - training labels.
priori - the a priori probability of each class. If null, it is estimated from the training data.
tol - a tolerance to decide if a covariance matrix is singular; variables whose variance is less than tol^2 are rejected.
public double[] priori()
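When the priori argument of fit is null, the a priori probabilities are estimated from the training data, and priori() returns the values the model holds. A natural estimate is the relative frequency of each class label. The sketch below assumes that estimator and labels in [0, k); it is not taken from the Smile source, and the class `PriorEstimateDemo` and method `estimatePriori` are hypothetical names.

```java
import java.util.Arrays;

public class PriorEstimateDemo {
    /**
     * Relative class frequencies for labels in [0, k).
     * Assumption: this is one plausible way to estimate priors from data.
     */
    public static double[] estimatePriori(int[] y, int k) {
        double[] priori = new double[k];
        for (int label : y) priori[label] += 1.0;        // count each class
        for (int i = 0; i < k; i++) priori[i] /= y.length; // normalize to sum to 1
        return priori;
    }

    public static void main(String[] args) {
        int[] y = {0, 0, 1, 1, 1, 2, 0, 1}; // 3 of class 0, 4 of class 1, 1 of class 2
        System.out.println(Arrays.toString(estimatePriori(y, 3)));
        // prints [0.375, 0.5, 0.125]
    }
}
```

Passing an explicit priori array instead is useful when the class proportions in the training set do not reflect those expected at prediction time.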
public int predict(double[] x)
Predicts the class label of an instance.
Specified by: predict in interface Classifier<double[]>
x - the instance to be classified.
public int predict(double[] x, double[] posteriori)
Predicts the class label of an instance and also calculates the a posteriori probabilities.
Specified by: predict in interface SoftClassifier<double[]>
x - an instance to be classified.
posteriori - the array to store a posteriori probabilities on output.
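To make the roles of priori, mu, eigen, and scaling concrete, the following plain-Java sketch shows one way a posteriori probabilities can be computed from them. It assumes that the columns of scaling are the eigenvectors of the common covariance matrix and that eigen holds the matching eigenvalues, as the constructor parameters describe. It uses a plain double[][] in place of DenseMatrix, returns a class index rather than a mapped label, and is not Smile's implementation; `LdaPosteriorSketch` is a hypothetical name.

```java
import java.util.Arrays;

public class LdaPosteriorSketch {
    /**
     * Scores each class with log(prior) - 0.5 * Mahalanobis distance to the class mean,
     * then normalizes the scores into probabilities. Returns the index of the best class.
     */
    public static int predict(double[] priori, double[][] mu, double[] eigen,
                              double[][] scaling, double[] x, double[] posteriori) {
        int k = priori.length, p = x.length;
        double max = Double.NEGATIVE_INFINITY;
        int best = -1;

        for (int i = 0; i < k; i++) {
            double f = 0.0;
            for (int j = 0; j < p; j++) {
                // Project (x - mu_i) onto the j-th eigenvector (column j of scaling).
                double u = 0.0;
                for (int l = 0; l < p; l++) u += scaling[l][j] * (x[l] - mu[i][l]);
                f += u * u / eigen[j];
            }
            posteriori[i] = Math.log(priori[i]) - 0.5 * f;
            if (posteriori[i] > max) { max = posteriori[i]; best = i; }
        }

        // Softmax over the log scores; subtracting max avoids overflow.
        double sum = 0.0;
        for (int i = 0; i < k; i++) {
            posteriori[i] = Math.exp(posteriori[i] - max);
            sum += posteriori[i];
        }
        for (int i = 0; i < k; i++) posteriori[i] /= sum;
        return best;
    }

    public static void main(String[] args) {
        double[] priori = {0.5, 0.5};
        double[][] mu = {{0.0, 0.0}, {2.0, 0.0}};
        double[] eigen = {1.0, 1.0};              // identity covariance
        double[][] scaling = {{1.0, 0.0}, {0.0, 1.0}};
        double[] posteriori = new double[2];

        int label = predict(priori, mu, eigen, scaling, new double[]{0.5, 0.0}, posteriori);
        System.out.println(label + " " + Arrays.toString(posteriori));
    }
}
```

In this toy setup the point (0.5, 0) lies closer to the first class mean, so class 0 wins with probability 1 / (1 + e^-1), roughly 0.73. With Smile itself, the corresponding call is predict(double[] x, double[] posteriori) on a fitted model, with posteriori allocated to the number of classes.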