PCA (Smile Core 1.0.0 API)

java.lang.Object
- smile.projection.PCA

All Implemented Interfaces:

Projection<double[]>
```
public class PCA
extends Object
implements Projection<double[]>
```
Principal component analysis. PCA is an orthogonal linear transformation that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. PCA is theoretically the optimal transform for given data in the least-squares sense. PCA can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space, PCA supplies the user with a lower-dimensional picture of the data as viewed from its (in some sense) most informative viewpoint.
PCA is mostly used as a tool in exploratory data analysis and for making predictive models. PCA involves the calculation of the eigenvalue decomposition of a data covariance matrix or singular value decomposition of a data matrix, usually after mean centering the data for each attribute. The results of a PCA are usually discussed in terms of component scores and loadings.
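The mean centering and covariance step mentioned above can be sketched in plain Java. This is a minimal illustration, not Smile's implementation: the class name `CovarianceSketch` and its methods are hypothetical, and the sample covariance divisor n - 1 is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Smile API.
final class CovarianceSketch {
    /** Column means of a data matrix in which each row is a sample. */
    static double[] columnMeans(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mu = new double[d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) mu[j] += row[j];
        }
        for (int j = 0; j < d; j++) mu[j] /= n;
        return mu;
    }

    /** Sample covariance matrix of the mean-centered data (assumed divisor: n - 1). */
    static double[][] covariance(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mu = columnMeans(data);
        double[][] cov = new double[d][d];
        for (double[] row : data) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    cov[i][j] += (row[i] - mu[i]) * (row[j] - mu[j]);
                }
            }
        }
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < d; j++) cov[i][j] /= (n - 1);
        }
        return cov;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2}, {3, 6}, {5, 10}};
        System.out.println(Arrays.toString(columnMeans(data)));
        System.out.println(Arrays.deepToString(covariance(data)));
    }
}
```

The eigenvectors of this matrix are the principal directions and its eigenvalues are the component variances; the eigendecomposition itself is left to a linear algebra library.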
As a linear technique, PCA serves several purposes: first, to decorrelate the original variables; second, to compress data, encoding successive principal components with decreasing numerical accuracy; third, to reconstruct the original input data from a reduced number of variables according to a least-squares criterion; and fourth, to identify potential clusters in the data.
In certain applications, PCA can be misleading. PCA is heavily influenced by outliers in the data. In other situations, the linearity of PCA may be an obstacle to successful data reduction and compression.

Author:

Haifeng Li

See Also:
KPCA, PPCA, GHA

Constructor Summary

Constructors
Constructor and Description

PCA(double[][] data)
Constructor.

PCA(double[][] data, boolean cor)
Constructor.

Method Summary

Methods
Modifier and Type	Method and Description
`double[]`	`getCenter()` Returns the center of data.
`double[]`	`getCumulativeVarianceProportion()` Returns the cumulative proportion of variance contained in principal components, ordered from largest to smallest.
`double[][]`	`getLoadings()` Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues.
`double[][]`	`getProjection()` Returns the projection matrix W.
`double[]`	`getVariance()` Returns the principal component variances, ordered from largest to smallest, which are the eigenvalues of the covariance or correlation matrix of learning data.
`double[]`	`getVarianceProportion()` Returns the proportion of variance contained in each principal component, ordered from largest to smallest.
`double[]`	`project(double[] x)` Project a data point to the feature space.
`double[][]`	`project(double[][] x)` Project a set of data to the feature space.
`void`	`setProjection(double p)` Set the projection matrix with top principal components that contain (more than) the given percentage of variance.
`void`	`setProjection(int p)` Set the projection matrix with given number of principal components.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - PCA
```
public PCA(double[][] data)
```
    Constructor. Learn principal component analysis with the covariance matrix.
  - PCA
```
public PCA(double[][] data,
   boolean cor)
```
    Constructor. Learn principal component analysis.
    
    Parameters:
    data - training data, in which each row is a sample. If the sample size is larger than the data dimension and cor = false, SVD is employed for efficiency. Otherwise, eigendecomposition of the covariance or correlation matrix is performed.
    cor - true to use the correlation matrix instead of the covariance matrix.
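    To illustrate what the `cor` flag changes: the correlation matrix is the covariance matrix rescaled by the standard deviations of the variables, so variables measured in different units contribute on a comparable scale. The sketch below shows that rescaling in plain Java; `CorrelationSketch` is a hypothetical helper and is not claimed to mirror Smile's internals.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Smile API.
final class CorrelationSketch {
    /** Converts a covariance matrix to a correlation matrix: cor[i][j] = cov[i][j] / (sd[i] * sd[j]). */
    static double[][] correlation(double[][] cov) {
        int d = cov.length;
        double[] sd = new double[d];
        for (int i = 0; i < d; i++) sd[i] = Math.sqrt(cov[i][i]);

        double[][] cor = new double[d][d];
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < d; j++) {
                cor[i][j] = cov[i][j] / (sd[i] * sd[j]);
            }
        }
        return cor;
    }

    public static void main(String[] args) {
        double[][] cov = {{4, 2}, {2, 9}};
        System.out.println(Arrays.deepToString(correlation(cov)));
    }
}
```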
- Method Detail
  - getCenter
```
public double[] getCenter()
```
    Returns the center of data.
  - getLoadings
```
public double[][] getLoadings()
```
    Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues. The matrix columns contain the eigenvectors.
  - getVariance
```
public double[] getVariance()
```
    Returns the principal component variances, ordered from largest to smallest, which are the eigenvalues of the covariance or correlation matrix of learning data.
  - getVarianceProportion
```
public double[] getVarianceProportion()
```
    Returns the proportion of variance contained in each principal component, ordered from largest to smallest.
  - getCumulativeVarianceProportion
```
public double[] getCumulativeVarianceProportion()
```
    Returns the cumulative proportion of variance contained in principal components, ordered from largest to smallest.
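    The relationship between the variances, their proportions, and the cumulative proportions can be sketched as follows. This is a minimal illustration assuming the variances are already sorted from largest to smallest; `VarianceProportionSketch` is a hypothetical helper, not part of the Smile API.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Smile API.
final class VarianceProportionSketch {
    /** Each variance divided by the total variance. */
    static double[] proportion(double[] variance) {
        double total = 0.0;
        for (double v : variance) total += v;
        double[] prop = new double[variance.length];
        for (int i = 0; i < variance.length; i++) prop[i] = variance[i] / total;
        return prop;
    }

    /** Running sum of the proportions. */
    static double[] cumulative(double[] proportion) {
        double[] cum = new double[proportion.length];
        double sum = 0.0;
        for (int i = 0; i < proportion.length; i++) {
            sum += proportion[i];
            cum[i] = sum;
        }
        return cum;
    }

    public static void main(String[] args) {
        double[] variance = {4, 3, 2, 1}; // assumed sorted, largest first
        double[] prop = proportion(variance);
        System.out.println(Arrays.toString(prop));
        System.out.println(Arrays.toString(cumulative(prop)));
    }
}
```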
  - getProjection
```
public double[][] getProjection()
```
    Returns the projection matrix W. The dimension-reduced data can be obtained by y = W' * x.
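    The product y = W' * x can be written out as a short sketch. Two assumptions are made here that the text above does not state: W is stored with one principal component per column (d rows, p columns), and the data center is subtracted from x before projecting. `ProjectionSketch` is a hypothetical helper, not the Smile implementation.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Smile API.
final class ProjectionSketch {
    /**
     * Computes y = W' * (x - center).
     * Assumes w has d rows and p columns, one component per column.
     */
    static double[] project(double[][] w, double[] center, double[] x) {
        int d = w.length, p = w[0].length;
        double[] y = new double[p];
        for (int k = 0; k < p; k++) {
            for (int i = 0; i < d; i++) {
                y[k] += w[i][k] * (x[i] - center[i]);
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] w = {{1, 0}, {0, 1}, {0, 0}}; // keeps the first two coordinates
        double[] center = {1, 1, 1};
        double[] x = {2, 3, 4};
        System.out.println(Arrays.toString(project(w, center, x)));
    }
}
```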
  - setProjection
```
public void setProjection(int p)
```
    Set the projection matrix with given number of principal components.
    
    Parameters:
    p - choose top p principal components used for projection.
  - setProjection
```
public void setProjection(double p)
```
    Set the projection matrix with top principal components that contain (more than) the given percentage of variance.
    
    Parameters:
    p - the required percentage of variance.
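    One way to pick the number of components for a variance threshold is to take the smallest k whose cumulative proportion reaches the threshold. The sketch below assumes the threshold is given as a fraction in (0, 1] and treats "reaches" as greater than or equal; both are assumptions, and `ComponentSelectionSketch` is a hypothetical helper rather than Smile's code.

```java
// Hypothetical helper for illustration only; not part of the Smile API.
final class ComponentSelectionSketch {
    /**
     * Returns the smallest number of leading components whose cumulative
     * variance proportion is at least p (assumed to be a fraction in (0, 1]).
     */
    static int componentsFor(double[] cumulative, double p) {
        if (p <= 0.0 || p > 1.0) {
            throw new IllegalArgumentException("Invalid fraction of variance: " + p);
        }
        for (int k = 0; k < cumulative.length; k++) {
            if (cumulative[k] >= p) return k + 1;
        }
        // Rounding can leave the last cumulative value just below 1.0.
        return cumulative.length;
    }

    public static void main(String[] args) {
        double[] cumulative = {0.4, 0.7, 0.9, 1.0};
        System.out.println(componentsFor(cumulative, 0.85));
    }
}
```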
  - project
```
public double[] project(double[] x)
```
    Description copied from interface: Projection
    
    Project a data point to the feature space.
    
    Specified by:
    
    project in interface Projection<double[]>
  - project
```
public double[][] project(double[][] x)
```
    Description copied from interface: Projection
    
    Project a set of data to the feature space.
    
    Specified by:
    
    project in interface Projection<double[]>
