Class KernelPCA

java.lang.Object
smile.feature.extraction.Projection
smile.feature.extraction.KernelPCA
All Implemented Interfaces:
Serializable, Function<smile.data.Tuple,smile.data.Tuple>, smile.data.transform.Transform

public class KernelPCA extends Projection
Kernel PCA transform. Kernel PCA is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are done in a reproducing kernel Hilbert space with a non-linear mapping.

In practice, a large data set leads to a large kernel (Gram) matrix K, and storing K may become a problem. One way to deal with this is to cluster the data set and build the kernel matrix from the cluster means. Since even this may yield a relatively large K, it is common to compute only the top P eigenvalues and eigenvectors of K.
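
To make the storage concern concrete, the sketch below builds a full Gram matrix in plain Java and estimates only its leading eigenvalue by power iteration. It is an illustration under stated assumptions, not Smile's implementation: the class GramMatrixSketch, its method names, and the Gaussian kernel with bandwidth sigma are hypothetical choices, and the centering step of kernel PCA is omitted for brevity.

```java
import java.util.Arrays;

// Illustrative sketch only; plain Java, not Smile's implementation.
class GramMatrixSketch {
    /** Gaussian (RBF) kernel; sigma is an assumed bandwidth parameter. */
    static double gaussian(double[] a, double[] b, double sigma) {
        double sq = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sq += d * d;
        }
        return Math.exp(-sq / (2.0 * sigma * sigma));
    }

    /** Builds the full n-by-n Gram matrix, which needs O(n^2) memory. */
    static double[][] gram(double[][] x, double sigma) {
        int n = x.length;
        double[][] k = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                // The matrix is symmetric, so each pair is evaluated once.
                k[i][j] = k[j][i] = gaussian(x[i], x[j], sigma);
            }
        }
        return k;
    }

    /** Estimates the largest eigenvalue of a symmetric positive semi-definite matrix. */
    static double topEigenvalue(double[][] k, int iterations) {
        int n = k.length;
        double[] v = new double[n];
        Arrays.fill(v, 1.0 / Math.sqrt(n)); // unit-length starting vector
        double lambda = 0.0;
        for (int it = 0; it < iterations; it++) {
            double[] w = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    w[i] += k[i][j] * v[j];
                }
            }
            double norm = 0.0;
            for (double wi : w) {
                norm += wi * wi;
            }
            norm = Math.sqrt(norm);
            for (int i = 0; i < n; i++) {
                v[i] = w[i] / norm;
            }
            lambda = norm; // ||K v|| for unit v approaches the top eigenvalue
        }
        return lambda;
    }

    public static void main(String[] args) {
        double[][] x = {{0, 0}, {1, 0}, {0, 1}};
        double[][] k = gram(x, 1.0);
        System.out.println("Gram matrix entries: " + (long) k.length * k.length);
        System.out.println("Leading eigenvalue estimate: " + topEigenvalue(k, 200));
    }
}
```

The number of stored entries grows quadratically with the number of samples, which is why replacing the samples with a smaller set of cluster means, and extracting only a few leading eigenpairs, keeps the computation manageable.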

Kernel PCA with an isotropic kernel function is closely related to metric MDS. Carrying out metric MDS on the kernel matrix K produces a configuration of points equivalent to the one obtained from the distances (2(1 - K(x_i, x_j)))^(1/2) computed in feature space.
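
The feature-space distance comes from expanding the squared norm of the difference of two mapped points: K(x_i, x_i) + K(x_j, x_j) - 2 K(x_i, x_j), which reduces to 2(1 - K(x_i, x_j)) whenever K(x, x) = 1, as is the case for the Gaussian kernel. The following is a minimal plain-Java sketch of that identity; the class KernelDistanceSketch, its method names, and the choice of a Gaussian kernel with bandwidth sigma are assumptions made for illustration and are not part of Smile.

```java
// Illustrative sketch only; assumes a kernel normalized so that K(x, x) = 1.
class KernelDistanceSketch {
    /** Gaussian (RBF) kernel; sigma is an assumed bandwidth parameter. */
    static double gaussian(double[] a, double[] b, double sigma) {
        double sq = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sq += d * d;
        }
        return Math.exp(-sq / (2.0 * sigma * sigma));
    }

    /** Distance in feature space computed from a single kernel value K(x_i, x_j). */
    static double featureDistance(double kij) {
        // Clamp at zero to guard against tiny negative values caused by rounding.
        return Math.sqrt(Math.max(0.0, 2.0 * (1.0 - kij)));
    }

    public static void main(String[] args) {
        double[] xi = {0.0, 0.0};
        double[] xj = {1.0, 2.0};
        double kij = gaussian(xi, xj, 1.0);
        System.out.println("K(xi, xj) = " + kij);
        System.out.println("feature-space distance = " + featureDistance(kij));
    }
}
```

With such a kernel, identical points are at distance 0 and points whose kernel value approaches 0 are at a distance approaching the square root of 2.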

Kernel PCA also has close connections with Isomap, LLE, and Laplacian eigenmaps.

References

  1. Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 1998.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    final KPCA<double[]>
    kpca
    Kernel PCA.

    Fields inherited from class smile.feature.extraction.Projection

    columns, projection, schema
  • Constructor Summary

    Constructors
    Constructor
    Description
    KernelPCA(KPCA<double[]> kpca, String... columns)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    double[]
    apply(double[] x)
    Project a data point to the feature space.
    static KernelPCA
    fit(smile.data.DataFrame data, smile.math.kernel.MercerKernel<double[]> kernel, int k, double threshold, String... columns)
    Fits kernel principal component analysis.
    static KernelPCA
    fit(smile.data.DataFrame data, smile.math.kernel.MercerKernel<double[]> kernel, int k, String... columns)
    Fits kernel principal component analysis.

    Methods inherited from class smile.feature.extraction.Projection

    apply, apply, apply, postprocess, preprocess

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface java.util.function.Function

    andThen, compose

    Methods inherited from interface smile.data.transform.Transform

    andThen, compose
  • Field Details

    • kpca

      public final KPCA<double[]> kpca
      Kernel PCA.
  • Constructor Details

    • KernelPCA

      public KernelPCA(KPCA<double[]> kpca, String... columns)
      Constructor.
      Parameters:
      kpca - kernel PCA object.
      columns - the columns to fit kernel PCA. If empty, all columns will be used.
  • Method Details

    • fit

      public static KernelPCA fit(smile.data.DataFrame data, smile.math.kernel.MercerKernel<double[]> kernel, int k, String... columns)
      Fits kernel principal component analysis.
      Parameters:
      data - training data.
      kernel - Mercer kernel.
      k - choose up to k principal components (those with eigenvalues larger than 0.0001) used for projection.
      columns - the columns to fit kernel PCA. If empty, all columns will be used.
      Returns:
      the model.
    • fit

      public static KernelPCA fit(smile.data.DataFrame data, smile.math.kernel.MercerKernel<double[]> kernel, int k, double threshold, String... columns)
      Fits kernel principal component analysis.
      Parameters:
      data - training data.
      kernel - Mercer kernel.
      k - choose top k principal components used for projection.
      threshold - only principal components with eigenvalues larger than the given threshold will be kept.
      columns - the columns to fit kernel PCA. If empty, all columns will be used.
      Returns:
      the model.
    • apply

      public double[] apply(double[] x)
      Description copied from class: Projection
      Project a data point to the feature space.
      Overrides:
      apply in class Projection
      Parameters:
      x - the data point.
      Returns:
      the projection in the feature space.
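
The k and threshold parameters of the fit overloads above jointly limit which components are kept. The plain-Java sketch below shows one plausible reading of those parameter descriptions: sort the eigenvalues, drop those not larger than the threshold, and keep at most k of the rest. The class ComponentSelectionSketch and its select method are hypothetical, and the source does not state whether Smile applies the threshold to raw or normalized eigenvalues, so treat this as an assumption rather than the library's behavior.

```java
import java.util.Arrays;

// Illustrative sketch only; not Smile's implementation.
class ComponentSelectionSketch {
    /** Returns the eigenvalues that would be kept, largest first. */
    static double[] select(double[] eigenvalues, int k, double threshold) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be positive: " + k);
        }
        double[] sorted = eigenvalues.clone();
        Arrays.sort(sorted); // ascending order, so walk it from the end
        double[] kept = new double[Math.min(k, sorted.length)];
        int count = 0;
        for (int i = sorted.length - 1; i >= 0 && count < kept.length; i--) {
            if (sorted[i] <= threshold) {
                break; // all remaining eigenvalues are at or below the threshold
            }
            kept[count++] = sorted[i];
        }
        return Arrays.copyOf(kept, count);
    }

    public static void main(String[] args) {
        double[] eigenvalues = {3.0, 1.0, 0.00005, 0.5};
        // 0.0001 mirrors the value mentioned for the overload without a threshold argument.
        System.out.println(Arrays.toString(select(eigenvalues, 3, 0.0001)));
    }
}
```

Under this reading, requesting more components than pass the threshold simply returns fewer than k components, which matches the "up to k" wording in the parameter description.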