Class ProbabilisticPCA

java.lang.Object
smile.feature.extraction.Projection
smile.feature.extraction.ProbabilisticPCA
All Implemented Interfaces:
Serializable, Function<smile.data.Tuple,smile.data.Tuple>, smile.data.transform.Transform

public class ProbabilisticPCA extends Projection
Probabilistic principal component analysis. Probabilistic PCA is a simplified form of factor analysis that uses a latent variable model with a linear relationship:
     y = W * x + μ + ε
 
where the latent variables x ∼ N(0, I), the error (or noise) ε ∼ N(0, Ψ), and μ is the location term (mean). Probabilistic PCA uses an isotropic noise model, i.e., the noise variances are constrained to be equal (Ψ = σ²I). A closed-form estimate of these parameters can be obtained by maximum likelihood.
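
As a rough illustration of the closed-form estimate, the sketch below follows the maximum likelihood solution given by Tipping and Bishop (1999): the noise variance is the average of the discarded eigenvalues of the sample covariance matrix, and the j-th retained principal axis is scaled by sqrt(λj − σ²). The names PpcaClosedForm, noiseVariance, and loadingScales are hypothetical and are not part of Smile. The eigenvalues are assumed to be sorted in descending order and k is assumed to be smaller than the data dimension. This is not necessarily how Smile computes its estimates.

```java
// Hypothetical helper, not part of Smile: closed-form PPCA estimates
// computed from the eigenvalues of the sample covariance matrix.
class PpcaClosedForm {
    /** Average of the d - k smallest eigenvalues (input sorted in descending order). */
    public static double noiseVariance(double[] eigenvalues, int k) {
        int d = eigenvalues.length;
        if (k < 1 || k >= d) {
            throw new IllegalArgumentException("k must satisfy 1 <= k < d");
        }
        double sum = 0.0;
        for (int j = k; j < d; j++) {
            sum += eigenvalues[j];
        }
        return sum / (d - k);
    }

    /** Scale sqrt(lambda_j - sigma^2) applied to the j-th retained eigenvector. */
    public static double[] loadingScales(double[] eigenvalues, int k) {
        double sigma2 = noiseVariance(eigenvalues, k);
        double[] scales = new double[k];
        for (int j = 0; j < k; j++) {
            // Clamp at zero to guard against rounding error.
            scales[j] = Math.sqrt(Math.max(eigenvalues[j] - sigma2, 0.0));
        }
        return scales;
    }

    public static void main(String[] args) {
        double[] eigenvalues = {5.0, 3.0, 1.0, 0.5, 0.5};
        System.out.println("noise variance: " + noiseVariance(eigenvalues, 2));
        System.out.println("scales: " + java.util.Arrays.toString(loadingScales(eigenvalues, 2)));
    }
}
```

The full loading matrix W would be obtained by multiplying each of the k leading eigenvectors by the corresponding scale; W is determined only up to an arbitrary rotation of the latent space.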

References

  1. Michael E. Tipping and Christopher M. Bishop. Probabilistic Principal Component Analysis. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 61(3):611-622, 1999.
  • Field Summary

    Fields inherited from class smile.feature.extraction.Projection

    columns, projection, schema
  • Constructor Summary

    Constructors
    Constructor
    Description
    ProbabilisticPCA(double noise, double[] mu, smile.math.matrix.Matrix loading, smile.math.matrix.Matrix projection, String... columns)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    double[]
    center()
    Returns the center of data.
    static ProbabilisticPCA
    fit(double[][] data, int k, String... columns)
    Fits probabilistic principal component analysis.
    static ProbabilisticPCA
    fit(smile.data.DataFrame data, int k, String... columns)
    Fits probabilistic principal component analysis.
    smile.math.matrix.Matrix
    loadings()
    Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues.
    protected double[]
    postprocess(double[] x)
    Postprocess the output vector after projection.
    double
    variance()
    Returns the variance of noise.

    Methods inherited from class smile.feature.extraction.Projection

    apply, apply, apply, apply, preprocess

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface java.util.function.Function

    andThen, compose

    Methods inherited from interface smile.data.transform.Transform

    andThen, compose
  • Constructor Details

    • ProbabilisticPCA

      public ProbabilisticPCA(double noise, double[] mu, smile.math.matrix.Matrix loading, smile.math.matrix.Matrix projection, String... columns)
      Constructor.
      Parameters:
      noise - the variance of noise.
      mu - the mean of samples.
      loading - the loading matrix.
      projection - the projection matrix. Note that this is not the matrix W in the latent model.
      columns - the columns to transform when applied on Tuple/DataFrame.
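
The distinction between the projection matrix and W can be illustrated with a standard result from Tipping and Bishop (1999): the posterior mean of the latent variable x given an observation y is M⁻¹ Wᵀ (y − μ), where M = Wᵀ W + σ² I. The sketch below covers only a single latent dimension (k = 1), so that M is a scalar. It is a plain-Java illustration with hypothetical names (PpcaProjectionSketch, posteriorMean); whether Smile builds its projection matrix in exactly this way is an assumption, not something stated on this page.

```java
// Hypothetical sketch, not part of Smile: posterior-mean projection
// for a PPCA model with a single latent dimension.
class PpcaProjectionSketch {
    /** Returns w^T (y - mu) / (w^T w + sigma2). */
    public static double posteriorMean(double[] w, double[] mu, double sigma2, double[] y) {
        if (w.length != mu.length || w.length != y.length) {
            throw new IllegalArgumentException("w, mu and y must have the same length");
        }
        double wtw = 0.0; // w^T w
        double wty = 0.0; // w^T (y - mu)
        for (int i = 0; i < w.length; i++) {
            wtw += w[i] * w[i];
            wty += w[i] * (y[i] - mu[i]);
        }
        double m = wtw + sigma2; // scalar M for k = 1
        if (m <= 0.0) {
            throw new IllegalArgumentException("w^T w + sigma2 must be positive");
        }
        return wty / m;
    }

    public static void main(String[] args) {
        double[] w = {2.0, 0.0};
        double[] mu = {1.0, 1.0};
        double[] y = {3.0, 5.0};
        System.out.println(posteriorMean(w, mu, 1.0, y));
    }
}
```

Note that the row vector wᵀ / (wᵀ w + σ²) used here differs from w itself, which is why a projection matrix of this kind is passed separately from the loading matrix.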
  • Method Details

    • loadings

      public smile.math.matrix.Matrix loadings()
      Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues.
      Returns:
      the variable loading matrix.
    • center

      public double[] center()
      Returns the center of data.
      Returns:
      the center of data.
    • variance

      public double variance()
      Returns the variance of noise.
      Returns:
      the variance of noise.
    • postprocess

      protected double[] postprocess(double[] x)
      Description copied from class: Projection
      Postprocess the output vector after projection.
      Overrides:
      postprocess in class Projection
      Parameters:
      x - the output vector of projection.
      Returns:
      the postprocessed vector.
    • fit

      public static ProbabilisticPCA fit(smile.data.DataFrame data, int k, String... columns)
      Fits probabilistic principal component analysis.
      Parameters:
      data - training data of which each row is a sample.
      k - the number of principal components to learn.
      columns - the columns on which to fit PCA. If empty, all columns will be used.
      Returns:
      the model.
    • fit

      public static ProbabilisticPCA fit(double[][] data, int k, String... columns)
      Fits probabilistic principal component analysis.
      Parameters:
      data - training data of which each row is a sample.
      k - the number of principal components to learn.
      columns - the columns to transform when applied on Tuple/DataFrame.
      Returns:
      the model.