Class SimpleImputer

java.lang.Object
smile.feature.imputation.SimpleImputer
All Implemented Interfaces:
Serializable, Function<smile.data.Tuple,smile.data.Tuple>, smile.data.transform.Transform

public class SimpleImputer extends Object implements smile.data.transform.Transform
A simple imputer that replaces missing values with a constant value for each column.
  • Constructor Summary

    Constructors
    Constructor
    Description
    SimpleImputer(Map<String,Object> values)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    smile.data.DataFrame
    apply(smile.data.DataFrame data)
     
    smile.data.Tuple
    apply(smile.data.Tuple x)
     
    static SimpleImputer
    fit(smile.data.DataFrame data, double lower, double upper, String... columns)
    Fits the imputation values for missing data.
    static SimpleImputer
    fit(smile.data.DataFrame data, String... columns)
    Fits the imputation values for missing data.
    static boolean
    hasMissing(smile.data.Tuple x)
    Returns true if the tuple x has missing values.
    static double[][]
    impute(double[][] data)
    Imputes the missing values with column averages.
    String
    toString()
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

    Methods inherited from interface java.util.function.Function

    andThen, compose

    Methods inherited from interface smile.data.transform.Transform

    andThen, compose
  • Constructor Details

    • SimpleImputer

      public SimpleImputer(Map<String,Object> values)
      Creates an imputer that replaces missing values with the given per-column constants.
      Parameters:
      values - the map of column name to the constant value.
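
To illustrate the idea behind this constructor, the sketch below fills missing entries of a single row from a map of column name to constant value. It is not Smile's implementation: the class `ConstantFillSketch`, the method `fill`, and the choice of a `Map<String,Object>` row with `null` marking a missing value are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for constant-value imputation; not part of Smile.
final class ConstantFillSketch {
    /**
     * Returns a copy of the row in which every null entry is replaced by
     * the constant registered for that column, if one exists.
     */
    static Map<String, Object> fill(Map<String, Object> row, Map<String, Object> values) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            Object v = e.getValue();
            // Assumption of this sketch: null marks a missing value.
            if (v == null && values.containsKey(e.getKey())) {
                v = values.get(e.getKey());
            }
            out.put(e.getKey(), v);
        }
        return out;
    }
}
```

Columns without an entry in the map are left untouched in this sketch, so a missing value there stays missing.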
  • Method Details

    • hasMissing

      public static boolean hasMissing(smile.data.Tuple x)
      Returns true if the tuple x has missing values.
      Parameters:
      x - a tuple.
      Returns:
      true if the tuple x has missing values.
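
A check of this kind can be sketched without the Smile types. The example below uses an `Object[]` as a stand-in for a tuple and treats `null` and floating-point NaN as missing. Both the representation and the definition of "missing" are assumptions of the sketch, and `MissingCheckSketch` is a hypothetical name.

```java
// Hypothetical stand-in for a missing-value check; not part of Smile.
final class MissingCheckSketch {
    /** Returns true if any field is null or a floating-point NaN. */
    static boolean hasMissing(Object[] x) {
        for (Object v : x) {
            if (v == null) {
                return true;
            }
            if (v instanceof Double && ((Double) v).isNaN()) {
                return true;
            }
            if (v instanceof Float && ((Float) v).isNaN()) {
                return true;
            }
        }
        return false;
    }
}
```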
    • apply

      public smile.data.Tuple apply(smile.data.Tuple x)
      Specified by:
      apply in interface Function<smile.data.Tuple,smile.data.Tuple>
    • apply

      public smile.data.DataFrame apply(smile.data.DataFrame data)
      Specified by:
      apply in interface smile.data.transform.Transform
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • fit

      public static SimpleImputer fit(smile.data.DataFrame data, String... columns)
      Fits the imputation values for missing data. Numeric columns are imputed with the median, boolean/nominal columns with the mode, and text columns with the empty string.
      Parameters:
      data - the training data.
      columns - the columns to impute. If empty, all applicable columns are imputed.
      Returns:
      the imputer.
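
The two statistics named above can be sketched in plain Java. The example is only an illustration of median and mode computed over the observed values, not the library's code: `MedianModeSketch` is a hypothetical class, NaN and `null` are assumed to mark missing values, and the median of an even number of values is taken as the average of the two middle values, which may differ from the convention Smile uses.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helpers showing the statistics used for imputation; not part of Smile.
final class MedianModeSketch {
    /** Median of the non-NaN values in a numeric column. */
    static double median(double[] column) {
        double[] present = Arrays.stream(column)
                .filter(v -> !Double.isNaN(v)) // assumption: NaN marks a missing value
                .sorted()
                .toArray();
        if (present.length == 0) {
            throw new IllegalArgumentException("column has no observed values");
        }
        int mid = present.length / 2;
        // Even count: average the two middle values (a convention chosen for this sketch).
        return present.length % 2 == 1
                ? present[mid]
                : (present[mid - 1] + present[mid]) / 2.0;
    }

    /** Most frequent non-null value in a nominal column; the first value to reach the top count wins ties. */
    static String mode(String[] column) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String v : column) {
            if (v == null) {
                continue; // assumption: null marks a missing value
            }
            int c = counts.merge(v, 1, Integer::sum);
            if (c > bestCount) {
                best = v;
                bestCount = c;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("column has no observed values");
        }
        return best;
    }
}
```

The median is less sensitive to outliers than the mean, which is one reason it is a common default for numeric columns.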
    • fit

      public static SimpleImputer fit(smile.data.DataFrame data, double lower, double upper, String... columns)
      Fits the imputation values for missing data. Numeric columns are imputed with the mean of the values in the range [lower, upper], boolean/nominal columns with the mode, and text columns with the empty string.
      Parameters:
      data - the training data.
      lower - the lower limit in terms of percentiles of the original distribution (e.g. 5th percentile).
      upper - the upper limit in terms of percentiles of the original distribution (e.g. 95th percentile).
      columns - the columns to impute. If empty, all applicable columns are imputed.
      Returns:
      the imputer.
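
One way to read "the mean of the values in the range [lower, upper]" is as a mean restricted to values between two percentile cut-offs. The sketch below shows that reading under several assumptions that the description above does not confirm: `lower` and `upper` are passed as fractions in [0, 1] (so the 5th percentile is 0.05), the cut-offs are located by rounding to the nearest rank in the sorted observed values, and NaN marks a missing value. `TrimmedMeanSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical illustration of a percentile-bounded mean; not part of Smile.
final class TrimmedMeanSketch {
    /** Mean of the observed values lying between the lower and upper percentile cut-offs. */
    static double trimmedMean(double[] column, double lower, double upper) {
        // Assumption: percentiles are expressed as fractions in [0, 1].
        if (lower < 0.0 || upper > 1.0 || lower > upper) {
            throw new IllegalArgumentException("invalid percentile range");
        }
        double[] present = Arrays.stream(column)
                .filter(v -> !Double.isNaN(v)) // assumption: NaN marks a missing value
                .sorted()
                .toArray();
        if (present.length == 0) {
            throw new IllegalArgumentException("column has no observed values");
        }
        int n = present.length;
        // Nearest-rank cut-offs in the sorted observed values.
        double lo = present[(int) Math.round(lower * (n - 1))];
        double hi = present[(int) Math.round(upper * (n - 1))];
        double sum = 0.0;
        int count = 0;
        for (double v : present) {
            if (v >= lo && v <= hi) {
                sum += v;
                count++;
            }
        }
        // count is at least 1 because the value at the lower cut-off always qualifies.
        return sum / count;
    }
}
```

Restricting the range keeps extreme values from dominating the imputed value while still using more of the data than a median would.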
    • impute

      public static double[][] impute(double[][] data)
      Imputes the missing values with column averages.
      Parameters:
      data - data with missing values.
      Returns:
      the imputed data.
      Throws:
      IllegalArgumentException - if an entire row or an entire column is missing.
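
Column-average imputation on a `double[][]` can be sketched as follows. This is a stand-alone illustration of the technique, not Smile's source: `ColumnMeanImputeSketch` is a hypothetical class, NaN is assumed to mark a missing value, and the sketch returns a new array instead of modifying its input. It mirrors the documented contract by throwing `IllegalArgumentException` when a whole row or a whole column is missing.

```java
// Hypothetical illustration of column-mean imputation; not part of Smile.
final class ColumnMeanImputeSketch {
    /** Returns a copy of data with each NaN replaced by the mean of its column. */
    static double[][] impute(double[][] data) {
        int n = data.length;
        if (n == 0) {
            return new double[0][];
        }
        int d = data[0].length;
        double[] sum = new double[d];
        int[] count = new int[d];

        // First pass: accumulate per-column sums over the observed values.
        for (double[] row : data) {
            if (row.length != d) {
                throw new IllegalArgumentException("rows have different lengths");
            }
            int missing = 0;
            for (int j = 0; j < d; j++) {
                if (Double.isNaN(row[j])) { // assumption: NaN marks a missing value
                    missing++;
                } else {
                    sum[j] += row[j];
                    count[j]++;
                }
            }
            if (d > 0 && missing == d) {
                throw new IllegalArgumentException("a whole row is missing");
            }
        }
        for (int j = 0; j < d; j++) {
            if (count[j] == 0) {
                throw new IllegalArgumentException("a whole column is missing");
            }
        }

        // Second pass: copy the data, filling gaps with the column mean.
        double[][] out = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                out[i][j] = Double.isNaN(data[i][j]) ? sum[j] / count[j] : data[i][j];
            }
        }
        return out;
    }
}
```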