Package

ml.combust.mleap.core

feature

package feature

Type Members

  1. case class BucketizerModel(splits: Array[Double]) extends Serializable with Product

    Class for a bucketizer model.

    The bucketizer places each incoming feature value into a bucket.

    splits

    split points that determine the bucket boundaries

    Annotations
    @SparkCode()
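
    As a rough illustration of how split points can map a value to a bucket, here is a minimal standalone sketch. It does not call MLeap; `BucketizerSketch` and `bucket` are hypothetical names, and the boundary rule (each bucket is closed on the left and open on the right, except the last, which also includes its upper bound) is an assumption borrowed from the usual Spark convention.

    ```scala
    object BucketizerSketch {
      // Hypothetical helper, not MLeap's API.
      // Returns the index i of the bucket [splits(i), splits(i + 1)) containing `feature`.
      // Assumes `splits` is strictly increasing; the last bucket includes its upper bound.
      def bucket(splits: Array[Double], feature: Double): Int = {
        require(splits.length >= 2, "need at least two split points")
        if (feature == splits.last) splits.length - 2
        else {
          // First split strictly greater than the feature marks the bucket's upper bound.
          val idx = splits.indexWhere(_ > feature) - 1
          require(idx >= 0, s"feature $feature is outside the split range")
          idx
        }
      }
    }
    ```

    With splits `Array(0.0, 10.0, 20.0)`, this sketch places 5.0 in bucket 0 and both 10.0 and 20.0 in bucket 1.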
  2. case class ElementwiseProductModel(scalingVec: Vector) extends Serializable with Product

    Class for an element-wise product model.

    scalingVec

    vector for scaling feature vectors

    Annotations
    @SparkCode()
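
    An element-wise (Hadamard) product multiplies each feature by the scaling entry at the same position. The sketch below shows that arithmetic on plain arrays; `ElementwiseProductSketch` and `scale` are hypothetical names and plain `Array[Double]` stands in for the `Vector` type, so this is not the library's implementation.

    ```scala
    object ElementwiseProductSketch {
      // Hypothetical helper, not MLeap's API: multiplies position by position.
      def scale(scalingVec: Array[Double], features: Array[Double]): Array[Double] = {
        require(scalingVec.length == features.length, "vector sizes must match")
        scalingVec.zip(features).map { case (s, x) => s * x }
      }
    }
    ```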
  3. case class HashingTermFrequencyModel(numFeatures: Int = 1 << 18, binary: Boolean = false) extends Product with Serializable

    Class for hashing token frequencies into a vector.

    Source adapted from: Apache Spark Utils and HashingTF, see NOTICE for contributors

    numFeatures

    size of feature vector to hash into

    Annotations
    @SparkCode()
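
    The hashing trick behind term-frequency vectors can be sketched as follows. This is a standalone illustration, not the code adapted from Spark: the hash function (`MurmurHash3.stringHash` from the Scala standard library) is a stand-in, so the indices will not necessarily match those MLeap or Spark produce, and the treatment of `binary` (every non-zero count becomes 1.0) is an assumption. `HashingTfSketch` and `termFrequencies` are hypothetical names.

    ```scala
    import scala.util.hashing.MurmurHash3

    object HashingTfSketch {
      // Hypothetical helper, not MLeap's API.
      // Returns a sparse term-frequency vector as a map from index to count.
      def termFrequencies(tokens: Seq[String],
                          numFeatures: Int = 1 << 18,
                          binary: Boolean = false): Map[Int, Double] = {
        // Map a term to a non-negative index in [0, numFeatures).
        def index(term: String): Int = {
          val raw = MurmurHash3.stringHash(term) % numFeatures
          if (raw < 0) raw + numFeatures else raw
        }
        val counts = tokens.groupBy(index).map { case (i, ts) => i -> ts.size.toDouble }
        // Assumption: in binary mode only presence is recorded.
        if (binary) counts.map { case (i, _) => i -> 1.0 } else counts
      }
    }
    ```

    Distinct terms can collide on the same index; a larger `numFeatures` makes collisions less likely at the cost of a wider vector.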
  4. case class MaxAbsScalerModel(maxAbs: Vector) extends Serializable with Product

    Class for MaxAbs Scaler model.

    maxAbs

    max absolute value

    Annotations
    @SparkCode()
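
    MaxAbs scaling divides each feature by the maximum absolute value recorded for that feature, which maps the data seen when the scaler was fitted into [-1, 1]. A minimal sketch on plain arrays follows; `MaxAbsScalerSketch` and `scale` are hypothetical names, and leaving a feature unchanged when its maximum absolute value is zero is an assumption, not something this page states.

    ```scala
    object MaxAbsScalerSketch {
      // Hypothetical helper, not MLeap's API.
      def scale(maxAbs: Array[Double], features: Array[Double]): Array[Double] = {
        require(maxAbs.length == features.length, "vector sizes must match")
        // Assumption: a zero maximum means the feature is left as-is.
        features.zip(maxAbs).map { case (x, m) => if (m == 0.0) x else x / m }
      }
    }
    ```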
  5. case class MinMaxScalerModel(originalMin: Vector, originalMax: Vector) extends Serializable with Product

    Class for the MinMax scaler transformer.

    The MinMax scaler uses the per-feature minimum and maximum values to rescale input features.

    originalMin

    minimum values from training features

    originalMax

    maximum values from training features

    Annotations
    @SparkCode()
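
    The rescaling can be sketched as `(x - min) / (max - min)` per feature. The example below assumes a target range of [0, 1] and maps a constant feature (where max equals min) to 0.5; both choices are assumptions made for illustration, and `MinMaxScalerSketch` and `scale` are hypothetical names.

    ```scala
    object MinMaxScalerSketch {
      // Hypothetical helper, not MLeap's API. Rescales each feature to [0, 1].
      def scale(originalMin: Array[Double],
                originalMax: Array[Double],
                features: Array[Double]): Array[Double] = {
        require(originalMin.length == features.length && originalMax.length == features.length,
          "vector sizes must match")
        features.indices.map { i =>
          val range = originalMax(i) - originalMin(i)
          // Assumption: a constant feature maps to the middle of the target range.
          if (range == 0.0) 0.5 else (features(i) - originalMin(i)) / range
        }.toArray
      }
    }
    ```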
  6. case class NGramModel(n: Int) extends Serializable with Product

    Created by mikhail on 9/29/16.
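
    This entry carries no description, so the following is only a sketch of what an n-gram transformation over tokens generally looks like: every run of `n` consecutive tokens is joined into one string. Joining with a single space and dropping inputs shorter than `n` are assumptions; `NGramSketch` and `ngrams` are hypothetical names.

    ```scala
    object NGramSketch {
      // Hypothetical helper, not MLeap's API.
      def ngrams(n: Int, tokens: Seq[String]): Seq[String] = {
        require(n >= 1, "n must be positive")
        // sliding(n) yields one short window when there are fewer than n tokens; drop it.
        tokens.sliding(n).filter(_.size == n).map(_.mkString(" ")).toList
      }
    }
    ```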

  7. case class NormalizerModel(pNorm: Double) extends Serializable with Product

    Class for storing a normalizer model.

    pNorm

    value of p for the p-norm used in normalization

    Annotations
    @SparkCode()
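
    Normalizing by a p-norm divides a vector by `(sum of |x_i|^p)^(1/p)`. A minimal sketch for finite `p` is shown below; returning an all-zero vector unchanged is an assumption, and `NormalizerSketch` and `normalize` are hypothetical names rather than the library's API.

    ```scala
    object NormalizerSketch {
      // Hypothetical helper, not MLeap's API. Handles finite p >= 1 only.
      def normalize(pNorm: Double, features: Array[Double]): Array[Double] = {
        require(pNorm >= 1.0 && !pNorm.isInfinity, "pNorm must be finite and at least 1")
        val norm = math.pow(features.map(x => math.pow(math.abs(x), pNorm)).sum, 1.0 / pNorm)
        // Assumption: an all-zero vector is returned unchanged.
        if (norm == 0.0) features.clone() else features.map(_ / norm)
      }
    }
    ```

    For example, with `pNorm = 2.0` the vector (3, 4) has norm 5, so the result is approximately (0.6, 0.8).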
  8. case class OneHotEncoderModel(size: Int) extends Serializable with Product

    Class for a one hot encoder model.

    One-hot encoders vectorize nominal features in preparation for models such as linear regression or logistic regression, where the feature vector supports binary but not multinomial features.

    size

    size of the output one hot vectors
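
    One-hot encoding turns a category index into a vector of length `size` with a single 1.0. The sketch below illustrates the idea on a plain array; rejecting an out-of-range index is an assumption, and `OneHotEncoderSketch` and `encode` are hypothetical names.

    ```scala
    object OneHotEncoderSketch {
      // Hypothetical helper, not MLeap's API.
      def encode(size: Int, index: Int): Array[Double] = {
        require(index >= 0 && index < size, s"index $index is outside [0, $size)")
        Array.tabulate(size)(i => if (i == index) 1.0 else 0.0)
      }
    }
    ```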

  9. case class PcaModel(principalComponents: DenseMatrix) extends Product with Serializable

    Class for principal components analysis model.

    principalComponents

    matrix of principal components

  10. case class PolynomialExpansionModel(degree: Int) extends Serializable with Product

    Created by mikhail on 10/16/16.

    Annotations
    @SparkCode()
  11. case class ReverseStringIndexerModel(labels: Seq[String]) extends Product with Serializable

    Class for a reverse string indexer model.

    This model reverses StringIndexerModel. Use it to go from the integer representation of a label back to the string.

    labels

    labels for reverse string indexing

  12. case class StandardScalerModel(std: Option[Vector], mean: Option[Vector]) extends Serializable with Product

    Class for standard scaler models.

    The standard scaler uses the standard deviations, the means, or both to scale a feature vector.

    std

    optional standard deviations of features

    mean

    optional means of features

    Annotations
    @SparkCode()
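
    Because both parameters are optional, scaling can be thought of as two independent steps: subtract the mean when it is present, then divide by the standard deviation when it is present. The sketch below illustrates this on plain arrays; it assumes every standard deviation is non-zero, and `StandardScalerSketch` and `scale` are hypothetical names.

    ```scala
    object StandardScalerSketch {
      // Hypothetical helper, not MLeap's API.
      // Assumes all entries of `std` are non-zero.
      def scale(std: Option[Array[Double]],
                mean: Option[Array[Double]],
                features: Array[Double]): Array[Double] = {
        // Center only when means are supplied.
        val centered = mean.fold(features)(m => features.zip(m).map { case (x, mu) => x - mu })
        // Divide only when standard deviations are supplied.
        std.fold(centered)(s => centered.zip(s).map { case (x, sigma) => x / sigma })
      }
    }
    ```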
  13. case class StopWordsRemoverModel(stopWords: Array[String], caseSensitive: Boolean) extends Serializable with Product

    Created by mikhail on 10/16/16.
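
    This entry has no description. Going by the parameter names alone, a stop-word filter might look like the sketch below, which drops tokens found in `stopWords` and compares in lower case when `caseSensitive` is false. The behaviour is an assumption, and `StopWordsRemoverSketch` and `remove` are hypothetical names.

    ```scala
    object StopWordsRemoverSketch {
      // Hypothetical helper, not MLeap's API.
      def remove(stopWords: Array[String],
                 caseSensitive: Boolean,
                 tokens: Seq[String]): Seq[String] =
        if (caseSensitive) {
          val stop = stopWords.toSet
          tokens.filterNot(stop.contains)
        } else {
          // Compare in lower case so that "The" matches the stop word "the".
          val stop = stopWords.map(_.toLowerCase).toSet
          tokens.filterNot(t => stop.contains(t.toLowerCase))
        }
    }
    ```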

  14. case class StringIndexerModel(labels: Seq[String]) extends Serializable with Product

    Class for string indexer model.

    String indexer converts a string into an integer representation.

    labels

    list of labels that can be indexed
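
    A simple way to picture string indexing and its reverse is to treat a label's position in `labels` as its integer representation. The sketch below makes that assumption explicit; it also assumes that an unknown label is an error. `StringIndexerSketch`, `index`, and `reverse` are hypothetical names, not the library's methods.

    ```scala
    object StringIndexerSketch {
      // Hypothetical helpers, not MLeap's API.
      // Assumption: a label's integer representation is its position in `labels`.
      def index(labels: Seq[String], label: String): Int = {
        val i = labels.indexOf(label)
        require(i >= 0, s"unknown label: $label")
        i
      }

      // Reverse lookup: from the integer representation back to the string.
      def reverse(labels: Seq[String], index: Int): String = labels(index)
    }
    ```

    Under these assumptions, `reverse(labels, index(labels, s))` returns `s` for any `s` in `labels`.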

  15. case class TokenizerModel(regex: String = "\\s") extends Product with Serializable

    Class for a tokenizer model.

    The default regular expression for tokenizing strings is defined by TokenizerModel.defaultTokenizer.

    regex

    regular expression used for tokenizing strings
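
    Regex-based tokenizing can be sketched with `String.split`, using the default pattern from the signature above. This is an illustration only; `TokenizerSketch` and `tokenize` are hypothetical names, and the sketch does no lower-casing or trimming.

    ```scala
    object TokenizerSketch {
      // Hypothetical helper, not MLeap's API: splits on every match of `regex`.
      def tokenize(text: String, regex: String = "\\s"): Seq[String] =
        text.split(regex).toSeq
    }
    ```

    Because `"\\s"` matches a single whitespace character, consecutive spaces in the input produce empty tokens in this sketch.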

  16. case class VectorAssemblerModel() extends Serializable with Product

    Class for a vector assembler model.

    Vector assemblers take an input set of doubles and vectors and create a new vector out of them. This is primarily used to get all desired features into one vector before training a model.

    Annotations
    @SparkCode()
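
    Assembling amounts to concatenating the inputs in order, with each double contributing one element and each vector contributing all of its elements. A minimal sketch, using `Array[Double]` in place of the `Vector` type and hypothetical names `VectorAssemblerSketch` and `assemble`:

    ```scala
    object VectorAssemblerSketch {
      // Hypothetical helper, not MLeap's API.
      // Accepts doubles and double arrays; anything else is rejected.
      def assemble(inputs: Seq[Any]): Array[Double] =
        inputs.flatMap {
          case d: Double        => Seq(d)
          case v: Array[Double] => v.toSeq
          case other =>
            throw new IllegalArgumentException(s"unsupported input: $other")
        }.toArray
    }
    ```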

Value Members

  1. object HashingTermFrequencyModel extends Serializable

  2. object TokenizerModel extends Serializable

    Companion object for defaults.

  3. object VectorAssemblerModel extends Serializable

    Companion object for defaults.
