util

Type Members

case class AllSparseFeatures[T]()(implicit evidence$1: ClassTag[T]) extends Estimator[Seq[(T, Double)], SparseVector[Double]] with Product with Serializable

An Estimator that selects every sparse feature observed during training and produces a transformer that builds a sparse vector from them.
Feature mappings are ordered deterministically by earliest appearance in the RDD.
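The ordering rule can be illustrated without Spark. The following is a minimal sketch in plain Scala, not the library's implementation: a local Seq stands in for the RDD, and the names AllSparseFeaturesSketch and fit are hypothetical.

```scala
object AllSparseFeaturesSketch {
  // Assign every distinct feature an index in order of first appearance.
  // `distinct` keeps the first occurrence of each element, so the result
  // is deterministic for a given input order.
  def fit[T](data: Seq[Seq[(T, Double)]]): Map[T, Int] =
    data.flatten.map(_._1).distinct.zipWithIndex.toMap
}
```

For example, fitting on the rows (b, a) and (a, c) would assign b to index 0, a to index 1, and c to index 2.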
case class Cacher[T](name: Option[String] = None)(implicit evidence$1: ClassTag[T]) extends Transformer[T, T] with Logging with Product with Serializable

Caches an RDD at a given point within a Pipeline. Follows Spark's lazy evaluation conventions.
T
Type of the input to cache.
name
An optional name to set on the cached output. Useful for debugging.
case class ClassLabelIndicatorsFromIntArrayLabels(numClasses: Int, validate: Boolean = false) extends Transformer[Array[Int], DenseVector[Double]] with Product with Serializable

Given a set of class labels, returns a binary vector that indicates when each class is present.
Expects labels in the range [0, numClasses) and numClasses > 1.
case class ClassLabelIndicatorsFromIntLabels(numClasses: Int) extends Transformer[Int, DenseVector[Double]] with Product with Serializable

Given a class label, returns a binary vector that indicates when that class is present.
Expects labels in the range [0, numClasses) and numClasses > 1.
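The two indicator transformers above can be approximated in plain Scala. This is a sketch under stated assumptions rather than the library's code: the descriptions call the result binary without naming the two values, so the on and off values are hypothetical parameters here, and an Array[Double] stands in for a DenseVector[Double].

```scala
object ClassLabelIndicatorsSketch {
  // Indicator vector for several labels. `on` and `off` are assumed values:
  // the documentation only says the result is binary.
  def fromLabels(numClasses: Int, labels: Array[Int],
                 on: Double = 1.0, off: Double = 0.0): Array[Double] = {
    require(numClasses > 1, "numClasses must be > 1")
    require(labels.forall(l => l >= 0 && l < numClasses),
      "labels must be in the range [0, numClasses)")
    val out = Array.fill(numClasses)(off)
    labels.foreach(l => out(l) = on)
    out
  }

  // Indicator vector for a single label, using the default on/off values.
  def fromLabel(numClasses: Int, label: Int): Array[Double] =
    fromLabels(numClasses, Array(label))
}
```

Both functions reject labels outside [0, numClasses), mirroring the stated expectation on the input range.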
case class CommonSparseFeatures[T](numFeatures: Int)(implicit evidence$1: ClassTag[T]) extends Estimator[Seq[(T, Double)], SparseVector[Double]] with Product with Serializable

An Estimator that selects the most frequently observed sparse features during training and produces a transformer that builds a sparse vector from them.
Feature mappings are ordered deterministically, first by decreasing number of appearances, then by earliest appearance in the RDD.
numFeatures
The number of features to keep.
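The two-level ordering can be sketched in plain Scala. This is an illustration, not the library's implementation: a local Seq stands in for the RDD, every occurrence of a feature counts as one appearance (an assumption), and the names CommonSparseFeaturesSketch and fit are hypothetical.

```scala
object CommonSparseFeaturesSketch {
  // Keep the numFeatures most frequent features. Ties in frequency are
  // broken by earliest appearance in the data.
  def fit[T](data: Seq[Seq[(T, Double)]], numFeatures: Int): Map[T, Int] = {
    // Pair every feature occurrence with its position in the flattened data.
    val occurrences = data.flatten.map(_._1).zipWithIndex
    // For each feature: (feature, number of appearances, first position).
    val stats = occurrences.groupBy(_._1).toSeq.map { case (feature, hits) =>
      (feature, hits.size, hits.map(_._2).min)
    }
    stats
      .sortBy { case (_, count, first) => (-count, first) }
      .take(numFeatures)
      .map(_._1)
      .zipWithIndex
      .toMap
  }
}
```

Sorting on the pair (-count, first) expresses both rules in one key: more frequent features come first, and among equally frequent features the one seen earliest wins.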
case class Densify[T <: Vector[Double]]() extends Transformer[T, DenseVector[Double]] with Product with Serializable

Transformer to densify vectors into DenseVectors.
class Identity[T] extends Transformer[T, T]

This class performs a no-op on its input.
T
Type of the input and, by definition, output.
class Shuffler[T] extends Transformer[T, T] with Logging

Randomly shuffle the rows of an RDD within a pipeline. Uses a shuffle operation in Spark.
T
Type of the input to shuffle.
class SparseFeatureVectorizer[T] extends Transformer[Seq[(T, Double)], SparseVector[Double]]

A transformer that, given a feature space, maps features of the form (feature id, value) into a sparse vector.
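A plain-Scala sketch of this mapping follows. It is not the library's code: the Sparse case class is a hypothetical stand-in for a Breeze SparseVector, the feature space is assumed to map feature ids to positions 0 until its size, and dropping features that are absent from the feature space is also an assumption.

```scala
object SparseFeatureVectorizerSketch {
  // A sparse vector modeled as its length plus parallel collections of
  // sorted indices and their values.
  final case class Sparse(length: Int, indices: Vector[Int], values: Vector[Double])

  // featureSpace maps each known feature id to its position in the vector.
  // Features missing from the feature space are dropped (an assumption).
  def vectorize[T](featureSpace: Map[T, Int])(row: Seq[(T, Double)]): Sparse = {
    val entries = row
      .collect { case (f, v) if featureSpace.contains(f) => (featureSpace(f), v) }
      .sortBy(_._1)
    Sparse(featureSpace.size, entries.map(_._1).toVector, entries.map(_._2).toVector)
  }
}
```

With the feature space b -> 0, a -> 1, c -> 2, the row (c, 5.0), (b, 2.0) becomes a length-3 sparse vector holding 2.0 at index 0 and 5.0 at index 2.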
case class Sparsify[T <: Vector[Double]]() extends Transformer[T, SparseVector[Double]] with Product with Serializable

Transformer to convert vectors into SparseVectors.
class TopKClassifier extends Transformer[DenseVector[Double], Array[Int]]

Transformer that returns the indices of the k largest values of the vector, in order.
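A top-k selection of this kind might look like the sketch below, written in plain Scala with an Array[Double] standing in for a DenseVector[Double]. The description does not say which order the indices come back in, so returning the largest value first is an assumption, as are the names TopKSketch and topK.

```scala
object TopKSketch {
  // Indices of the k largest values, largest first (assumed ordering).
  // Seq.sortBy is stable, so equal values keep the lower index first.
  def topK(values: Array[Double], k: Int): Seq[Int] =
    values.toSeq.zipWithIndex
      .sortBy { case (v, _) => -v }
      .take(k)
      .map(_._2)
}
```

If k exceeds the vector length, this sketch simply returns every index.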
case class VectorCombiner[T]()(implicit evidence$1: ClassTag[T], zero: Zero[T]) extends Transformer[Seq[DenseVector[T]], DenseVector[T]] with Product with Serializable

Concatenates a Seq of DenseVectors into a single DenseVector.
class VectorSplitter extends FunctionNode[RDD[DenseVector[Double]], Seq[RDD[DenseVector[Double]]]]

This transformer splits the input vector into a number of blocks.
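The shape of the result, one collection per block, can be sketched in plain Scala. This is an illustration only: local Seqs stand in for RDDs, Vector[Double] stands in for DenseVector[Double], and the blockSize parameter is hypothetical, since the signature shown here does not say how the block boundaries are chosen.

```scala
object VectorSplitterSketch {
  // Split one vector into consecutive blocks of at most blockSize elements.
  // The last block is shorter when the length is not a multiple of blockSize.
  def split(vector: Vector[Double], blockSize: Int): Seq[Vector[Double]] = {
    require(blockSize > 0, "blockSize must be positive")
    vector.grouped(blockSize).toVector
  }

  // Split every row, then regroup so that element b of the result holds
  // block b of every row. All rows are assumed to have the same length.
  def splitAll(rows: Seq[Vector[Double]], blockSize: Int): Seq[Seq[Vector[Double]]] =
    rows.map(split(_, blockSize)).transpose
}
```

The transpose step is what turns "a list of blocks per row" into "a list of rows per block", matching the Seq-of-collections output type.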

Value Members

object FloatToDouble extends Transformer[DenseMatrix[Float], DenseMatrix[Double]]

Converts a float matrix to a double matrix.
object MatrixVectorizer extends Transformer[DenseMatrix[Double], DenseVector[Double]]

Flattens a matrix into a vector.
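The description does not state the element order of the flattened vector. The sketch below assumes column-major order, which matches the default storage layout of Breeze's DenseMatrix. It uses nested Scala Vectors instead of Breeze types, and the names MatrixVectorizerSketch and flattenColumnMajor are hypothetical.

```scala
object MatrixVectorizerSketch {
  // Flatten a matrix given as a list of rows into one vector, column by
  // column. Column-major order is an assumption, not taken from the docs.
  def flattenColumnMajor(rows: Vector[Vector[Double]]): Vector[Double] = {
    require(rows.map(_.length).distinct.size <= 1, "all rows must have the same length")
    rows.transpose.flatten
  }
}
```

For the 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6), this yields 1, 4, 2, 5, 3, 6.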
object MaxClassifier extends Transformer[DenseVector[Double], Int]

Transformer that returns the index of the largest value in the vector.
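An argmax of this kind can be sketched in plain Scala as follows. An Array[Double] stands in for a DenseVector[Double], and the tie-breaking rule (the first of several equal maxima wins) is an assumption rather than documented behavior; the names MaxClassifierSketch and argmax are hypothetical.

```scala
object MaxClassifierSketch {
  // Index of the largest value. On ties the earliest index is kept,
  // because a later index only replaces the best one when strictly larger.
  def argmax(values: Array[Double]): Int = {
    require(values.nonEmpty, "input must not be empty")
    values.indices.reduceLeft((best, i) => if (values(i) > values(best)) i else best)
  }
}
```

Applied to a vector of per-class scores, the returned index can be read as the predicted class label.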
object TopKClassifier extends Serializable

Companion object that allows creating a TopKClassifier without new.
