Immutable decision tree for integer-valued features and outcomes.
Each data structure is an indexed sequence of properties. The ith element of each sequence is the property of node i of the decision tree.
all possible outcomes for the decision tree
stores the children of each node (as a map from feature values to node ids)
stores the feature that each node splits on; can be None for leaf nodes
for each node, stores a map of outcomes to their frequency of appearance at that node (i.e. how many times a training vector with that outcome makes it to this node during classification)
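The indexed-sequence representation described above can be sketched as follows. This is an illustrative model, not the actual nlpstack API; all names (`DecisionTreeSketch`, `findDecisionPoint`, etc.) are hypothetical.

```scala
// Hypothetical sketch of a decision tree stored as parallel indexed sequences,
// where the i-th element of each sequence is a property of node i.
case class DecisionTreeSketch(
  outcomes: Seq[Int],                           // all possible outcomes
  child: IndexedSeq[Map[Int, Int]],             // node -> (feature value -> child node id)
  splittingFeature: IndexedSeq[Option[Int]],    // feature each node splits on; None for leaves
  outcomeHistograms: IndexedSeq[Map[Int, Int]]  // node -> (outcome -> frequency at that node)
) {
  // Follow splits from the root (node 0) until a leaf (or an unseen
  // feature value) is reached; return that node's id.
  def findDecisionPoint(featureValue: Int => Int): Int = {
    var node = 0
    var done = false
    while (!done) {
      splittingFeature(node) match {
        case Some(f) =>
          child(node).get(featureValue(f)) match {
            case Some(next) => node = next
            case None => done = true // unseen feature value: stop here
          }
        case None => done = true // leaf node
      }
    }
    node
  }
}

// A two-leaf tree: the root splits on feature 0; value 0 goes to node 1,
// value 1 goes to node 2.
val tree = DecisionTreeSketch(
  outcomes = Seq(0, 1),
  child = IndexedSeq(Map(0 -> 1, 1 -> 2), Map.empty, Map.empty),
  splittingFeature = IndexedSeq(Some(0), None, None),
  outcomeHistograms = IndexedSeq(Map(0 -> 5, 1 -> 5), Map(0 -> 5), Map(1 -> 5))
)
```

A vector whose feature 0 has value 1 is routed to node 2, whose histogram contains only outcome 1.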
Functions for training decision trees.
A DenseVector is a feature vector with arbitrary integral features.
the outcome of the feature vector
the value of each feature
A feature vector with integral features and outcome.
FeatureVectors is a convenience container for feature vectors.
The number of features must be the same for all feature vectors in the container.
collection of FeatureVector objects
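The dense vector and its container can be sketched as below. These are illustrative stand-ins for the nlpstack types, with hypothetical names; the actual classes may differ in detail.

```scala
// Sketch of a dense feature vector: every feature value is stored explicitly.
case class DenseVectorSketch(outcome: Option[Int], values: IndexedSeq[Int]) {
  def numFeatures: Int = values.size
  def getFeature(i: Int): Int = values(i)
}

// Sketch of the container: all vectors must have the same number of features.
case class FeatureVectorsSketch(vectors: IndexedSeq[DenseVectorSketch]) {
  require(
    vectors.map(_.numFeatures).distinct.size <= 1,
    "all feature vectors must have the same number of features")
  def numVectors: Int = vectors.size
}

val fvs = FeatureVectorsSketch(IndexedSeq(
  DenseVectorSketch(Some(0), IndexedSeq(1, 0, 2)),
  DenseVectorSketch(Some(1), IndexedSeq(0, 3, 1))
))
```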
The OneVersusAll implements multi-outcome classification as a set of binary classifiers.
A ProbabilisticClassifier is associated with each outcome. Suppose there are three outcomes: 0, 1, 2. Then the constructor would take a sequence of three classifiers as its argument: [(0,A), (1,B), (2,C)]. To compute the outcome distribution for a new feature vector v, the OneVersusAll would normalize:
[ A.outcomeDistribution(v)(1), B.outcomeDistribution(v)(1), C.outcomeDistribution(v)(1) ]
i.e. the probability of 1 (true) according to binary classifiers A, B, and C.
QUESTION(MH): is this the best way to normalize these, or would it be better to normalize by summing the logs and then re-applying the exponential operation?
the binary classifier associated with each outcome
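The normalization step described above can be sketched numerically. The probabilities below are made up for illustration; `normalizeVotes` is a hypothetical helper, not part of the nlpstack API.

```scala
// Each binary classifier reports P(true) for its own outcome; dividing each
// by the sum of all three yields a distribution over outcomes.
def normalizeVotes(pTrue: Map[Int, Double]): Map[Int, Double] = {
  val total = pTrue.values.sum
  pTrue.map { case (outcome, p) => outcome -> p / total }
}

// Suppose A, B, C report these probabilities of true for a vector v.
val binaryProbs = Map(0 -> 0.9, 1 -> 0.3, 2 -> 0.6)
val dist = normalizeVotes(binaryProbs)
// dist(0) == 0.9 / 1.8 == 0.5, and the distribution sums to 1.
```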
A OneVersusAllTrainer trains a OneVersusAll using a base ProbabilisticClassifierTrainer to train one binary classifier per outcome.
A RandomForest is a collection of decision trees. Each decision tree gets a single vote about the outcome. The outcome distribution is the normalized histogram of the votes.
the collection of possible outcomes
the collection of decision trees
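The voting scheme can be sketched as follows; `voteDistribution` is an illustrative helper, not the actual nlpstack method.

```scala
// Each tree casts one vote (an outcome); the outcome distribution is the
// normalized histogram of those votes.
def voteDistribution(votes: Seq[Int]): Map[Int, Double] =
  votes.groupBy(identity).map { case (outcome, vs) =>
    outcome -> vs.size.toDouble / votes.size
  }

// Four trees: three vote for outcome 1, one votes for outcome 0.
val dist = voteDistribution(Seq(1, 1, 0, 1))
// dist == Map(1 -> 0.75, 0 -> 0.25)
```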
A RandomForestTrainer trains a RandomForest from a set of feature vectors.
A SparseVector is a feature vector with sparse binary features.
the outcome of the feature vector
the number of features
the set of features with value 1
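A sparse binary vector can be sketched as below. This is an illustrative model with hypothetical names, not the actual nlpstack class.

```scala
// Sketch of a sparse binary feature vector: only the indices of features with
// value 1 are stored; all other features are implicitly 0.
case class SparseVectorSketch(
  outcome: Option[Int],
  numFeatures: Int,
  trueFeatures: Set[Int]
) {
  def getFeature(i: Int): Int = if (trueFeatures.contains(i)) 1 else 0
}

val v = SparseVectorSketch(outcome = Some(1), numFeatures = 5, trueFeatures = Set(0, 3))
// v.getFeature(3) == 1 and v.getFeature(2) == 0
```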
Implements C4.5 decision trees for integral labels and attributes.
The main class is org.allenai.nlpstack.parse.poly.decisiontree.DecisionTree. Use the companion object to build the tree, then use its classification methods (e.g. `outcomeDistribution`) to do prediction.
The tree takes data in the form of org.allenai.nlpstack.parse.poly.decisiontree.FeatureVectors. This is a container for a collection of org.allenai.nlpstack.parse.poly.decisiontree.FeatureVector objects.
Implementations of these are org.allenai.nlpstack.parse.poly.decisiontree.SparseVector or org.allenai.nlpstack.parse.poly.decisiontree.DenseVector.