package org.allenai.nlpstack.parse.poly.decisiontree

Implements C4.5 decision trees for integral labels and attributes.

The main class is org.allenai.nlpstack.parse.poly.decisiontree.DecisionTree. Use its companion object to build the tree, then use the tree's prediction methods to classify feature vectors.

The tree takes data in the form of org.allenai.nlpstack.parse.poly.decisiontree.FeatureVectors. This is a container for a collection of org.allenai.nlpstack.parse.poly.decisiontree.FeatureVector objects.

FeatureVector has two implementations: org.allenai.nlpstack.parse.poly.decisiontree.SparseVector (sparse binary attributes) and org.allenai.nlpstack.parse.poly.decisiontree.DenseVector (arbitrary integral attributes).
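
The relationship between the two vector kinds and the container can be sketched as follows. This is a minimal, self-contained illustration and not the library's source: the constructor parameters mirror the signatures listed under Type Members on this page, while the Sketch-prefixed names, the attributeValue accessor, and the require checks are assumptions made for the example.

```scala
// Illustrative stand-ins only. Constructor parameters follow this page;
// attributeValue and the require checks are assumptions, not the library's API.
sealed trait SketchFeatureVector {
  def label: Option[Int]
  def numAttributes: Int
  def attributeValue(index: Int): Int // hypothetical accessor
}

// Dense form: one stored integer per attribute.
final case class SketchDenseVector(label: Option[Int], attributes: IndexedSeq[Int])
    extends SketchFeatureVector {
  def numAttributes: Int = attributes.size
  def attributeValue(index: Int): Int = attributes(index)
}

// Sparse binary form: only the indices whose value is 1 are stored.
final case class SketchSparseVector(
    label: Option[Int],
    numAttributes: Int,
    trueAttributes: Set[Int]
) extends SketchFeatureVector {
  def attributeValue(index: Int): Int = {
    require(index >= 0 && index < numAttributes, s"attribute index $index out of range")
    if (trueAttributes.contains(index)) 1 else 0
  }
}

// Container that checks the documented requirement that every vector
// has the same number of attributes.
final case class SketchFeatureVectors(featureVectors: IndexedSeq[SketchFeatureVector]) {
  require(
    featureVectors.map(_.numAttributes).distinct.size <= 1,
    "all feature vectors must have the same number of attributes"
  )
}

object VectorSketch {
  // The same instance written both ways: attributes 1 and 3 are set.
  val dense = SketchDenseVector(Some(1), IndexedSeq(0, 1, 0, 1))
  val sparse = SketchSparseVector(Some(1), 4, Set(1, 3))
  val data = SketchFeatureVectors(IndexedSeq(dense, sparse))
}
```

In this sketch the sparse form stores only the set positions, which is the cheaper choice when most binary attributes are 0; the dense form is needed when attributes take integer values other than 0 and 1.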


Type Members

  1. case class DecisionTree(child: IndexedSeq[Seq[(Int, Int)]], splittingAttribute: IndexedSeq[Option[Int]], categoryCounts: IndexedSeq[Seq[(Int, Int)]]) extends Product with Serializable

    Immutable decision tree for integer-valued features and categories.

    Each data structure is an indexed sequence of properties. The ith element of each sequence is the property of node i of the decision tree.

    child

    stores the children of each node (as a map from attribute values to node ids)

    splittingAttribute

    stores the attribute that each node splits on; can be None for leaf nodes

    categoryCounts

    for each node, stores a map from each category to its frequency at that node (i.e., the number of training vectors with that category that reach this node when classified)

  2. case class DenseVector(label: Option[Int], attributes: IndexedSeq[Int]) extends FeatureVector with Product with Serializable

    Instance with arbitrary integral attributes

    label

    label of instance

    attributes

    value of each attribute

  3. sealed trait FeatureVector extends AnyRef

    A feature vector with integral features and label.

  4. case class FeatureVectors(featureVectors: IndexedSeq[FeatureVector]) extends Product with Serializable

    FeatureVectors is a convenience container for feature vectors.

    The number of attributes must be the same for all feature vectors in the container.

    featureVectors

    collection of FeatureVector objects

  5. case class SparseVector(label: Option[Int], numAttributes: Int, trueAttributes: Set[Int]) extends FeatureVector with Product with Serializable

    Instance with sparse binary attributes

    label

    label of instance

    numAttributes

    number of attributes

    trueAttributes

    which attributes have value 1
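
The parallel-sequence layout described for DecisionTree (child, splittingAttribute, categoryCounts, each indexed by node id) can be illustrated with a short traversal sketch. It is a hypothetical stand-in, not the library's implementation: the three fields and their types come from this page, but the names SketchTree, findNode, classify, and distribution are invented for the example, node 0 is assumed to be the root, and stopping at the current node when no child matches is also an assumption.

```scala
// Hypothetical stand-in for the documented DecisionTree. Field names and
// types follow this page; every method below is an assumed helper.
final case class SketchTree(
    child: IndexedSeq[Seq[(Int, Int)]],
    splittingAttribute: IndexedSeq[Option[Int]],
    categoryCounts: IndexedSeq[Seq[(Int, Int)]]
) {
  // Walk down from `node` (root assumed to be node 0) until reaching a leaf
  // or a node with no child for the observed attribute value.
  def findNode(attributes: IndexedSeq[Int], node: Int = 0): Int =
    splittingAttribute(node) match {
      case None => node // leaf: nothing to split on
      case Some(attr) =>
        child(node).toMap.get(attributes(attr)) match {
          case Some(next) => findNode(attributes, next)
          case None => node // unseen value: stop here (assumption)
        }
    }

  // Most frequent training category at the node the instance ends up in.
  def classify(attributes: IndexedSeq[Int]): Int =
    categoryCounts(findNode(attributes)).maxBy(_._2)._1

  // Relative category frequencies at that node.
  def distribution(attributes: IndexedSeq[Int]): Map[Int, Double] = {
    val counts = categoryCounts(findNode(attributes))
    val total = counts.map(_._2).sum.toDouble
    counts.map { case (category, count) => category -> count / total }.toMap
  }
}

object TreeSketch {
  // Node 0 splits on attribute 0: value 0 leads to node 1, value 1 to node 2.
  val tree = SketchTree(
    child = IndexedSeq(Seq(0 -> 1, 1 -> 2), Seq.empty, Seq.empty),
    splittingAttribute = IndexedSeq(Some(0), None, None),
    categoryCounts = IndexedSeq(Seq(0 -> 3, 1 -> 5), Seq(0 -> 3, 1 -> 1), Seq(1 -> 4))
  )
}
```

With this toy tree, an instance whose attribute 0 is 0 lands on node 1 and is assigned category 0, while an instance whose attribute 0 is 1 lands on node 2 and is assigned category 1. Keeping per-node category counts, rather than a single label, is what allows a frequency distribution to be read off at any node.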

Value Members

  1. object DecisionTree extends Serializable

  2. object DecisionTreeTrainer

    Functions for training decision trees.
