Immutable decision tree for integer-valued features and categories.
Each data structure is an indexed sequence of properties. The ith element of each sequence is the property of node i of the decision tree.
stores the children of each node (as a map from attribute values to node ids)
stores the attribute that each node splits on; can be None for leaf nodes
for each node, stores a map from each category to its frequency at that node (i.e. how many training vectors with that category reach this node when they are classified)
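The three parallel sequences described above can be sketched as follows. This is a minimal illustration, not the library's actual class: the names SketchTree, children, splittingAttribute, categoryCounts, findNode and classify are hypothetical, and so are the choices to treat node 0 as the root and to stop at a node when it has no child for the observed attribute value.

```scala
// Hypothetical sketch; names and behavior are assumptions, not the library's API.
final case class SketchTree(
    children: IndexedSeq[Map[Int, Int]],         // node id -> (attribute value -> child node id)
    splittingAttribute: IndexedSeq[Option[Int]], // node id -> attribute split on; None for leaves
    categoryCounts: IndexedSeq[Map[Int, Int]]    // node id -> (category -> training frequency)
) {

  /** Walks down from the root (assumed to be node 0) until it reaches a leaf
    * or a node with no child for the observed attribute value.
    */
  def findNode(attributes: IndexedSeq[Int], node: Int = 0): Int =
    splittingAttribute(node) match {
      case None => node
      case Some(attr) =>
        children(node).get(attributes(attr)) match {
          case None        => node
          case Some(child) => findNode(attributes, child)
        }
    }

  /** Predicts the most frequent training category at the node the instance reaches. */
  def classify(attributes: IndexedSeq[Int]): Int =
    categoryCounts(findNode(attributes)).maxBy(_._2)._1
}

object SketchTreeDemo {
  // Node 0 splits on attribute 0; nodes 1 and 2 are leaves.
  val tree: SketchTree = SketchTree(
    children = IndexedSeq(Map(0 -> 1, 1 -> 2), Map.empty, Map.empty),
    splittingAttribute = IndexedSeq(Some(0), None, None),
    categoryCounts = IndexedSeq(Map(0 -> 3, 1 -> 5), Map(0 -> 3, 1 -> 1), Map(1 -> 4))
  )
}
```

Keeping one sequence per property means that looking up any property of node i is a single indexed access, and the whole tree stays immutable.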
Instance with arbitrary integral attributes
label of instance
value of each attribute
A feature vector with integral features and an integral label.
FeatureVectors is a convenience container for feature vectors.
The number of attributes must be the same for all feature vectors in the container.
collection of FeatureVector objects
Instance with sparse binary attributes
label of instance
number of attributes
which attributes have value 1
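To make the relationship between the dense and sparse representations and the container concrete, here is a small sketch. All names (SketchVector, SketchDense, SketchSparse, SketchVectors and their members) are hypothetical stand-ins, and the label is assumed to be a plain Int; the library's real signatures may differ.

```scala
// Hypothetical sketch; these types only mirror the descriptions above.
sealed trait SketchVector {
  def label: Int
  def numAttributes: Int
  def attribute(i: Int): Int
}

/** Dense form: stores the value of every attribute. */
final case class SketchDense(label: Int, attributes: IndexedSeq[Int]) extends SketchVector {
  def numAttributes: Int = attributes.size
  def attribute(i: Int): Int = attributes(i)
}

/** Sparse binary form: stores only the indices of the attributes whose value is 1. */
final case class SketchSparse(label: Int, numAttributes: Int, trueAttributes: Set[Int])
    extends SketchVector {
  require(
    trueAttributes.forall(i => i >= 0 && i < numAttributes),
    "attribute index out of range"
  )
  def attribute(i: Int): Int = if (trueAttributes.contains(i)) 1 else 0
}

/** Container that enforces the same number of attributes for every vector. */
final case class SketchVectors(vectors: IndexedSeq[SketchVector]) {
  require(
    vectors.map(_.numAttributes).distinct.size <= 1,
    "all feature vectors must have the same number of attributes"
  )
  def numAttributes: Int = vectors.headOption.map(_.numAttributes).getOrElse(0)
}
```

Under these assumptions, SketchSparse(1, 4, Set(1, 3)) and SketchDense(1, IndexedSeq(0, 1, 0, 1)) describe the same instance; the sparse form pays off when most attributes are 0.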
Functions for training decision trees.
Implements C4.5 decision trees for integral labels and attributes.
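For orientation, C4.5 conventionally chooses the attribute to split on by gain ratio: the information gain of the split divided by the entropy of the attribute's own value distribution. The sketch below computes that quantity for a single integral attribute. It is a generic illustration of the textbook criterion; the object and method names are hypothetical, and it is not taken from this package's training code, which may differ in its details.

```scala
// Hypothetical sketch of the textbook C4.5 gain ratio; not this library's code.
object GainRatioSketch {
  private def log2(x: Double): Double = math.log(x) / math.log(2.0)

  /** Shannon entropy, in bits, of a sequence of integer symbols. */
  def entropy(symbols: Seq[Int]): Double = {
    val n = symbols.size.toDouble
    symbols.groupBy(identity).values.map { group =>
      val p = group.size / n
      -p * log2(p)
    }.sum
  }

  /** Gain ratio of splitting on one attribute.
    *
    * @param data (attribute value, label) pairs for the training instances at a node
    */
  def gainRatio(data: Seq[(Int, Int)]): Double = {
    val n = data.size.toDouble
    // Expected label entropy after partitioning the data by attribute value.
    val conditional = data.groupBy(_._1).values.map { part =>
      (part.size / n) * entropy(part.map(_._2))
    }.sum
    val gain = entropy(data.map(_._2)) - conditional
    // Split information penalizes attributes with many distinct values.
    val splitInfo = entropy(data.map(_._1))
    if (splitInfo == 0.0) 0.0 else gain / splitInfo
  }
}
```

An attribute that perfectly separates two equally frequent labels gets a gain ratio close to 1, while an attribute that takes a single value for every instance gets 0 because it cannot split the data.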
The main class to use is org.allenai.nlpstack.parse.poly.decisiontree.DecisionTree. Use the companion object to build the tree, then call one of the tree's two prediction methods.
The tree takes data in the form of org.allenai.nlpstack.parse.poly.decisiontree.FeatureVectors. This is a container for a collection of org.allenai.nlpstack.parse.poly.decisiontree.FeatureVector objects.
The implementations of FeatureVector are org.allenai.nlpstack.parse.poly.decisiontree.SparseVector and org.allenai.nlpstack.parse.poly.decisiontree.DenseVector.