com.salesforce.op.stages.impl.feature
Computed splits
whether the feature should be split
computed split values
bucket labels
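The three fields above can be pictured with a small sketch. The names `SplitsExample`, `SplitsSketch`, and `fromThresholds`, the infinity padding, and the `[lo, hi)` label format are all assumptions made for illustration; they are not taken from the library.

```scala
object SplitsExample {
  // Hypothetical mirror of the documented fields; not the library's actual class.
  final case class SplitsSketch(
    shouldSplit: Boolean,      // whether the feature should be split
    splits: Seq[Double],       // computed split values
    bucketLabels: Seq[String]  // one label per bucket
  )

  // Build a result from raw tree thresholds: dedupe, sort, and pad with
  // infinities so that every value falls into exactly one bucket.
  def fromThresholds(thresholds: Seq[Double]): SplitsSketch = {
    val inner = thresholds.distinct.sorted
    val points = (Double.NegativeInfinity +: inner) :+ Double.PositiveInfinity
    // Each adjacent pair of split points becomes one half-open bucket.
    val labels = points.sliding(2).map { case Seq(lo, hi) => s"[$lo, $hi)" }.toList
    SplitsSketch(shouldSplit = inner.nonEmpty, splits = points, bucketLabels = labels)
  }
}
```

With no thresholds, this sketch reports `shouldSplit = false` and a single bucket covering the whole real line.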
Compute splits using DecisionTreeClassifier
input dataset of (label, feature) tuples
feature name
computed Splits
Criterion used for information gain calculation (case-insensitive). Supported: "entropy" and "gini". (default = gini)
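As a rough illustration of the two criteria, the sketch below computes Gini impurity and entropy (log base 2) from per-class counts and looks a criterion up by name without regard to case. `ImpuritySketch` and `byName` are hypothetical helpers written for this example, not part of the documented API.

```scala
import java.util.Locale

object ImpuritySketch {
  // Class proportions for the non-empty classes; empty when there are no instances.
  private def proportions(counts: Seq[Double]): Seq[Double] = {
    val total = counts.sum
    if (total == 0.0) Seq.empty else counts.filter(_ > 0.0).map(_ / total)
  }

  // Gini impurity: 1 - sum(p_i^2).
  def gini(counts: Seq[Double]): Double = {
    val p = proportions(counts)
    if (p.isEmpty) 0.0 else 1.0 - p.map(x => x * x).sum
  }

  // Entropy in bits: -sum(p_i * log2(p_i)).
  def entropy(counts: Seq[Double]): Double =
    -proportions(counts).map(p => p * math.log(p) / math.log(2.0)).sum

  // Case-insensitive lookup of the criterion (assumed helper).
  def byName(name: String): Seq[Double] => Double =
    name.toLowerCase(Locale.ROOT) match {
      case "gini"    => counts => gini(counts)
      case "entropy" => counts => entropy(counts)
      case other     => throw new IllegalArgumentException(s"Unsupported impurity: $other")
    }
}
```

Both measures are zero for a pure node and reach their maximum when the classes are evenly mixed.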
Maximum number of bins. Must be >= 2 and >= number of categories in any categorical feature. (default = 32)
Maximum depth of the tree (>= 0). E.g., depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes. (default = 5)
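The depth examples generalize to an upper bound for a binary tree: at most 2^depth leaves and 2^depth - 1 internal nodes. A minimal sketch, assuming a binary tree and using the hypothetical names `TreeSizeSketch`, `maxLeaves`, and `maxInternalNodes`:

```scala
object TreeSizeSketch {
  // Upper bound on leaf nodes for a binary tree of the given depth.
  def maxLeaves(depth: Int): Long = {
    require(depth >= 0 && depth < 63, "depth must be in [0, 62] for this sketch")
    1L << depth
  }

  // Upper bound on internal nodes; each internal node contributes one split.
  def maxInternalNodes(depth: Int): Long = maxLeaves(depth) - 1
}
```

Since each internal node contributes one threshold, the default depth of 5 bounds the number of distinct split values at 31 under this assumption.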
Minimum information gain for a split to be considered at a tree node. Should be >= 0.0. (default = 0.0)
Minimum number of instances each child must have after split. If a split causes the left or right child to have fewer than minInstancesPerNode, the split will be discarded as invalid. Should be >= 1. (default = 1)
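The two thresholds, minimum information gain and minimum instances per child, can be read together as a validity check on a candidate split. The sketch below is an illustration under stated assumptions, not the library's implementation: `SplitCheckSketch`, `infoGain`, and `isValid` are hypothetical names, Gini is used as the impurity, and each child is described by its per-class instance counts.

```scala
object SplitCheckSketch {
  // Gini impurity from per-class counts.
  private def gini(counts: Seq[Double]): Double = {
    val total = counts.sum
    if (total == 0.0) 0.0 else 1.0 - counts.map(c => (c / total) * (c / total)).sum
  }

  // Information gain: parent impurity minus the weighted impurity of the children.
  def infoGain(left: Seq[Double], right: Seq[Double]): Double = {
    val parent = left.zipAll(right, 0.0, 0.0).map { case (l, r) => l + r }
    val n = parent.sum
    require(n > 0.0, "a split needs at least one instance")
    gini(parent) - (left.sum / n) * gini(left) - (right.sum / n) * gini(right)
  }

  // A split is kept only if both children are large enough and the gain
  // reaches the minimum threshold.
  def isValid(
    left: Seq[Double],
    right: Seq[Double],
    minInfoGain: Double = 0.0,
    minInstancesPerNode: Int = 1
  ): Boolean =
    left.sum >= minInstancesPerNode &&
      right.sum >= minInstancesPerNode &&
      infoGain(left, right) >= minInfoGain
}
```

For example, a split that sends four instances of one class left and four of another right is accepted with the defaults, but rejected once `minInstancesPerNode` is raised to 5.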