public class RegressionTree extends java.lang.Object implements Regression<double[]>, java.io.Serializable
Classification and Regression Tree (CART) techniques have a number of advantages over many alternative regression techniques. Ensemble techniques such as bagging, boosting, and random forests combine more than one decision tree in their analysis.
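For orientation, a minimal usage sketch (not part of the generated documentation): it assumes the class lives in the smile.regression package and uses only members documented on this page, namely the RegressionTree(double[][] x, double[] y, int maxNodes) constructor and predict(double[] x). The RegressionTreeExample class name and the toy data are made up for illustration.

```java
import smile.regression.RegressionTree;

public class RegressionTreeExample {
    public static void main(String[] args) {
        // Toy data: predict y from two numeric inputs (values are illustrative only).
        double[][] x = {
            {1.0, 2.0}, {2.0, 1.0}, {3.0, 4.0}, {4.0, 3.0}, {5.0, 6.0}, {6.0, 5.0}
        };
        double[] y = {1.5, 1.8, 3.9, 4.1, 6.2, 6.0};

        // Grow a tree with at most 4 leaf nodes, using the
        // RegressionTree(double[][] x, double[] y, int maxNodes) constructor.
        RegressionTree tree = new RegressionTree(x, y, 4);

        // Predict the response for a new instance.
        double yhat = tree.predict(new double[] {3.5, 3.5});
        System.out.println("prediction = " + yhat);
    }
}
```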
See Also: GradientTreeBoost, RandomForest, Serialized Form

Nested Class Summary
Modifier and Type | Class and Description
---|---
static interface | RegressionTree.NodeOutput - An interface to calculate node output.
static class | RegressionTree.Trainer - Trainer for regression tree.
Constructor Summary
Constructor and Description
---
RegressionTree(smile.data.Attribute[] attributes, double[][] x, double[] y, int maxNodes) - Constructor.
RegressionTree(smile.data.Attribute[] attributes, double[][] x, double[] y, int maxNodes, int nodeSize) - Constructor.
RegressionTree(smile.data.Attribute[] attributes, double[][] x, double[] y, int maxNodes, int nodeSize, int mtry, int[][] order, int[] samples, RegressionTree.NodeOutput output) - Constructor.
RegressionTree(double[][] x, double[] y, int maxNodes) - Constructor.
RegressionTree(double[][] x, double[] y, int maxNodes, int nodeSize) - Constructor.
RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes) - Constructor.
RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes, int nodeSize) - Constructor.
RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes, int nodeSize, int[] samples, RegressionTree.NodeOutput output) - Constructor.
Method Summary
Modifier and Type | Method and Description
---|---
java.lang.String | dot() - Returns the graphic representation in Graphviz dot format.
double[] | importance() - Returns the variable importance.
int | maxDepth() - Returns the maximum depth of the tree, i.e. the number of nodes along the longest path from the root node down to the farthest leaf node.
double | predict(double[] x) - Predicts the dependent variable of an instance.
double | predict(int[] x) - Predicts the dependent variable of an instance with sparse binary features.

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail

public RegressionTree(double[][] x, double[] y, int maxNodes)
Parameters:
x - the training instances.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.

public RegressionTree(double[][] x, double[] y, int maxNodes, int nodeSize)
Parameters:
x - the training instances.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the number of instances in a node below which the tree will not split; setting nodeSize = 5 generally gives good results.

public RegressionTree(smile.data.Attribute[] attributes, double[][] x, double[] y, int maxNodes)
Parameters:
attributes - the attribute properties.
x - the training instances.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.

public RegressionTree(smile.data.Attribute[] attributes, double[][] x, double[] y, int maxNodes, int nodeSize)
Parameters:
attributes - the attribute properties.
x - the training instances.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the number of instances in a node below which the tree will not split; setting nodeSize = 5 generally gives good results.

public RegressionTree(smile.data.Attribute[] attributes, double[][] x, double[] y, int maxNodes, int nodeSize, int mtry, int[][] order, int[] samples, RegressionTree.NodeOutput output)
Parameters:
attributes - the attribute properties.
x - the training instances.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the number of instances in a node below which the tree will not split; setting nodeSize = 5 generally gives good results.
mtry - the number of input variables to pick to split on at each node. It seems that p/3 generally gives good performance, where p is the number of variables.
order - the index of training values in ascending order. Note that only numeric attributes need to be sorted.
samples - the sample set of instances for stochastic learning. samples[i] should be 0 or 1 to indicate whether the instance is used for training.
output - an interface to calculate the node output (see RegressionTree.NodeOutput).
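The order and samples arrays for the constructor above are normally prepared by the ensemble classes that call it. The sketch below shows one plausible way to build arrays with the documented shapes: per-attribute row indices in ascending value order, and 0/1 sampling indicators. The StochasticTreeSetup class name, the toy data, and the 70% sampling rate are illustrative assumptions; the arrays are only printed here rather than passed to the constructor.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.stream.IntStream;

public class StochasticTreeSetup {
    public static void main(String[] args) {
        // Toy numeric training data (values are made up for illustration).
        double[][] x = {
            {2.0, 7.5}, {1.0, 3.0}, {4.0, 9.0}, {3.0, 1.0}, {5.0, 6.0}, {0.5, 4.5}
        };

        // order[j] lists the row indices of x sorted by attribute j in ascending order,
        // matching the description of the "order" parameter (only numeric attributes
        // need to be sorted).
        int p = x[0].length;
        int[][] order = new int[p][];
        for (int j = 0; j < p; j++) {
            final int col = j;
            order[j] = IntStream.range(0, x.length)
                    .boxed()
                    .sorted(Comparator.comparingDouble(i -> x[i][col]))
                    .mapToInt(Integer::intValue)
                    .toArray();
        }

        // samples[i] is 0 or 1, as the "samples" parameter requires; here a random
        // ~70% subsample (the rate is an arbitrary choice for the sketch).
        Random rng = new Random(42);
        int[] samples = new int[x.length];
        for (int i = 0; i < x.length; i++) {
            samples[i] = rng.nextDouble() < 0.7 ? 1 : 0;
        }

        System.out.println("order[0] = " + Arrays.toString(order[0]));
        System.out.println("samples  = " + Arrays.toString(samples));
    }
}
```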
public RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes)
Parameters:
numFeatures - the number of sparse binary features.
x - the training instances of sparse binary features.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.

public RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes, int nodeSize)
Parameters:
numFeatures - the number of sparse binary features.
x - the training instances of sparse binary features.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the number of instances in a node below which the tree will not split; setting nodeSize = 5 generally gives good results.

public RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes, int nodeSize, int[] samples, RegressionTree.NodeOutput output)
Parameters:
numFeatures - the number of sparse binary features.
x - the training instances.
y - the response variable.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the number of instances in a node below which the tree will not split; setting nodeSize = 5 generally gives good results.
samples - the sample set of instances for stochastic learning. samples[i] should be 0 or 1 to indicate whether the instance is used for training.
output - an interface to calculate the node output (see RegressionTree.NodeOutput).

Method Detail

public double[] importance()
Returns the variable importance.
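A small sketch of inspecting the importance scores: it assumes importance() returns one score per input column, in column order, and that larger values mean a variable contributed more to the splits, which is the usual CART convention rather than something stated on this page. The ImportanceExample class name, the data, and the smile.regression package path are assumptions.

```java
import smile.regression.RegressionTree;

public class ImportanceExample {
    public static void main(String[] args) {
        // Toy data where the first column tracks y and the second is mostly noise.
        double[][] x = {{1, 10}, {2, 9}, {3, 8}, {4, 7}, {5, 6}, {6, 5}};
        double[] y = {1.0, 2.1, 2.9, 4.2, 5.0, 6.1};
        RegressionTree tree = new RegressionTree(x, y, 4);

        // Print one importance score per attribute (assumed column order).
        double[] imp = tree.importance();
        for (int j = 0; j < imp.length; j++) {
            System.out.printf("attribute %d importance: %.4f%n", j, imp[j]);
        }
    }
}
```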
public double predict(double[] x)
Predicts the dependent variable of an instance.
Specified by: predict in interface Regression<double[]>
Parameters:
x - the instance.

public double predict(int[] x)
Predicts the dependent variable of an instance with sparse binary features.
Parameters:
x - the instance.
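A hedged sketch of the sparse-binary path: it assumes each int[] instance lists the indices of the features that are present, which is an interpretation of "sparse binary features" rather than something this page states explicitly. The SparseBinaryExample class name, the data, and the smile.regression package path are assumptions; the constructor and predict(int[] x) are documented above.

```java
import smile.regression.RegressionTree;

public class SparseBinaryExample {
    public static void main(String[] args) {
        // Assumed encoding: each row of x lists the indices of the binary features
        // that are "on" for that instance.
        int numFeatures = 5;
        int[][] x = {
            {0, 2},       // instance 0 has features 0 and 2 set
            {1, 3},
            {0, 1, 4},
            {2, 4},
            {3},
            {0, 3, 4}
        };
        double[] y = {1.0, 2.5, 1.8, 3.2, 2.9, 2.2};

        // RegressionTree(int numFeatures, int[][] x, double[] y, int maxNodes)
        RegressionTree tree = new RegressionTree(numFeatures, x, y, 4);

        // predict(int[] x) scores a new instance given in the same sparse encoding.
        double yhat = tree.predict(new int[] {0, 4});
        System.out.println("prediction = " + yhat);
    }
}
```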
public int maxDepth()
Returns the maximum depth of the tree, i.e. the number of nodes along the longest path from the root node down to the farthest leaf node.
public java.lang.String dot()
Returns the graphic representation in Graphviz dot format.
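Since dot() returns a plain String, one way to use it is to write the output to a .dot file and render it with the Graphviz command line, e.g. dot -Tpng tree.dot -o tree.png. The DotExportExample class name, the tree.dot file name, the toy data, and the smile.regression package path are illustrative; only dot() and the three-argument constructor documented above are used.

```java
import smile.regression.RegressionTree;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DotExportExample {
    public static void main(String[] args) throws Exception {
        double[][] x = {{1, 2}, {2, 1}, {3, 4}, {4, 3}, {5, 6}, {6, 5}};
        double[] y = {1.5, 1.8, 3.9, 4.1, 6.2, 6.0};
        RegressionTree tree = new RegressionTree(x, y, 4);

        // Write the Graphviz description to a file; render it afterwards with Graphviz.
        Files.write(Paths.get("tree.dot"), tree.dot().getBytes());
    }
}
```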