public class RegressionTree extends CART implements Regression<smile.data.Tuple>, DataFrameRegression
Classification and Regression Tree (CART) techniques have a number of advantages over many alternative techniques.
Some techniques such as bagging, boosting, and random forest use more than one decision tree for their analysis.
See Also: GradientTreeBoost, RandomForest, Serialized Form

Nested classes inherited from interface Regression: Regression.Metric

| Constructor and Description |
|---|
| RegressionTree(smile.data.DataFrame x, Loss loss, smile.data.type.StructField response, int maxDepth, int maxNodes, int nodeSize, int mtry, int[] samples, int[][] order): Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| protected java.util.Optional<Split> | findBestSplit(LeafNode leaf, int j, double impurity, int lo, int hi): Finds the best split for the given column. |
| static RegressionTree | fit(smile.data.formula.Formula formula, smile.data.DataFrame data): Learns a regression tree. |
| static RegressionTree | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, int maxDepth, int maxNodes, int nodeSize): Learns a regression tree. |
| static RegressionTree | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop): Learns a regression tree. |
| smile.data.formula.Formula | formula(): Returns null if the tree is part of an ensemble algorithm. |
| protected double | impurity(LeafNode node): Returns the impurity of node. |
| protected LeafNode | newNode(int[] nodeSamples): Creates a new leaf node. |
| double | predict(smile.data.Tuple x): Predicts the dependent variable of an instance. |
| smile.data.type.StructType | schema(): Returns the schema of predictors. |

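The table lists findBestSplit as the per-column split search. As a rough illustration of what such a search involves, here is a minimal sketch for one numeric column. It assumes a squared-error criterion, which is only one possible choice of loss, and it is not Smile's implementation: the class `BestSplitSketch` and the method `bestThreshold` are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration of a per-column split search; not part of Smile. */
class BestSplitSketch {
    /**
     * Returns the threshold on x that minimizes the summed squared error of
     * the two child nodes, or NaN if no split satisfies the nodeSize limit.
     */
    static double bestThreshold(double[] x, double[] y, int nodeSize) {
        int n = x.length;
        // Indices of x in ascending order (the role the `order` argument plays).
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        Arrays.sort(idx, Comparator.comparingDouble(i -> x[i]));

        double total = 0.0, totalSq = 0.0;
        for (double v : y) { total += v; totalSq += v * v; }

        double best = Double.NaN;
        double bestSse = Double.POSITIVE_INFINITY;
        double leftSum = 0.0, leftSq = 0.0;
        for (int k = 0; k < n - 1; k++) {
            double v = y[idx[k]];
            leftSum += v;
            leftSq += v * v;
            int nl = k + 1, nr = n - nl;
            // Skip ties in x and splits that would leave a child too small.
            if (x[idx[k]] == x[idx[k + 1]] || nl < nodeSize || nr < nodeSize) continue;
            double rightSum = total - leftSum;
            // For each child, SSE = sum(y^2) - (sum y)^2 / count.
            double sse = (leftSq - leftSum * leftSum / nl)
                       + ((totalSq - leftSq) - rightSum * rightSum / nr);
            if (sse < bestSse) {
                bestSse = sse;
                best = (x[idx[k]] + x[idx[k + 1]]) / 2; // midpoint between neighbors
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 10, 11, 12};
        double[] y = {1.0, 1.2, 0.8, 5.0, 5.2, 4.8};
        System.out.println(bestThreshold(x, y, 1)); // prints 6.5
    }
}
```

Scanning the values in sorted order lets the left and right sums be updated incrementally, so each candidate threshold costs constant time after the initial sort.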
Methods inherited from class CART: clear, dot, findBestSplit, importance, order, predictors, root, shap, shap, size, split, toString

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface Regression: applyAsDouble, metric, metric, predict

Methods inherited from interface DataFrameRegression: predict

public RegressionTree(smile.data.DataFrame x, Loss loss, smile.data.type.StructField response, int maxDepth, int maxNodes, int nodeSize, int mtry, int[] samples, int[][] order)

Constructor.

Parameters:
x - the data frame of the explanatory variables.
loss - the loss function.
response - the metadata of the response variable.
maxDepth - the maximum depth of the tree.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the minimum size of leaf nodes.
mtry - the number of input variables to pick to split on at each node. It seems that sqrt(p) gives generally good performance, where p is the number of variables.
samples - the sample set of instances for stochastic learning. samples[i] is the number of times instance i is sampled.
order - the index of training values in ascending order. Note that only numeric attributes need be sorted.

protected double impurity(LeafNode node)
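Description copied from class CART: Returns the impurity of node.

protected LeafNode newNode(int[] nodeSamples)

Description copied from class CART: Creates a new leaf node.

protected java.util.Optional<Split> findBestSplit(LeafNode leaf, int j, double impurity, int lo, int hi)

Description copied from class CART: Finds the best split for the given column.
Specified by: findBestSplit in class CART

public static RegressionTree fit(smile.data.formula.Formula formula, smile.data.DataFrame data)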
Learns a regression tree.

Parameters:
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static RegressionTree fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)

Learns a regression tree. The hyper-parameters in prop include:
smile.cart.node.size
smile.cart.max.nodes
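
The two keys above are ordinary java.util.Properties entries. A minimal sketch of assembling them follows; the helper class `CartProperties` and the values 5 and 100 are hypothetical and chosen only for illustration, not recommended settings or library defaults.

```java
import java.util.Properties;

/** Hypothetical helper for building tree hyper-parameters; not part of Smile. */
class CartProperties {
    /** Builds a Properties object holding the two keys listed above. */
    static Properties of(int nodeSize, int maxNodes) {
        Properties prop = new Properties();
        // Properties values are strings, so the integers are converted.
        prop.setProperty("smile.cart.node.size", Integer.toString(nodeSize));
        prop.setProperty("smile.cart.max.nodes", Integer.toString(maxNodes));
        return prop;
    }

    public static void main(String[] args) {
        Properties prop = of(5, 100); // illustrative values only
        System.out.println(prop.getProperty("smile.cart.node.size")); // prints 5
        System.out.println(prop.getProperty("smile.cart.max.nodes")); // prints 100
        // With Smile on the classpath, prop would then be passed as the third
        // argument of RegressionTree.fit(formula, data, prop).
    }
}
```

Keeping the hyper-parameters in a Properties object makes it straightforward to load them from a configuration file instead of hard-coding them.
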
Parameters:
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
prop - training algorithm hyper-parameters and properties.

public static RegressionTree fit(smile.data.formula.Formula formula, smile.data.DataFrame data, int maxDepth, int maxNodes, int nodeSize)

Learns a regression tree.

Parameters:
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
maxDepth - the maximum depth of the tree.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the minimum size of leaf nodes.

public double predict(smile.data.Tuple x)

Description copied from interface Regression: Predicts the dependent variable of an instance.
Specified by: predict in interface DataFrameRegression
Specified by: predict in interface Regression<smile.data.Tuple>
Parameters:
x - an instance.

public smile.data.formula.Formula formula()

Returns null if the tree is part of an ensemble algorithm.
Specified by: formula in interface DataFrameRegression

public smile.data.type.StructType schema()

Description copied from interface DataFrameRegression: Returns the schema of predictors.
Specified by: schema in interface DataFrameRegression
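
The constructor's mtry, samples, and order arguments are easier to picture with concrete values. The sketch below shows one plausible way to derive them, under stated assumptions: mtry uses the sqrt(p) heuristic mentioned in the parameter description, samples comes from a bootstrap draw (one common form of stochastic learning), and order is computed for a single numeric column. The class `TreeInputsSketch` and its methods are hypothetical and not part of Smile.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

/** Hypothetical helpers illustrating the constructor inputs; not part of Smile. */
class TreeInputsSketch {
    /** sqrt(p) heuristic for mtry, never smaller than 1. */
    static int mtry(int p) {
        return Math.max(1, (int) Math.floor(Math.sqrt(p)));
    }

    /** Bootstrap draw: samples[i] counts how many times instance i was picked. */
    static int[] bootstrapCounts(int n, Random rng) {
        int[] samples = new int[n];
        for (int k = 0; k < n; k++) samples[rng.nextInt(n)]++;
        return samples;
    }

    /** Indices of one numeric column's values in ascending order. */
    static int[] ascendingOrder(double[] column) {
        Integer[] idx = new Integer[column.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, Comparator.comparingDouble(i -> column[i]));
        return Arrays.stream(idx).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        System.out.println(mtry(16)); // prints 4
        System.out.println(Arrays.toString(bootstrapCounts(5, new Random(42))));
        System.out.println(Arrays.toString(ascendingOrder(new double[]{3.0, 1.0, 2.0}))); // prints [1, 2, 0]
    }
}
```

An instance with samples[i] equal to 0 was not drawn and does not take part in training, while a count above 1 weights that instance more heavily.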