public class RegressionTree extends CART implements Regression<smile.data.Tuple>, DataFrameRegression
Classification and regression tree (CART) techniques have a number of advantages over many alternative techniques.
Some techniques such as bagging, boosting, and random forest use more than one decision tree for their analysis.
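As a rough illustration of how an ensemble combines more than one tree, the sketch below averages the outputs of several models, which is the combining step used in bagging-style regression. The class `EnsembleAverageSketch` and the lambda "trees" are hypothetical stand-ins written for this example; they are not part of Smile and do not reflect how GradientTreeBoost or RandomForest are implemented.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical helper for illustration only; not a Smile class.
class EnsembleAverageSketch {
    /** Returns the mean prediction of all models for instance x. */
    static double predict(List<ToDoubleFunction<double[]>> trees, double[] x) {
        double sum = 0.0;
        for (ToDoubleFunction<double[]> tree : trees) {
            sum += tree.applyAsDouble(x);
        }
        return sum / trees.size();
    }

    public static void main(String[] args) {
        // Three stand-in "trees", each a single threshold split on feature 0.
        List<ToDoubleFunction<double[]>> trees = List.of(
            x -> x[0] < 1.0 ? 10.0 : 20.0,
            x -> x[0] < 2.0 ? 12.0 : 22.0,
            x -> x[0] < 3.0 ? 14.0 : 24.0
        );
        System.out.println(predict(trees, new double[] {2.5}));
    }
}
```

Boosting differs in that trees are fitted sequentially and their outputs are summed rather than averaged; the sketch covers only the averaging case.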
See Also: GradientTreeBoost, RandomForest, Serialized Form
Nested classes/interfaces inherited from interface Regression: Regression.Metric
Constructor and Description |
---|
RegressionTree(smile.data.DataFrame x, Loss loss, smile.data.type.StructField response, int maxDepth, int maxNodes, int nodeSize, int mtry, int[] samples, int[][] order) Constructor. |
Modifier and Type | Method and Description |
---|---|
protected java.util.Optional<Split> | findBestSplit(LeafNode leaf, int j, double impurity, int lo, int hi) Finds the best split for the given column. |
static RegressionTree | fit(smile.data.formula.Formula formula, smile.data.DataFrame data) Learns a regression tree. |
static RegressionTree | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, int maxDepth, int maxNodes, int nodeSize) Learns a regression tree. |
static RegressionTree | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop) Learns a regression tree. |
smile.data.formula.Formula | formula() Returns null if the tree is part of an ensemble algorithm. |
protected double | impurity(LeafNode node) Returns the impurity of node. |
protected LeafNode | newNode(int[] nodeSamples) Creates a new leaf node. |
double | predict(smile.data.Tuple x) Predicts the dependent variable of an instance. |
smile.data.type.StructType | schema() Returns the schema of predictors. |
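Methods inherited from class CART: clear, dot, findBestSplit, importance, order, predictors, root, shap, shap, size, split, toString
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Regression: applyAsDouble, metric, metric, predict
Methods inherited from interface DataFrameRegression: predict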
public RegressionTree(smile.data.DataFrame x, Loss loss, smile.data.type.StructField response, int maxDepth, int maxNodes, int nodeSize, int mtry, int[] samples, int[][] order)
Parameters:
x - the data frame of the explanatory variables.
loss - the loss function.
response - the metadata of the response variable.
maxDepth - the maximum depth of the tree.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the minimum size of leaf nodes.
mtry - the number of input variables to pick to split on at each node. sqrt(p) generally gives good performance, where p is the number of variables.
samples - the sample set of instances for stochastic learning. samples[i] is the number of times instance i is sampled.
order - the index of training values in ascending order. Note that only numeric attributes need to be sorted.
protected double impurity(LeafNode node)
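Description copied from class: CART
Returns the impurity of node.
protected LeafNode newNode(int[] nodeSamples)
Description copied from class: CART
Creates a new leaf node.
protected java.util.Optional<Split> findBestSplit(LeafNode leaf, int j, double impurity, int lo, int hi)
Description copied from class: CART
Finds the best split for given column.
findBestSplit in class CART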
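To make the roles of the constructor's `samples` and `order` arguments and of the per-column split search more concrete, here is a minimal sketch of a least-squares split search over a single numeric column. It is not Smile's implementation: the class `SplitSearchSketch`, its method names, and the choice of squared-error reduction as the split criterion are assumptions made for this example. It assumes `order` lists row indices in ascending order of the column value, and that `samples[i]` is how many times row i was drawn (0 means the row is left out).

```java
import java.util.Arrays;

// Hypothetical illustration; names and criterion are assumptions, not Smile's code.
class SplitSearchSketch {
    /** Row indices sorted by ascending column value (one column of an order index). */
    static int[] argsort(double[] x) {
        Integer[] idx = new Integer[x.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(x[a], x[b]));
        int[] order = new int[x.length];
        for (int i = 0; i < order.length; i++) order[i] = idx[i];
        return order;
    }

    /** Returns {threshold, squared-error reduction}, or null if no split helps. */
    static double[] bestSplit(double[] x, double[] y, int[] samples, int[] order, int nodeSize) {
        double total = 0.0;
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            total += samples[i] * y[i];
            n += samples[i];
        }
        if (n == 0) return null;

        double parentScore = total * total / n;
        double bestGain = 0.0;
        double bestThreshold = Double.NaN;
        double leftSum = 0.0;
        int leftN = 0;
        int prev = -1; // previous in-sample row in sorted order

        for (int i : order) {
            if (samples[i] == 0) continue; // row not drawn into this sample
            // A split is only possible between two distinct values,
            // and both children must hold at least nodeSize samples.
            if (prev >= 0 && x[i] > x[prev] && leftN >= nodeSize && n - leftN >= nodeSize) {
                double rightSum = total - leftSum;
                int rightN = n - leftN;
                double gain = leftSum * leftSum / leftN + rightSum * rightSum / rightN - parentScore;
                if (gain > bestGain) {
                    bestGain = gain;
                    bestThreshold = (x[i] + x[prev]) / 2;
                }
            }
            leftSum += samples[i] * y[i];
            leftN += samples[i];
            prev = i;
        }
        return Double.isNaN(bestThreshold) ? null : new double[] {bestThreshold, bestGain};
    }

    public static void main(String[] args) {
        double[] x = {3, 1, 4, 2};
        double[] y = {5, 1, 5, 1};
        int[] samples = {1, 1, 1, 1};
        System.out.println(Arrays.toString(bestSplit(x, y, samples, argsort(x), 1)));
    }
}
```

Sorting each numeric column once up front means the scan over candidate thresholds is a single linear pass per column, which is presumably why the constructor accepts a precomputed `order`.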
public static RegressionTree fit(smile.data.formula.Formula formula, smile.data.DataFrame data)
Parameters:
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
public static RegressionTree fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)
The hyper-parameters in prop include smile.cart.node.size and smile.cart.max.nodes.
Parameters:
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
prop - training algorithm hyper-parameters and properties.
public static RegressionTree fit(smile.data.formula.Formula formula, smile.data.DataFrame data, int maxDepth, int maxNodes, int nodeSize)
Parameters:
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.
maxDepth - the maximum depth of the tree.
maxNodes - the maximum number of leaf nodes in the tree.
nodeSize - the minimum size of leaf nodes.
public double predict(smile.data.Tuple x)
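Description copied from interface: Regression
Predicts the dependent variable of an instance.
Specified by: predict in interface DataFrameRegression
Specified by: predict in interface Regression<smile.data.Tuple>
Parameters:
x - an instance.
public smile.data.formula.Formula formula()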
Specified by: formula in interface DataFrameRegression
public smile.data.type.StructType schema()
Description copied from interface: DataFrameRegression
Returns the schema of predictors.
Specified by: schema in interface DataFrameRegression
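For the fit(Formula, DataFrame, Properties) overload, the hyper-parameters travel as string-valued entries in a java.util.Properties object. The sketch below builds such an object using the two keys named on this page and reads them back as integers. The helper class `CartPropertiesSketch`, its methods, and the fallback values are hypothetical and chosen only for illustration; the defaults that Smile itself applies are not stated here. The call to fit is left as a comment so that the example compiles without Smile on the classpath.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not a Smile class.
class CartPropertiesSketch {
    /** Packs the two documented hyper-parameters into a Properties object. */
    static Properties cartProperties(int nodeSize, int maxNodes) {
        Properties props = new Properties();
        props.setProperty("smile.cart.node.size", Integer.toString(nodeSize));
        props.setProperty("smile.cart.max.nodes", Integer.toString(maxNodes));
        return props;
    }

    /** Reads an integer property, using an illustrative fallback when the key is absent. */
    static int intProperty(Properties props, String key, int fallback) {
        return Integer.parseInt(props.getProperty(key, Integer.toString(fallback)));
    }

    public static void main(String[] args) {
        Properties props = cartProperties(5, 100);
        System.out.println(intProperty(props, "smile.cart.node.size", 1));
        System.out.println(intProperty(props, "smile.cart.max.nodes", 1));
        // With Smile available, the object would be passed on as:
        // RegressionTree model = RegressionTree.fit(formula, data, props);
    }
}
```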