public class AdaBoost extends java.lang.Object implements SoftClassifier<double[]>, java.io.Serializable
AdaBoost calls a weak classifier repeatedly in a series of rounds, producing a total of T classifiers. Before each call, a distribution of weights over the training examples is updated to indicate how important each example is to the current round. On each round, the weights of incorrectly classified examples are increased (or, equivalently, the weights of correctly classified examples are decreased), so that the next classifier focuses on the examples that earlier classifiers got wrong.

The basic AdaBoost algorithm handles only binary classification problems. For multi-class classification, a common approach is to reduce the multi-class problem to multiple two-class problems. This implementation is a multi-class AdaBoost without such reductions.
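The reweighting loop described above can be sketched in a few lines. The following is a simplified, self-contained SAMME-style update (a standard multi-class generalization of AdaBoost); it illustrates the technique only and is not Smile's actual implementation:

```java
import java.util.Arrays;

public class BoostStep {
    /**
     * One boosting round (SAMME-style multi-class variant): compute the
     * weighted error of the weak learner, its voting weight alpha, and
     * re-weight the examples so misclassified ones gain importance.
     * Updates w in place and returns alpha. Assumes 0 < err < 1.
     */
    static double reweight(int[] y, int[] pred, double[] w, int k) {
        double err = 0.0;
        for (int i = 0; i < y.length; i++) {
            if (pred[i] != y[i]) err += w[i];   // weighted training error
        }
        // The log(k - 1) term makes the update valid for k > 2 classes;
        // at k = 2 this reduces to the classic binary AdaBoost weight.
        double alpha = Math.log((1 - err) / err) + Math.log(k - 1.0);
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            if (pred[i] != y[i]) w[i] *= Math.exp(alpha); // boost the hard examples
            sum += w[i];
        }
        for (int i = 0; i < w.length; i++) w[i] /= sum;   // renormalize to a distribution
        return alpha;
    }

    public static void main(String[] args) {
        int[] y    = {0, 1, 2, 1};
        int[] pred = {0, 1, 1, 1};          // one mistake (index 2)
        double[] w = {0.25, 0.25, 0.25, 0.25};
        double alpha = reweight(y, pred, w, 3);
        System.out.println(alpha > 0);       // weak learner beats random guessing
        System.out.println(w[2] > w[0]);     // the misclassified example gained weight
        System.out.println(Math.abs(Arrays.stream(w).sum() - 1.0) < 1e-12);
    }
}
```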
Modifier and Type | Class and Description |
---|---|
static class | AdaBoost.Trainer: Trainer for AdaBoost classifiers. |
Constructor and Description |
---|
AdaBoost(smile.data.Attribute[] attributes, double[][] x, int[] y, int ntrees): Constructor. |
AdaBoost(smile.data.Attribute[] attributes, double[][] x, int[] y, int ntrees, int maxNodes): Constructor. |
AdaBoost(double[][] x, int[] y, int ntrees): Constructor. |
AdaBoost(double[][] x, int[] y, int ntrees, int maxNodes): Constructor. |
Modifier and Type | Method and Description |
---|---|
DecisionTree[] | getTrees(): Returns the decision trees. |
double[] | importance(): Returns the variable importance. |
int | predict(double[] x): Predicts the class label of an instance. |
int | predict(double[] x, double[] posteriori): Predicts the class label of an instance and also calculates a posteriori probabilities. |
int | size(): Returns the number of trees in the model. |
double[] | test(double[][] x, int[] y): Test the model on a validation dataset. |
double[][] | test(double[][] x, int[] y, ClassificationMeasure[] measures): Test the model on a validation dataset. |
void | trim(int ntrees): Trims the tree model set to a smaller size in case of over-fitting. |
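The summary above lists predict(double[] x, double[] posteriori), which returns a label and fills an array with class probabilities. As an illustration of how a boosted ensemble can turn weighted tree votes into a posteriori estimates, here is a self-contained sketch; the vote layout and weights are hypothetical, not Smile's internals:

```java
public class WeightedVote {
    /**
     * Aggregate per-tree class votes into a posteriori estimates by
     * weighted voting: each tree adds its boosting weight to the class
     * it votes for, and the tally is normalized to sum to one.
     * Returns the index of the most probable class.
     */
    static int vote(int[] treeVotes, double[] alpha, double[] posteriori) {
        for (int t = 0; t < treeVotes.length; t++) {
            posteriori[treeVotes[t]] += alpha[t];
        }
        double sum = 0.0;
        for (double p : posteriori) sum += p;
        int best = 0;
        for (int c = 0; c < posteriori.length; c++) {
            posteriori[c] /= sum;                     // normalize to probabilities
            if (posteriori[c] > posteriori[best]) best = c;
        }
        return best;                                  // predicted class label
    }

    public static void main(String[] args) {
        int[] votes = {0, 1, 1};                      // three trees' class votes
        double[] alpha = {0.5, 1.0, 1.5};             // their boosting weights
        double[] posteriori = new double[3];
        int label = vote(votes, alpha, posteriori);
        System.out.println(label);                    // class 1 wins: weight 2.5 vs 0.5
        System.out.println(posteriori[1] > 0.8);      // and dominates the posteriori
    }
}
```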
public AdaBoost(double[][] x, int[] y, int ntrees)
Constructor.
Parameters:
x - the training instances.
y - the response variable.
ntrees - the number of trees.

public AdaBoost(double[][] x, int[] y, int ntrees, int maxNodes)
Constructor.
Parameters:
x - the training instances.
y - the response variable.
ntrees - the number of trees.
maxNodes - the maximum number of leaf nodes in the trees.

public AdaBoost(smile.data.Attribute[] attributes, double[][] x, int[] y, int ntrees)
Constructor.
Parameters:
attributes - the attribute properties.
x - the training instances.
y - the response variable.
ntrees - the number of trees.

public AdaBoost(smile.data.Attribute[] attributes, double[][] x, int[] y, int ntrees, int maxNodes)
Constructor.
Parameters:
attributes - the attribute properties.
x - the training instances.
y - the response variable.
ntrees - the number of trees.
maxNodes - the maximum number of leaf nodes in the trees.

public double[] importance()
Returns the variable importance.

public int size()
Returns the number of trees in the model.
public void trim(int ntrees)
Trims the tree model set to a smaller size in case of over-fitting.
Parameters:
ntrees - the new (smaller) size of the tree model set.

public int predict(double[] x)
Predicts the class label of an instance.
Specified by:
predict in interface Classifier<double[]>
Parameters:
x - the instance to be classified.

public int predict(double[] x, double[] posteriori)
Predicts the class label of an instance and also calculates a posteriori probabilities.
Specified by:
predict in interface SoftClassifier<double[]>
Parameters:
x - the instance to be classified.
posteriori - the array to store a posteriori probabilities on output.

public double[] test(double[][] x, int[] y)
Test the model on a validation dataset.
Parameters:
x - the test data set.
y - the test data response values.

public double[][] test(double[][] x, int[] y, ClassificationMeasure[] measures)
Test the model on a validation dataset.
Parameters:
x - the test data set.
y - the test data labels.
measures - the performance measures of classification.

public DecisionTree[] getTrees()
Returns the decision trees.
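Since test(double[][] x, int[] y) returns a double[], a plausible reading (consistent with other Smile ensemble classifiers) is one score per ensemble prefix size, i.e. the accuracy using the first 1, 2, ..., ntrees trees; a flattening or declining curve is what motivates trim(ntrees). A self-contained sketch of building such a learning curve, where the preds array is hypothetical and stands in for an ensemble evaluated with its first t trees:

```java
public class LearningCurve {
    /** Plain classification accuracy: fraction of matching labels. */
    static double accuracy(int[] truth, int[] pred) {
        int hits = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i] == pred[i]) hits++;
        }
        return (double) hits / truth.length;
    }

    public static void main(String[] args) {
        // preds[t][i]: hypothetical prediction for example i when only the
        // first t+1 trees of the ensemble vote.
        int[] y = {0, 1, 1, 0};
        int[][] preds = {
            {0, 0, 1, 0},   // 1 tree:  3/4 correct
            {0, 1, 1, 0},   // 2 trees: 4/4 correct
        };
        double[] curve = new double[preds.length];
        for (int t = 0; t < preds.length; t++) {
            curve[t] = accuracy(y, preds[t]);
        }
        System.out.println(curve[0]);   // 0.75
        System.out.println(curve[1]);   // 1.0
    }
}
```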