public interface CrossValidation
Modifier and Type | Method and Description |
---|---|
static <M extends DataFrameClassifier> ClassificationValidations<M> | classification(int k, smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.function.BiFunction<smile.data.formula.Formula,smile.data.DataFrame,M> trainer) Runs classification cross validation. |
static <T,M extends Classifier<T>> ClassificationValidations<M> | classification(int k, T[] x, int[] y, java.util.function.BiFunction<T[],int[],M> trainer) Runs classification cross validation. |
static Bag[] | nonoverlap(int[] group, int k) Cross validation with non-overlapping groups. |
static Bag[] | of(int[] category, int k) Cross validation with stratified folds. |
static Bag[] | of(int n, int k) Creates a k-fold cross validation. |
static Bag[] | of(int n, int k, boolean shuffle) Creates a k-fold cross validation. |
static <M extends DataFrameRegression> RegressionValidations<M> | regression(int k, smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.function.BiFunction<smile.data.formula.Formula,smile.data.DataFrame,M> trainer) Runs regression cross validation. |
static <T,M extends Regression<T>> RegressionValidations<M> | regression(int k, T[] x, double[] y, java.util.function.BiFunction<T[],double[],M> trainer) Runs regression cross validation. |
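
The k-fold factories listed above take a sample count `n` and a fold count `k`, optionally shuffling first. As a rough illustration of what such a split involves, here is a minimal plain-Java sketch. It does not use or reproduce the library's code: the class `KFoldSketch`, its nested `Fold` type, the `split` method, and the explicit `seed` parameter are all hypothetical names introduced for this example, and the rule for distributing leftover samples is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; not the library's implementation. */
final class KFoldSketch {
    /** One round: indices used for training and indices held out for testing. */
    static final class Fold {
        final int[] train;
        final int[] test;

        Fold(int[] train, int[] test) {
            this.train = train;
            this.test = test;
        }
    }

    /** Splits indices 0..n-1 into k folds; each index is held out exactly once. */
    static Fold[] split(int n, int k, boolean shuffle, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("require 2 <= k <= n");
        }
        List<Integer> order = new ArrayList<>(n);
        for (int i = 0; i < n; i++) order.add(i);
        if (shuffle) Collections.shuffle(order, new Random(seed));

        Fold[] folds = new Fold[k];
        int base = n / k;   // minimum size of a held-out fold
        int extra = n % k;  // assumption: the first `extra` folds get one more sample
        int start = 0;
        for (int f = 0; f < k; f++) {
            int size = base + (f < extra ? 1 : 0);
            int end = start + size;
            int[] test = new int[size];
            int[] train = new int[n - size];
            int ti = 0;
            int tr = 0;
            for (int i = 0; i < n; i++) {
                int idx = order.get(i);
                if (i >= start && i < end) test[ti++] = idx;
                else train[tr++] = idx;
            }
            folds[f] = new Fold(train, test);
            start = end;
        }
        return folds;
    }

    public static void main(String[] args) {
        for (Fold fold : split(10, 3, false, 0L)) {
            System.out.println("train=" + Arrays.toString(fold.train)
                    + " test=" + Arrays.toString(fold.test));
        }
    }
}
```

Passing a seed keeps the shuffled split reproducible across runs, which makes fold-level results easier to compare between models.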
static Bag[] of(int n, int k)
n - the number of samples.
k - the number of rounds of cross validation.

static Bag[] of(int n, int k, boolean shuffle)
n - the number of samples.
k - the number of rounds of cross validation.
shuffle - whether to shuffle samples before splitting.

static Bag[] of(int[] category, int k)
category - the strata labels.
k - the number of folds.

static Bag[] nonoverlap(int[] group, int k)
Cross validation with non-overlapping groups. This is useful when the i.i.d. assumption is known to be violated by the underlying process generating the data. For example, when there are multiple samples from the same user and the model should not learn user-specific features that fail to generalize to unseen users, this approach can be used.
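
To make the grouping idea concrete, the sketch below assigns samples to folds so that every sample with the same group label (for instance, the same user) ends up in the same fold. It is a plain-Java illustration under stated assumptions, not the library's algorithm: the class `GroupFoldSketch`, the method `assignFolds`, and the greedy largest-group-first balancing rule are all hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the library's implementation. */
final class GroupFoldSketch {
    /**
     * Assigns each sample to one of k folds so that all samples sharing a
     * group label land in the same fold. Returns the fold index per sample.
     */
    static int[] assignFolds(int[] group, int k) {
        // Count samples per group, remembering first-seen order.
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int g : group) counts.merge(g, 1, Integer::sum);
        if (k < 2 || k > counts.size()) {
            throw new IllegalArgumentException("require 2 <= k <= number of groups");
        }

        // Assumption: place the largest groups first so fold sizes stay balanced.
        List<Map.Entry<Integer, Integer>> bySize = new ArrayList<>(counts.entrySet());
        bySize.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        int[] foldSize = new int[k];
        Map<Integer, Integer> foldOfGroup = new LinkedHashMap<>();
        for (Map.Entry<Integer, Integer> e : bySize) {
            // Greedy step: put the group into the currently smallest fold.
            int smallest = 0;
            for (int f = 1; f < k; f++) {
                if (foldSize[f] < foldSize[smallest]) smallest = f;
            }
            foldOfGroup.put(e.getKey(), smallest);
            foldSize[smallest] += e.getValue();
        }

        int[] fold = new int[group.length];
        for (int i = 0; i < group.length; i++) {
            fold[i] = foldOfGroup.get(group[i]);
        }
        return fold;
    }

    public static void main(String[] args) {
        int[] user = {7, 7, 7, 3, 3, 9, 9, 5};
        System.out.println(Arrays.toString(assignFolds(user, 2)));
    }
}
```

Because a group is never split, a model evaluated on one fold has seen no samples from the users in that fold during training.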
group - the group labels of the samples.
k - the number of folds.

static <T,M extends Classifier<T>> ClassificationValidations<M> classification(int k, T[] x, int[] y, java.util.function.BiFunction<T[],int[],M> trainer)
k - k-fold cross validation.
x - the samples.
y - the sample labels.
trainer - the lambda to train a model.

static <M extends DataFrameClassifier> ClassificationValidations<M> classification(int k, smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.function.BiFunction<smile.data.formula.Formula,smile.data.DataFrame,M> trainer)
k - k-fold cross validation.
formula - the model specification.
data - the training/validation data.
trainer - the lambda to train a model.

static <T,M extends Regression<T>> RegressionValidations<M> regression(int k, T[] x, double[] y, java.util.function.BiFunction<T[],double[],M> trainer)
k - k-fold cross validation.
x - the samples.
y - the response variable.
trainer - the lambda to train a model.

static <M extends DataFrameRegression> RegressionValidations<M> regression(int k, smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.function.BiFunction<smile.data.formula.Formula,smile.data.DataFrame,M> trainer)
k - k-fold cross validation.
formula - the model specification.
data - the training/validation data.
trainer - the lambda to train a model.
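
The `classification` and `regression` methods above all follow the same pattern: split the data into `k` folds, call the `trainer` lambda on the training part of each round, and evaluate the resulting model on the held-out part. The following plain-Java sketch shows that loop for classification. It is an illustration under stated assumptions rather than the library's code: `CrossValidationSketch`, `accuracy`, and `majority` are hypothetical names, the model is represented by a standard `ToIntFunction` instead of the library's classifier types, the split is a simple unshuffled `i % k` rule, and only per-round accuracy is reported.

```java
import java.util.Arrays;
import java.util.function.BiFunction;
import java.util.function.ToIntFunction;

/** Hypothetical illustration only; not the library's implementation. */
final class CrossValidationSketch {
    /** Returns the accuracy on the held-out fold for each of the k rounds. */
    static <T> double[] accuracy(int k, T[] x, int[] y,
            BiFunction<T[], int[], ToIntFunction<T>> trainer) {
        int n = x.length;
        if (y.length != n || k < 2 || k > n) {
            throw new IllegalArgumentException("require x.length == y.length and 2 <= k <= n");
        }
        double[] acc = new double[k];
        for (int f = 0; f < k; f++) {
            // Assumption: sample i is held out in round f when i % k == f.
            int testSize = (n - f + k - 1) / k;
            T[] trainX = Arrays.copyOf(x, n - testSize); // same runtime type as x
            int[] trainY = new int[n - testSize];
            int t = 0;
            for (int i = 0; i < n; i++) {
                if (i % k != f) {
                    trainX[t] = x[i];
                    trainY[t] = y[i];
                    t++;
                }
            }
            // Train on the k-1 remaining folds, then score on the held-out fold.
            ToIntFunction<T> model = trainer.apply(trainX, trainY);
            int correct = 0;
            for (int i = f; i < n; i += k) {
                if (model.applyAsInt(x[i]) == y[i]) correct++;
            }
            acc[f] = (double) correct / testSize;
        }
        return acc;
    }

    /** A toy trainer: always predicts the most frequent (non-negative) training label. */
    static <T> ToIntFunction<T> majority(T[] x, int[] y) {
        int max = Arrays.stream(y).max().orElse(0);
        int[] count = new int[max + 1];
        for (int label : y) count[label]++;
        int best = 0;
        for (int c = 1; c <= max; c++) {
            if (count[c] > count[best]) best = c;
        }
        final int prediction = best;
        return sample -> prediction;
    }

    public static void main(String[] args) {
        Double[] x = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6};
        int[] y = {0, 0, 0, 0, 1, 1};
        double[] acc = accuracy(3, x, y, (tx, ty) -> majority(tx, ty));
        System.out.println(Arrays.toString(acc));
    }
}
```

Passing the trainer as a lambda means a fresh model is fitted in every round, so no information from a held-out fold leaks into the model that is scored on it.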