Compute the gradient and loss given the features of a single data point.
features for one data point
label for this data point
weights/coefficients corresponding to features
Loss vector for the current point.
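As a rough illustration of the per-point computation described above, the following plain-Python sketch assumes a logistic loss with labels in {0, 1} and a weights matrix holding one row per label. The loss function, the function name `point_gradient_and_loss`, and the data layout are assumptions made for this example; the comments above do not specify them.

```python
import math


def point_gradient_and_loss(features, labels, weights):
    """Return (gradient matrix, loss vector) for a single data point.

    Assumption: logistic loss, labels in {0, 1}, and weights[j] is the
    weight vector for label j (hypothetical layout for this sketch).
    """
    gradient = []
    loss = []
    for w_j, y_j in zip(weights, labels):
        # Margin is the dot product of the label's weights with the features.
        margin = sum(w * x for w, x in zip(w_j, features))
        # Numerically stable sigmoid and log(1 + exp(margin)).
        e = math.exp(-abs(margin))
        prob = 1.0 / (1.0 + e) if margin >= 0 else e / (1.0 + e)
        loss.append(max(margin, 0.0) + math.log1p(e) - y_j * margin)
        # Gradient of the logistic loss with respect to w_j.
        gradient.append([(prob - y_j) * x for x in features])
    return gradient, loss
```

With all weights at zero every margin is 0, so each label's loss is log 2 and the gradient row is (0.5 - label) times the features.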
Computes cumulative gradient and loss for a set of samples
RDD with vectors of features and vectors of labels
Current weights matrix.
Number of samples to collect in a batch before computing.
Routine for extracting the labels vector (in some cases only a subset of the labels is needed)
Tuple with cumulative gradient matrix and loss vector
Computes gradient and loss for a batch containing data and labels from multiple samples.
Samples matrix in row-major form (one row per sample)
Labels matrix in row-major form (one row per sample)
Matrix with weights (column-major)
Matrix with accumulated gradient
Vector with accumulated loss
Array used to cache margin calculations
Number of samples in the batch
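The batch routine described above could look roughly like the following plain-Python sketch. It assumes a logistic loss with labels in {0, 1}, a weights matrix with one row per label stored column-major (element (label j, feature f) at index `f * n_labels + j`), and a gradient buffer with the same layout. The name `accumulate_batch` and these layout details are assumptions for illustration, not taken from the documented code.

```python
import math


def accumulate_batch(samples, labels, weights, gradient, loss,
                     margin_cache, n_samples):
    """Add the batch's gradient and loss into `gradient` and `loss` in place."""
    n_labels = len(loss)
    n_features = len(weights) // n_labels
    # Pass 1: margins = samples * weights^T, cached row-major per sample.
    for i in range(n_samples):
        for j in range(n_labels):
            m = 0.0
            for f in range(n_features):
                m += samples[i * n_features + f] * weights[f * n_labels + j]
            margin_cache[i * n_labels + j] = m
    # Pass 2: accumulate the loss and overwrite each cached margin with
    # its residual (predicted probability minus label).
    for i in range(n_samples):
        for j in range(n_labels):
            idx = i * n_labels + j
            m, y = margin_cache[idx], labels[idx]
            e = math.exp(-abs(m))
            prob = 1.0 / (1.0 + e) if m >= 0 else e / (1.0 + e)
            loss[j] += max(m, 0.0) + math.log1p(e) - y * m
            margin_cache[idx] = prob - y
    # Pass 3: gradient += residuals^T * samples.
    for i in range(n_samples):
        for j in range(n_labels):
            r = margin_cache[i * n_labels + j]
            for f in range(n_features):
                gradient[f * n_labels + j] += r * samples[i * n_features + f]
```

Because the gradient and loss buffers are only added to, the same buffers can be passed to successive batches to obtain cumulative totals.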
Evaluates an upper bound on the regularization parameter for each label, based on the estimate in http://jmlr.org/papers/volume8/koh07a/koh07a.pdf
RDD of samples as features -> labels pairs.
Whether to regularize the last feature (set to false to exclude the intercept from regularization)
A pair with instances count and regularization bounds.
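One reading of the estimate in the referenced paper, for labels in {0, 1}, is that the bound for a label is the largest absolute value of the sum over samples of (label minus mean label) times the feature value, divided by the number of samples. The plain-Python sketch below follows that reading; the function name `regularization_upper_bounds`, the input layout, and the handling of the last feature are assumptions, and the documented implementation may differ.

```python
def regularization_upper_bounds(samples, regularize_last=True):
    """Return (instance count, list of per-label upper bounds).

    `samples` is a list of (features, labels) pairs with labels in {0, 1}.
    Assumption: when `regularize_last` is False, the last feature (the
    intercept) is left out of the maximum.
    """
    count = len(samples)
    n_features = len(samples[0][0])
    n_labels = len(samples[0][1])
    # Mean of each label over all samples.
    means = [sum(lbls[j] for _, lbls in samples) / count
             for j in range(n_labels)]
    limit = n_features if regularize_last else n_features - 1
    bounds = []
    for j in range(n_labels):
        # Correlation of each feature with the centered label.
        corr = [0.0] * limit
        for feats, lbls in samples:
            centered = lbls[j] - means[j]
            for f in range(limit):
                corr[f] += centered * feats[f]
        bounds.append(max(abs(c) for c in corr) / count)
    return count, bounds
```

For example, two samples with features [1, 2] and [1, 0] and labels 1 and 0 give a bound of 0.5 when the last feature is regularized and 0.0 when it is excluded.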
Evaluates an upper bound on the regularization parameter for each label, based on the estimate in http://jmlr.org/papers/volume8/koh07a/koh07a.pdf
DataFrame with samples.
Name of the features column
Name of the labels column.
Whether to regularize the last feature (set to false to exclude the intercept from regularization)
A pair with instances count and regularization bounds.
Implementation of the matrix L-BFGS algorithm. Uses the Breeze implementation of the iterations and supplies it with a specific cost function. The cost function batches cost requests for different labels and evaluates them in a single matrix pass.
DataFrame to run on.
Name of the column with the features vector. Attribute group metadata is required.
Name of the column with the labels vector. Attribute group metadata is required.
Number of corrections in LBFGS iteration
Convergence tolerance for the iteration
Maximum number of iterations
Number of samples to batch before calculating
Map label -> trained weights vector
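To illustrate what the number of corrections controls, the sketch below shows the standard L-BFGS two-loop recursion in plain Python. It is not Breeze's code: the function name `lbfgs_direction` and the history layout are assumptions for this example. Here `s_hist` and `y_hist` hold the most recent position and gradient differences, oldest first, and their length corresponds to the number of corrections.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad, s_hist, y_hist):
    """Return the L-BFGS search direction for the gradient `grad`."""
    q = list(grad)
    alphas = []
    # First loop: newest correction pair to oldest.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        a = _dot(s, q) / _dot(y, s)
        alphas.append(a)
        q = [qi - a * yi for qi, yi in zip(q, y)]
    # Scale by an initial inverse-Hessian estimate from the newest pair.
    if s_hist:
        gamma = _dot(s_hist[-1], y_hist[-1]) / _dot(y_hist[-1], y_hist[-1])
        q = [gamma * qi for qi in q]
    # Second loop: oldest correction pair to newest.
    for s, y, a in zip(s_hist, y_hist, reversed(alphas)):
        b = _dot(y, q) / _dot(y, s)
        q = [qi + (a - b) * si for qi, si in zip(q, s)]
    # Descent direction is the negated approximate Newton step.
    return [-qi for qi in q]
```

With an empty history the result is plain steepest descent. Keeping more correction pairs gives a better curvature estimate at the cost of more memory and work per iteration.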