Skeleton implementation of the Coupled Simulated Annealing algorithm.
Implementation of the standard back-propagation with momentum using the "generalized delta rule".
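As a rough illustration of the generalized delta rule, the weight change at each step combines the current gradient with the previous change (momentum). The sketch below is a minimal, hypothetical Scala version assuming Breeze matrices; it is not the library's actual implementation.

    import breeze.linalg.DenseMatrix

    // Generalized delta rule with momentum (illustrative):
    //   dW(t) = -eta * gradE + alpha * dW(t-1),  W(t+1) = W(t) + dW(t)
    def momentumStep(
      w: DenseMatrix[Double],         // current layer weights
      gradE: DenseMatrix[Double],     // gradient of the error w.r.t. the weights
      prevDelta: DenseMatrix[Double], // weight change applied at the previous step
      eta: Double,                    // learning rate
      alpha: Double                   // momentum coefficient
    ): (DenseMatrix[Double], DenseMatrix[Double]) = {
      val delta = gradE * (-eta) + prevDelta * alpha
      (w + delta, delta)              // updated weights and the new weight change
    }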
Solves the optimization problem pertaining to the weights of a committee model.
Implementation of the Coupled Simulated Annealing algorithm for global optimization.
Constructs a Gaussian process mixture model from a single AbstractGPRegressionModel instance.
High-level interface defining the core functions of a global optimizer.
A common binding characteristic of all "globally optimizable" models, i.e. models whose hyper-parameters can be optimized/tuned.
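Purely as an illustration of what such a contract might expose (the trait and member names below are assumptions, not the library's API): a globally optimizable model publishes its tunable hyper-parameters and an energy/objective that a global optimizer can evaluate.

    // Hypothetical sketch of a "globally optimizable" contract.
    trait GloballyOptimizableSketch {
      // Hyper-parameters the global optimizer is allowed to tune.
      def hyperParameters: Seq[String]
      // Objective value (e.g. negative log marginal likelihood) for a hyper-parameter assignment.
      def energy(h: Map[String, Double]): Double
    }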
The type of the parameters for each layer.
The type of input/output patterns.
Class used to compute the gradient for a loss function, given a single data point.
Implements gradient descent on the generated graph to calculate approximately optimal values of the model parameters.
Implementation of stochastic gradient descent (SGD) on Spark RDDs.
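To illustrate the stochastic gradient descent logic that such a solver parallelizes, here is a minimal in-memory mini-batch sketch (Scala with Breeze; the function name, signature and sampling scheme are assumptions, and an RDD-based version would distribute the gradient aggregation rather than iterate over a local collection).

    import breeze.linalg.DenseVector
    import scala.util.Random

    // Mini-batch SGD sketch: sample a batch, average its gradients, take a decaying step.
    def sgdSketch(
      data: Seq[(Double, DenseVector[Double])],  // (label, features) pairs
      grad: (DenseVector[Double], Double, DenseVector[Double]) => DenseVector[Double], // (x, y, w) => dL/dw
      w0: DenseVector[Double],
      stepSize: Double,
      batchFraction: Double,
      numIterations: Int): DenseVector[Double] = {
      var w = w0
      for (t <- 1 to numIterations) {
        val batch = data.filter(_ => Random.nextDouble() < batchFraction) // Bernoulli sampling
        if (batch.nonEmpty) {
          val avgGrad = batch.map { case (y, x) => grad(x, y, w) }.reduce(_ + _) / batch.size.toDouble
          w = w - avgGrad * (stepSize / math.sqrt(t)) // step size decays as 1/sqrt(t)
        }
      }
      w
    }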
Compute gradient and loss for a Hinge loss function, as used in SVM binary classification. NOTE: This assumes that the labels are {0,1}.
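For illustration only (hypothetical names, Breeze assumed): with {0,1} labels, the label is first rescaled to {-1,+1}, and a non-zero sub-gradient is returned only when the margin is violated.

    import breeze.linalg.DenseVector

    // Hinge loss and sub-gradient sketch for labels in {0, 1}.
    def hingeGradient(x: DenseVector[Double], label: Double, w: DenseVector[Double])
      : (DenseVector[Double], Double) = {
      val y = 2.0 * label - 1.0                       // map {0, 1} -> {-1, +1}
      val margin = y * (w dot x)
      if (margin < 1.0) (x * (-y), 1.0 - margin)      // margin violated: non-zero sub-gradient
      else (DenseVector.zeros[Double](x.length), 0.0) // correctly classified with margin
    }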
Updater for L1 regularized problems: R(w) = ||w||_1. Uses a step-size decreasing with the square root of the number of iterations.
Instead of the subgradient of the regularizer, the proximal operator for the L1 regularization is applied after the gradient step. This is known to result in better sparsity of the intermediate solution.
The corresponding proximal operator for the L1 norm is the soft-thresholding function. That is, each weight component is shrunk towards 0 by shrinkageVal.
If w > shrinkageVal, set weight component to w-shrinkageVal. If w < -shrinkageVal, set weight component to w+shrinkageVal. If -shrinkageVal < w < shrinkageVal, set weight component to 0.
Equivalently, set the weight component to signum(w) * max(0.0, abs(w) - shrinkageVal).
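A minimal sketch of this update, assuming Breeze vectors (function name hypothetical): a plain gradient step followed by component-wise soft-thresholding.

    import breeze.linalg.DenseVector

    // L1 (lasso) update sketch: gradient step, then the soft-thresholding proximal operator.
    def l1Update(w: DenseVector[Double], grad: DenseVector[Double],
                 stepSize: Double, iter: Int, regParam: Double): DenseVector[Double] = {
      val thisStep = stepSize / math.sqrt(iter)     // step size decays as 1/sqrt(iter)
      val shifted = w - grad * thisStep             // plain (sub)gradient step
      val shrinkageVal = regParam * thisStep
      // soft-threshold each component: signum(w) * max(0, |w| - shrinkageVal)
      shifted.map(wi => math.signum(wi) * math.max(0.0, math.abs(wi) - shrinkageVal))
    }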
Solves the linear problem resulting from applying the Karush-Kuhn-Tucker conditions to the Dual Least Squares SVM optimization problem.
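In the standard dual formulation (following Suykens et al.), the KKT conditions reduce to one dense linear system in the bias b and the multipliers a. The Breeze sketch below is illustrative only; the kernel matrix K, the regularization parameter gamma and the function name are assumptions, not the class's actual code.

    import breeze.linalg._

    // LS-SVM KKT system sketch:
    //   [ 0      1^T        ] [ b ]   [ 0 ]
    //   [ 1   K + I/gamma   ] [ a ] = [ y ]
    def solveLSSVMSystem(K: DenseMatrix[Double], y: DenseVector[Double], gamma: Double)
      : (Double, DenseVector[Double]) = {
      val n = y.length
      val A = DenseMatrix.zeros[Double](n + 1, n + 1)
      for (i <- 1 to n) { A(0, i) = 1.0; A(i, 0) = 1.0 }
      for (i <- 1 to n; j <- 1 to n)
        A(i, j) = K(i - 1, j - 1) + (if (i == j) 1.0 / gamma else 0.0)
      val rhs = DenseVector.vertcat(DenseVector(0.0), y)
      val sol = A \ rhs                                // dense linear solve of the KKT system
      (sol(0), sol(1 to n).toDenseVector)              // (bias, Lagrange multipliers)
    }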
Compute gradient and loss for a least-squares loss function, as used in linear regression. This is correct for the averaged least-squares loss function (mean squared error): L = 1/2 ||weights . phi(x) - y||^2.
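For a single data point this amounts to the following sketch (Breeze assumed, names hypothetical): loss 1/2 (w . phi(x) - y)^2 and gradient (w . phi(x) - y) * phi(x).

    import breeze.linalg.DenseVector

    // Least-squares loss and gradient sketch for one (phi(x), y) pair.
    def leastSquaresGradient(phi: DenseVector[Double], y: Double, w: DenseVector[Double])
      : (DenseVector[Double], Double) = {
      val diff = (w dot phi) - y
      (phi * diff, 0.5 * diff * diff)   // (gradient, loss)
    }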
Compute gradient and loss for a least-squares loss function, as used in LS-SVM. This is correct for the averaged least-squares loss function (mean squared error): L = 1/2 (1 - y * (weights dot x))^2. See also the documentation for the precise formulation.
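A corresponding single-point sketch (Breeze assumed, names hypothetical), with labels y in {-1, +1}: the residual is 1 - y * (w . x) and the gradient is -y * residual * x.

    import breeze.linalg.DenseVector

    // LS-SVM squared-error-on-margin sketch: L = 1/2 (1 - y * (w . x))^2.
    def lssvmGradient(x: DenseVector[Double], y: Double, w: DenseVector[Double])
      : (DenseVector[Double], Double) = {
      val residual = 1.0 - y * (w dot x)
      (x * (-y * residual), 0.5 * residual * residual)   // (gradient, loss)
    }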
Compute gradient and loss for a logistic loss function, as used in binary classification.
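As a hedged sketch (Breeze assumed, names hypothetical) with labels y in {0, 1}: p = sigmoid(w . x), gradient (p - y) * x, loss the cross-entropy.

    import breeze.linalg.DenseVector

    // Logistic loss and gradient sketch for one data point, labels in {0, 1}.
    // Illustration only: no numerical safeguards against log(0).
    def logisticGradient(x: DenseVector[Double], y: Double, w: DenseVector[Double])
      : (DenseVector[Double], Double) = {
      val p = 1.0 / (1.0 + math.exp(-(w dot x)))        // predicted probability
      val loss = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
      (x * (p - y), loss)
    }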
A model tuner takes a model which implements GloballyOptimizable and "tunes" it, returning (possibly) a model of a different type.
An updater which uses both the local gradient and the inertia of the system to update the model parameters.
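One common way to realize such an update (illustrative sketch, not the library's implementation): accumulate a velocity term from past gradients and move the parameters along it.

    import breeze.linalg.DenseVector

    // Momentum update sketch: the velocity term carries the system's inertia.
    def momentumUpdate(w: DenseVector[Double], velocity: DenseVector[Double],
                       grad: DenseVector[Double], stepSize: Double, momentum: Double)
      : (DenseVector[Double], DenseVector[Double]) = {
      val v = velocity * momentum - grad * stepSize   // new velocity: inertia plus gradient step
      (w + v, v)                                      // updated parameters and velocity
    }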
Trait for optimization problem solvers.
The type of the parameters of the model to be optimized.
The type of the predictor variable.
The type of the target variable.
The type of the edge containing the features and label.
Builds a GP committee model after performing the CSA routine.
Updater for L2 regularized problems: R(w) = 1/2 ||w||^2. Uses a step-size decreasing with the square root of the number of iterations.
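A minimal sketch of the L2 update (Breeze assumed, function name hypothetical): since the gradient of R(w) is w itself, the weights are shrunk multiplicatively before the gradient step.

    import breeze.linalg.DenseVector

    // L2 (ridge) update sketch with a 1/sqrt(iter) step-size decay.
    def l2Update(w: DenseVector[Double], grad: DenseVector[Double],
                 stepSize: Double, iter: Int, regParam: Double): DenseVector[Double] = {
      val thisStep = stepSize / math.sqrt(iter)
      w * (1.0 - thisStep * regParam) - grad * thisStep   // shrink weights, then gradient step
    }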
Constructs a Gaussian process mixture model from a single AbstractGPRegressionModel instance.
The type of the GP training data.
The index set/input domain of the GP model.