Learning rate (Default 0.03)
L1 regularization factor (Default 0.0)
L2 regularization factor (Default 0.0001)
Momentum factor for adaptive learning (Default 0.0001)
Execute the algorithm for a given sequence of Δweights and a sequence of weights
the sequence of accumulated Δweight
the sequence of current weights
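The reference text does not spell out the update rule, so the following is only a minimal sketch of what one execution step might look like, assuming a classical momentum update with L1 and L2 penalty gradients added to the accumulated Δweight. The names `sgd_step` and `velocity` are hypothetical, as is the convention that the L2 penalty is `l2 * w^2` (giving a gradient of `2 * l2 * w`); the defaults mirror the documented ones.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def _sign(x: float) -> float:
    """Return -1.0, 0.0 or 1.0 depending on the sign of x."""
    return float((x > 0) - (x < 0))


def sgd_step(
    delta: List[Matrix],
    weights: List[Matrix],
    velocity: List[Matrix],
    rate: float = 0.03,
    l1: float = 0.0,
    l2: float = 0.0001,
    momentum: float = 0.0001,
) -> Tuple[List[Matrix], List[Matrix]]:
    """Return (new_weights, new_velocity) after one assumed SGD step.

    delta    -- sequence of accumulated weight changes (gradients)
    weights  -- sequence of current weight matrices
    velocity -- previous update per weight (hypothetical momentum state)
    """
    new_weights: List[Matrix] = []
    new_velocity: List[Matrix] = []
    for dw, w, v in zip(delta, weights, velocity):
        w_out: Matrix = []
        v_out: Matrix = []
        for dw_row, w_row, v_row in zip(dw, w, v):
            w_new_row = []
            v_new_row = []
            for g, x, vel in zip(dw_row, w_row, v_row):
                # Assumption: regularization enters as an extra gradient term.
                grad = g + l1 * _sign(x) + 2.0 * l2 * x
                step = momentum * vel - rate * grad
                v_new_row.append(step)
                w_new_row.append(x + step)
            w_out.append(w_new_row)
            v_out.append(v_new_row)
        new_weights.append(w_out)
        new_velocity.append(v_out)
    return new_weights, new_velocity
```

With regularization and momentum set to zero this reduces to the plain rule `w - rate * Δw`. Whether the library resets the accumulated Δweight after a step is not stated here, so the sketch leaves its inputs untouched.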
L1 regularization factor (Default 0.0)
L2 regularization factor (Default 0.0001)
Compute the weight loss of the given weight parameters
the sequence of weight matrices
the total weight loss of this sequence
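The exact formula for the weight loss is not given in this reference. As a rough illustration, a common choice is the sum of an L1 term and an L2 term over every weight in every matrix. The sketch below assumes that form; the function name `weight_loss` and the unscaled `l2 * w^2` convention are assumptions, not the library's confirmed definition.

```python
from typing import List

Matrix = List[List[float]]


def weight_loss(weights: List[Matrix], l1: float = 0.0, l2: float = 0.0001) -> float:
    """Return the assumed total penalty: sum of l1*|w| + l2*w^2 over all weights."""
    total = 0.0
    for matrix in weights:
        for row in matrix:
            for w in row:
                total += l1 * abs(w) + l2 * w * w
    return total
```

With the documented defaults (`l1 = 0.0`, `l2 = 0.0001`) only the squared term contributes, so the penalty stays small unless the weights grow large.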
Algorithm: Stochastic Gradient Descent
Basic Gradient Descent rule with mini-batch training.
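To make the mini-batch idea concrete, here is a small self-contained sketch that fits a one-parameter model `y = w * x` by accumulating the gradient over each mini-batch and then applying a single update. Everything in it (the function name `minibatch_sgd`, the squared-error loss, averaging the accumulated gradient over the batch) is an illustrative assumption rather than the documented behaviour of this class.

```python
import random
from typing import Iterable, Tuple


def minibatch_sgd(
    data: Iterable[Tuple[float, float]],
    w: float = 0.0,
    rate: float = 0.03,
    batch_size: int = 2,
    epochs: int = 200,
    seed: int = 0,
) -> float:
    """Fit y = w * x with squared error using mini-batch gradient descent."""
    rng = random.Random(seed)
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the samples in a new order each epoch
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            delta = 0.0  # accumulated gradient for this mini-batch
            for x, y in batch:
                delta += (w * x - y) * x  # d/dw of 0.5 * (w*x - y)^2
            w -= rate * delta / len(batch)  # one update per mini-batch
    return w
```

For example, `minibatch_sgd([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)])` should approach `w = 2.0`, since the data are generated exactly by `y = 2 * x`.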