case class Yogi(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter, beta1: OptimizerHyperparameter, beta2: OptimizerHyperparameter, eps: Double, clip: Option[Double], debias: Boolean) extends Optimizer
The Yogi optimizer algorithm. I added the decoupled weight decay term following https://arxiv.org/pdf/1711.05101.pdf.
- See also:
- Companion: object
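
To make the description above more concrete, here is a minimal sketch of a single Yogi step with decoupled weight decay. It is not this class's actual implementation. It works on plain `Array[Double]` rather than `STen`, takes the hyperparameters as plain `Double` values rather than `OptimizerHyperparameter`, and leaves out gradient clipping (`clip`). The names `YogiSketch` and `YogiState` are hypothetical. The bias correction behind `debias` and the scaling of the weight decay by the learning rate are assumptions, modelled on the usual Adam and AdamW formulas.

```scala
// Hypothetical sketch on plain arrays; not the library's implementation.
final case class YogiState(m: Array[Double], v: Array[Double], step: Int)

object YogiSketch {
  def step(
      param: Array[Double],
      grad: Array[Double],
      state: YogiState,
      learningRate: Double,
      weightDecay: Double,
      beta1: Double,
      beta2: Double,
      eps: Double,
      debias: Boolean
  ): (Array[Double], YogiState) = {
    val t = state.step + 1

    // First moment: exponential moving average of the gradient, as in Adam.
    val m = state.m.zip(grad).map { case (mi, gi) =>
      beta1 * mi + (1 - beta1) * gi
    }

    // Second moment: Yogi's additive rule,
    // v - (1 - beta2) * sign(v - g^2) * g^2.
    val v = state.v.zip(grad).map { case (vi, gi) =>
      val g2 = gi * gi
      vi - (1 - beta2) * math.signum(vi - g2) * g2
    }

    // Assumption: bias correction follows Adam's formulas when enabled.
    val c1 = if (debias) 1 - math.pow(beta1, t) else 1.0
    val c2 = if (debias) 1 - math.pow(beta2, t) else 1.0

    val updated = param.indices.map { i =>
      val adaptive = (m(i) / c1) / (math.sqrt(v(i) / c2) + eps)
      // Decoupled weight decay: shrink the parameter directly instead of
      // adding weightDecay * param to the gradient.
      // Assumption: the decay term is scaled by the learning rate.
      param(i) - learningRate * (adaptive + weightDecay * param(i))
    }.toArray

    (updated, YogiState(m, v, t))
  }
}
```

A caller would start from `YogiState(Array.fill(n)(0.0), Array.fill(n)(0.0), 0)` and thread the returned state into the next call. Because the weight decay is decoupled, a parameter with a zero gradient still shrinks toward zero on each step.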