case class AdamW(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter, beta1: OptimizerHyperparameter, beta2: OptimizerHyperparameter, eps: Double, clip: Option[Double], debias: Boolean) extends Optimizer
- See also: https://arxiv.org/pdf/1711.05101.pdf (Algorithm 2)
- Companion: object
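
For orientation, here is a minimal sketch of one update step in the style of Algorithm 2 from the paper linked above, written in plain Scala over `Array[Double]` rather than `STen`. It is not this library's implementation. The names `AdamWState`, `AdamWSketch`, `init`, and `update` are hypothetical, the default values are illustrative, and the paper's schedule multiplier is assumed to be fixed at 1. The sketch also assumes that `debias` toggles the bias correction of the two moment estimates; gradient clipping (`clip`) is left out.

```scala
// Illustrative only: a plain-Scala AdamW step, not the library's implementation.
// Hypothetical state holder: first moment m, second moment v, step counter.
final case class AdamWState(m: Array[Double], v: Array[Double], step: Int)

object AdamWSketch {
  // Zero-initialised moments for n parameters.
  def init(n: Int): AdamWState =
    AdamWState(Array.fill(n)(0.0), Array.fill(n)(0.0), 0)

  // One update with decoupled weight decay (schedule multiplier assumed to be 1).
  def update(
      theta: Array[Double],
      grad: Array[Double],
      state: AdamWState,
      learningRate: Double = 1e-3,
      weightDecay: Double = 1e-2,
      beta1: Double = 0.9,
      beta2: Double = 0.999,
      eps: Double = 1e-8,
      debias: Boolean = true // assumption: toggles bias correction
  ): (Array[Double], AdamWState) = {
    val t = state.step + 1
    // Exponential moving averages of the gradient and the squared gradient.
    val m = state.m.zip(grad).map { case (mi, g) => beta1 * mi + (1 - beta1) * g }
    val v = state.v.zip(grad).map { case (vi, g) => beta2 * vi + (1 - beta2) * g * g }
    // Bias-correction denominators; 1.0 disables the correction.
    val c1 = if (debias) 1 - math.pow(beta1, t) else 1.0
    val c2 = if (debias) 1 - math.pow(beta2, t) else 1.0
    val next = theta.indices.map { i =>
      val mHat = m(i) / c1
      val vHat = v(i) / c2
      // The decay term acts on the parameter directly, not through the gradient.
      theta(i) - (learningRate * mHat / (math.sqrt(vHat) + eps) + weightDecay * theta(i))
    }.toArray
    (next, AdamWState(m, v, t))
  }
}
```

Because the decay term is applied to the parameter itself instead of being added to the gradient, it is not rescaled by the adaptive denominator. With a zero gradient, for example, each parameter simply shrinks by the factor `1 - weightDecay` in this sketch.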