Gaussian dropout. This applies multiplicative Gaussian noise (mean 1) to the input activations.
Each input activation x is independently set to:
x <- x * y, where y ~ N(1, stdev = sqrt((1-rate)/rate))
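The noise model above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class and method names are hypothetical, and it assumes `rate` is the retain probability, so that stdev = sqrt((1-rate)/rate) as in the formula above.

```java
import java.util.Random;

public class GaussianDropoutSketch {

    // Hypothetical helper (not the library API): applies mean-1 multiplicative
    // Gaussian noise to each activation in place.
    // stdev = sqrt((1 - rate) / rate), matching the formula above.
    static void gaussianDropout(double[] x, double rate, Random rng) {
        double stdev = Math.sqrt((1.0 - rate) / rate);
        for (int i = 0; i < x.length; i++) {
            // y ~ N(1, stdev^2): mean 1, so activations are unchanged in expectation
            double y = 1.0 + stdev * rng.nextGaussian();
            x[i] *= y;
        }
    }
}
```

Because the noise has mean 1, the expected value of each activation is preserved, so (unlike standard Bernoulli dropout with naive scaling) no separate rescaling is needed at test time.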
Dropout schedules (i.e., varying probability p as a function of iteration/epoch) are also supported.
Note 1: As with all IDropout instances, GaussianDropout is applied at training time only, and is automatically
disabled at test time (for evaluation, etc.)
Note 2: Frequently, dropout is not applied to the input (first) layer, or is applied there with a higher retain
probability. Dropout is also often not applied to output layers.
See: "Multiplicative Gaussian Noise" in Srivastava et al. (2014), "Dropout: A Simple Way to Prevent Neural
Networks from Overfitting"
http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf