Package | Description |
---|---|
`org.deeplearning4j.optimize.solvers.accumulation.encoding.threshold` | |
Class and Description |
---|
AdaptiveThresholdAlgorithm
An adaptive threshold algorithm used to determine the encoding threshold for distributed training.

The idea: the threshold can be too high or too low for optimal training - both cases are bad. So instead, we define a range of "acceptable" sparsity ratio values (default: 1e-4 to 1e-2), where the sparsity ratio is defined as numValues(encodedUpdate)/numParameters. If the sparsity ratio falls outside this acceptable range, the threshold is increased or decreased. The threshold is changed multiplicatively using the decay rate (note that decayRate < 1):

* To decrease the threshold: newThreshold = decayRate * threshold
* To increase the threshold: newThreshold = (1.0/decayRate) * threshold

The default decay rate, AdaptiveThresholdAlgorithm.DEFAULT_DECAY_RATE = 0.965936, corresponds to a maximum increase or decrease of the threshold by a factor of:

* 2.0 in 20 iterations
* 100 in 132 iterations
* 1000 in 200 iterations

A high threshold leads to few values being encoded and communicated - a small sparsity ratio. If the threshold is too high (sparsity ratio too low): fast network communication but slow training, since few parameter updates are communicated. A low threshold leads to many values being encoded and communicated - a large sparsity ratio. If the threshold is too low (sparsity ratio too high): slower network communication and possibly still slow training, since many parameter updates are communicated but each is very small, changing the network's predictions only a tiny amount. A sparsity ratio of 1.0 means all values are present in the encoded update vector; a sparsity ratio of 0.0 means all values were excluded from it. |
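The adaptive rule described above can be sketched in plain Java. This is a hypothetical illustration, not the DL4J implementation - the class `AdaptiveThresholdSketch` and its `adapt` method are invented here for clarity; only the default decay rate and sparsity-range constants come from the description above. Since decayRate < 1, multiplying by it shrinks the threshold and dividing by it grows the threshold:

```java
// Hypothetical sketch of the adaptive threshold update rule (not the DL4J API).
public class AdaptiveThresholdSketch {
    // Defaults described in the summary above
    static final double DECAY_RATE = 0.965936;
    static final double MIN_SPARSITY = 1e-4;
    static final double MAX_SPARSITY = 1e-2;

    // One adaptation step, given the observed sparsity ratio:
    // sparsityRatio = numValues(encodedUpdate) / numParameters
    static double adapt(double threshold, double sparsityRatio) {
        if (sparsityRatio < MIN_SPARSITY) {
            // Too few values encoded -> threshold too high -> decrease it
            return DECAY_RATE * threshold;
        } else if (sparsityRatio > MAX_SPARSITY) {
            // Too many values encoded -> threshold too low -> increase it
            return (1.0 / DECAY_RATE) * threshold;
        }
        // Sparsity ratio is within the acceptable range: leave threshold unchanged
        return threshold;
    }

    public static void main(String[] args) {
        // The decay rate bounds how fast the threshold can move:
        // 0.965936^20 is roughly 0.5, i.e. at most a 2x change in 20 iterations.
        System.out.printf("max factor after  20 iterations: %.4f%n", Math.pow(DECAY_RATE, 20));
        System.out.printf("max factor after 132 iterations: %.5f%n", Math.pow(DECAY_RATE, 132));
        System.out.printf("max factor after 200 iterations: %.6f%n", Math.pow(DECAY_RATE, 200));

        // Example: sparsity ratio persistently too high -> threshold grows ~2x in 20 steps
        double t = 1e-3;
        for (int i = 0; i < 20; i++) {
            t = adapt(t, 0.05); // 0.05 > MAX_SPARSITY, so each step increases the threshold
        }
        System.out.printf("threshold after 20 increases: %.6f%n", t);
    }
}
```

Running the `main` method confirms the factors quoted above: the 20-, 132- and 200-iteration powers of the decay rate come out near 1/2, 1/100 and 1/1000 respectively.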
FixedThresholdAlgorithm
A simple fixed threshold algorithm, not adaptive in any way. |
TargetSparsityThresholdAlgorithm
A threshold algorithm that targets a specific sparsity ratio throughout training. |
Copyright © 2019. All rights reserved.