org.platanios.tensorflow.api.ops.training

ExponentialMovingAverage

class ExponentialMovingAverage extends AnyRef

Maintains moving averages of variables by employing an exponential decay.

When training a model, it is often beneficial to maintain moving averages of the trained parameters. Evaluations that use averaged parameters sometimes produce significantly better results than the final trained values.

The computeForVariables(...) and computeForValues(...) methods create shadow copies of the provided variables and values, along with ops that maintain moving averages in those shadow copies. They are used when building the training model, and the ops that maintain the moving averages are typically run after each training step. The average(...) and averageName(...) methods provide access to the shadow variables and their names. They are useful when building an evaluation model, or when restoring a model from a checkpoint file, because they make it possible to use the moving averages in place of the last trained values for evaluations.

The moving averages are computed using exponential decay. The decay value must be provided when creating an ExponentialMovingAverage object. The shadow variables are initialized with the same initial values as the corresponding variables, or with zeros in the case of values. When the ops used to maintain the moving averages are executed, each shadow variable is updated using the formula:

shadowVariable -= (1 - decay) * (shadowVariable - value)

This is mathematically equivalent to the classic formula below, but the use of an assignSub op (the -= in the formula) allows concurrent lock-free updates to the variables:

shadowVariable = decay * shadowVariable + (1 - decay) * value

Reasonable values for decay are close to 1.0f, typically in the "multiple-nines" range: 0.999f, 0.9999f, etc.
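The equivalence of the two update formulas above can be checked numerically with plain Scala, without any TensorFlow dependency. This is only an illustrative sketch: the object and method names (EmaUpdateSketch, updateSub, updateClassic) are hypothetical and not part of the library.

```scala
object EmaUpdateSketch {
  // Update written in the assignSub style:
  //   shadow -= (1 - decay) * (shadow - value)
  def updateSub(shadow: Double, value: Double, decay: Double): Double =
    shadow - (1.0 - decay) * (shadow - value)

  // Classic form:
  //   shadow = decay * shadow + (1 - decay) * value
  def updateClassic(shadow: Double, value: Double, decay: Double): Double =
    decay * shadow + (1.0 - decay) * value

  def main(args: Array[String]): Unit = {
    val decay = 0.9
    val values = Seq(1.0, 2.0, 3.0, 4.0)
    // Start both shadows at zero and feed the same sequence of values.
    val viaSub = values.foldLeft(0.0)((s, v) => updateSub(s, v, decay))
    val viaClassic = values.foldLeft(0.0)((s, v) => updateClassic(s, v, decay))
    // The two results agree up to floating-point rounding.
    println(s"assignSub form: $viaSub, classic form: $viaClassic")
  }
}
```

Both forms move the shadow value a fraction (1 - decay) of the way toward the new value, so a decay closer to 1 gives a slower, smoother average.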

Example usage when creating a training model:

// Create variables
val v0 = tf.variable(...)
val v1 = tf.variable(...)
// Use the variables to build a training model
...
// Create an op that applies the optimizer. This is what we usually would use as a training op.
val optOp = opt.minimize(loss, variables = Set(v0, v1))

// Create an exponential moving average object.
val ema = tf.train.ExponentialMovingAverage(decay = 0.999f)

val trainOp = tf.createWith(controlDependencies = Set(optOp)) {
  // Create the shadow variables, and add ops used to maintain the moving averages of `v0` and `v1`. This also
  // creates an op that will update the moving averages after each training step. This is what we will use in
  // place of the usual training op.
  ema.computeForVariables(Set(v0, v1))
}

// Train the model by running `trainOp`.

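There are two ways to use moving averages for evaluations:

  1. Build a model that uses the shadow variables instead of the variables. For this, use the average(...) method, which returns the shadow variable for a given variable.
  2. Build a model normally, but load the checkpoint files to evaluate by using the shadow variable names. For this, use the averageName(...) method.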

Example of restoring the shadow variable values:

// Create a saver that loads variables from their saved shadow values.
val shadowV0Name = ema.averageName(v0)
val shadowV1Name = ema.averageName(v1)
val saver = tf.saver(Map(shadowV0Name -> v0, shadowV1Name -> v1))
saver.restore(...checkpoint filename...)
// `v0` and `v1` now hold the moving average values.

The optional numUpdates parameter allows the decay rate to be adjusted dynamically. It is typical to pass the count of training steps, usually kept in a variable that is incremented at each step, in which case the decay rate is lower at the start of training, so the moving averages track the variables more quickly early on. If passed, the actual decay rate used is defined as: min(decay, (1 + numUpdates) / (10 + numUpdates)).
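To get a feel for how the numUpdates formula above behaves, the following plain-Scala sketch evaluates it for a few step counts. The helper name effectiveDecay is hypothetical, and the sketch assumes the division is carried out in floating point; it does not reproduce the library's internal implementation.

```scala
object EffectiveDecaySketch {
  // min(decay, (1 + numUpdates) / (10 + numUpdates)) when numUpdates is provided,
  // otherwise the configured decay is used unchanged.
  def effectiveDecay(decay: Float, numUpdates: Option[Int]): Float =
    numUpdates match {
      case Some(n) => math.min(decay, (1.0f + n) / (10.0f + n))
      case None    => decay
    }

  def main(args: Array[String]): Unit = {
    val decay = 0.999f
    // Early steps use a much smaller decay; late steps are capped at `decay`.
    for (n <- Seq(0, 10, 90, 1000, 1000000))
      println(s"numUpdates = $n, effective decay = ${effectiveDecay(decay, Some(n))}")
  }
}
```

With decay = 0.999f, step 0 uses a decay of about 0.1 and step 90 about 0.91, while for large step counts the fraction exceeds 0.999 and the configured decay takes over.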

Instance Constructors

  1. new ExponentialMovingAverage(decay: Float, numUpdates: Option[Int] = None, zeroDebias: Boolean = false, name: String = "ExponentialMovingAverage")

    decay

    Decay value to use.

    numUpdates

    Optional count of the number of updates applied to the variables.

    zeroDebias

    If true, the moving averages computed for values provided in computeForValues will be zero-debiased.

    name

    Name prefix to use for all created ops.

    Attributes
    protected

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def average(value: Output): Option[variables.Variable]

    Returns the variable holding the average for value.

  6. def average(variable: variables.Variable): Option[variables.Variable]

    Returns the variable holding the average for variable.

  7. def averageName(value: Output): Option[String]

    Returns the name of the variable holding the average for value.

  8. def averageName(variable: variables.Variable): Option[String]

    Returns the name of the variable holding the average for variable.

  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. def computeForValues(values: Set[Output]): Op

    Computes moving averages of the provided values.

    This method creates shadow variables for all elements of values. The shadow variables for each value are created with trainable = false, initialized to 0 and optionally zero-debiased, and added to the Graph.Keys.MOVING_AVERAGE_VARIABLES and the Graph.Keys.GLOBAL_VARIABLES collections.

    values

    Values for which to compute moving averages.

    returns

    Created op that updates all the shadow variables, as described above.

  11. def computeForVariables(variables: Set[variables.Variable] = Op.currentGraph.trainableVariables): Op

    Computes moving averages of the provided variables.

    This method creates shadow variables for all elements of variables. The shadow variables for each variable are created with trainable = false, initialized to the variable's initial value, and added to the Graph.Keys.MOVING_AVERAGE_VARIABLES and the Graph.Keys.GLOBAL_VARIABLES collections.

    variables

    Variables for which to compute moving averages.

    returns

    Created op that updates all the shadow variables, as described above.

  12. val decay: Float

    Decay value to use.

  13. val decayTensor: Output

    Attributes
    protected
  14. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  19. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  20. val name: String

    Name prefix to use for all created ops.

  21. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. final def notify(): Unit

    Definition Classes
    AnyRef
  23. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  24. val numUpdates: Option[Int]

    Optional count of the number of updates applied to the variables.

  25. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  26. def toString(): String

    Definition Classes
    AnyRef → Any
  27. val valueAverages: Map[Output, variables.Variable]

    Attributes
    protected
  28. val variableAverages: Map[variables.Variable, variables.Variable]

    Attributes
    protected
  29. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. val zeroDebias: Boolean

    If true, the moving averages computed for values provided in computeForValues will be zero-debiased.
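
This page does not spell out what zero-debiasing computes. The sketch below assumes the common bias correction for averages that start at zero, which divides the running average after t updates by (1 - decay^t). It is written in plain Scala with hypothetical names (ZeroDebiasSketch, biasedAverages, debiasedAverages) and is not a description of how the library implements the zeroDebias option.

```scala
object ZeroDebiasSketch {
  // Running averages of `values`, starting from a shadow value of zero.
  def biasedAverages(values: Seq[Double], decay: Double): Seq[Double] =
    values.scanLeft(0.0)((shadow, v) => shadow - (1.0 - decay) * (shadow - v)).tail

  // Assumed correction: divide the t-th average by (1 - decay^t), with t starting at 1.
  def debiasedAverages(values: Seq[Double], decay: Double): Seq[Double] =
    biasedAverages(values, decay).zipWithIndex.map { case (avg, i) =>
      avg / (1.0 - math.pow(decay, i + 1))
    }

  def main(args: Array[String]): Unit = {
    val values = Seq.fill(5)(5.0)
    // Without correction the early averages are pulled toward the zero initial value.
    println(biasedAverages(values, 0.9))
    // With the correction, a constant input is recovered from the first step on.
    println(debiasedAverages(values, 0.9))
  }
}
```

For a constant input, the uncorrected average only approaches the input slowly from zero, while the corrected one matches it from the first update up to rounding.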
