org.platanios.tensorflow.api.ops.rnn.attention
Memory to query; usually the output of an RNN encoder. Each tensor in the memory should be shaped [batchSize, maxTime, ...].
Weights tensor with which the memory is multiplied to produce the attention keys.
Weights tensor with which the query is multiplied to produce the attention query.
Weights tensor with which the score components are multiplied before being summed.
Sequence lengths for the batch entries in the memory. If provided, the memory tensor rows are masked with zeros for values past the respective sequence lengths.
Scalar tensor used to normalize the alignment score energy term; usually a trainable variable initialized to sqrt(1 / numUnits).
Vector bias added to the alignment scores prior to applying the non-linearity; usually a variable initialized to zeros.
Optional function that converts computed scores to probabilities. Defaults to the softmax function. A potentially useful alternative is the hardmax function.
Mask value to use for the score before passing it to probabilityFn. Defaults to negative infinity. Note that this value is only used if memorySequenceLengths is not null.
Name prefix to use for all created ops.
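As an illustration of how scoreMaskValue and memorySequenceLengths interact (a NumPy sketch, not the Scala API; the function name is illustrative): score positions past each entry's sequence length are replaced with the mask value, so that after the probability function they receive zero attention weight.

```python
import numpy as np

def masked_softmax(scores, sequence_lengths, score_mask_value=-np.inf):
    # scores: [batchSize, maxTime]; positions at or past each entry's
    # sequence length are replaced with the mask value before softmax.
    max_time = scores.shape[1]
    mask = np.arange(max_time)[None, :] < np.asarray(sequence_lengths)[:, None]
    masked = np.where(mask, scores, score_mask_value)
    e = np.exp(masked - masked.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# One batch entry of length 2 inside a maxTime of 3: the padded
# third position ends up with exactly zero probability.
probs = masked_softmax(np.array([[1.0, 2.0, 3.0]]), [2])
```

With the default mask value of negative infinity, exp of the masked score is exactly zero, so the remaining probabilities still sum to one over the valid positions.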
Computes an alignment tensor given the provided query and previous alignment tensor.
The previous alignment tensor is important for attention mechanisms that use the previous alignment to calculate the attention at the next time step, such as monotonic attention mechanisms.
TODO: Figure out how to generalize the "next state" functionality.
Query tensor.
Previous alignment tensor.
Tuple containing the alignment tensor and the next attention state.
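The alignment computation can be sketched as follows for the multiplicative (Luong-style) case (a NumPy illustration under assumed shapes, not the Scala API; the function name is hypothetical). The score is a dot product between the query and every memory key, and for plain (non-monotonic) attention the next attention state is simply the alignment itself:

```python
import numpy as np

def luong_align(query, keys):
    # query: [batchSize, numUnits]; keys: [batchSize, maxTime, numUnits].
    # Multiplicative score: dot product of the query with every memory key.
    scores = np.einsum('bu,btu->bt', query, keys)
    # Softmax over the time dimension yields the alignment.
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    alignment = e / e.sum(axis=1, keepdims=True)
    # For non-monotonic attention the next state is the alignment itself.
    return alignment, alignment

alignment, next_state = luong_align(np.ones((1, 4)), np.ones((1, 3, 4)))
```

A monotonic mechanism would instead use the previous alignment to constrain where the new alignment may place mass, which is why the state is threaded through the tuple.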
Initial alignment value.
This is important for attention mechanisms that use the previous alignment to calculate the alignment at the next time step (e.g., monotonic attention).
The default behavior is to return a tensor of all zeros.
Initial state value.
This is important for attention mechanisms that use the previous alignment to calculate the alignment at the next time step (e.g., monotonic attention).
The default behavior is to return the same output as initialAlignment.
Memory to query; usually the output of an RNN encoder. Each tensor in the memory should be shaped [batchSize, maxTime, ...].
Sequence lengths for the batch entries in the memory. If provided, the memory tensor rows are masked with zeros for values past the respective sequence lengths.
Weights tensor with which the memory is multiplied to produce the attention keys.
Name prefix to use for all created ops.
Vector bias added to the alignment scores prior to applying the non-linearity; usually a variable initialized to zeros.
Scalar tensor used to normalize the alignment score energy term; usually a trainable variable initialized to sqrt(1 / numUnits).
Computes alignment probabilities for score.
Alignment score tensor.
Alignment probabilities tensor.
Optional function that converts computed scores to probabilities. Defaults to the softmax function. A potentially useful alternative is the hardmax function.
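The two probability functions mentioned above differ in how they spread mass over the memory positions. A NumPy sketch (illustrative, not the Scala API): softmax distributes probability smoothly, while hardmax puts all mass on the single best-scoring position.

```python
import numpy as np

def softmax(scores):
    # Smooth distribution over all positions.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def hardmax(scores):
    # All probability mass on the highest-scoring position (one-hot argmax).
    return np.eye(scores.shape[-1])[np.argmax(scores, axis=-1)]

scores = np.array([[1.0, 3.0, 2.0]])
soft = softmax(scores)
hard = hardmax(scores)
```

Hardmax can be useful when a discrete, interpretable alignment is desired, at the cost of a non-differentiable argmax.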
Weights tensor with which the query is multiplied to produce the attention query.
Computes an alignment score for query.
Query tensor.
Score tensor.
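The additive (Bahdanau-style) score combines the processed query and the attention keys as described above. A hedged NumPy sketch under assumed shapes (the function name is illustrative, not the Scala API):

```python
import numpy as np

def additive_score(processed_query, keys, score_weights, score_bias=0.0):
    # processed_query: [batchSize, numUnits] (query times queryWeights),
    # keys: [batchSize, maxTime, numUnits] (memory times memoryWeights),
    # score_weights: [numUnits]. The components are summed, passed through
    # tanh, multiplied by the score weights, and reduced over the units axis.
    activation = np.tanh(keys + processed_query[:, None, :] + score_bias)
    return np.sum(score_weights * activation, axis=-1)

score = additive_score(np.zeros((1, 4)), np.zeros((1, 2, 4)), np.ones(4))
```

The result has shape [batchSize, maxTime], one score per memory position, ready to be masked and passed to probabilityFn.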
Mask value to use for the score before passing it to probabilityFn. Defaults to negative infinity. Note that this value is only used if memorySequenceLengths is not null.
Weights tensor with which the score components are multiplied before being summed.
Luong-style (multiplicative) attention scoring.
This attention has two forms. The first is standard Luong attention, as described in: ["Effective Approaches to Attention-based Neural Machine Translation.", EMNLP 2015](https://arxiv.org/abs/1508.04025).
The second is the scaled form inspired partly by the normalized form of Bahdanau attention. To enable the second form, construct the object with weightsScale set to the value of a scalar scaling variable.

Bahdanau-style (additive) attention scoring.

This attention has two forms. The first is Bahdanau attention, as described in: ["Neural Machine Translation by Jointly Learning to Align and Translate.", ICLR 2015](https://arxiv.org/abs/1409.0473).
The second is a normalized form inspired by the weight normalization method described in: ["Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks.", NIPS 2016](https://arxiv.org/abs/1602.07868).
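The normalized form can be sketched as follows (a NumPy illustration of the weight-normalization idea, not the Scala API; names are hypothetical). The score weights vector v is reduced to its direction, and a learned scalar g controls its magnitude, so the effective weights are g * v / ||v||:

```python
import numpy as np

def normalized_additive_score(processed_query, keys, score_weights, scale, bias):
    # Weight normalization: keep only the direction of score_weights and let
    # the scalar scale g control the magnitude (effective weights g*v/||v||).
    normalized = scale * score_weights / np.linalg.norm(score_weights)
    activation = np.tanh(keys + processed_query[:, None, :] + bias)
    return np.sum(normalized * activation, axis=-1)

# With v = [3, 4] (norm 5) and g = 5, the effective weights are [3, 4].
score = normalized_additive_score(
    np.zeros((1, 2)), np.ones((1, 1, 2)), np.array([3.0, 4.0]), 5.0, np.zeros(2))
```

Decoupling direction from magnitude in this way is the reparameterization trick from the cited weight-normalization paper, applied here to the score weights.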