case class Seq2SeqWithAttention[S0, S1, M0 <: Module, M1 <: StatefulModule2[Variable, Variable, S0, S1], M2 <: StatefulModule[Variable, Variable, S1]](destinationEmbedding: M0 & Module, encoder: M1 & StatefulModule2[Variable, Variable, S0, S1], decoder: M2 & StatefulModule[Variable, Variable, S1], padToken: Long)(stateToKey: S1 => Variable) extends StatefulModule2[(Variable, Variable), Variable, S0, S1]
- Companion: object
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
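The constructor above takes `stateToKey` in a second parameter list. As a rough illustration of what that implies in Scala, the sketch below uses toy stand-in types (`Vec`, `ToySeq2Seq`, `LstmState` are hypothetical and not part of the library). It assumes a decoder state shaped like a pair and shows two things: a projection function selects which part of the state serves as the key, and a case class's generated equality only considers the first parameter list. The toy marks the function as `val` so it can be called from outside, which the documented class does not do.

```scala
// Toy stand-ins only; none of these types come from the library.
final case class Vec(values: Vector[Double])

// Mirrors the shape of the documented constructor: plain data first,
// then a function in a second parameter list.
// `val` is added here (an assumption for the demo) so the function is accessible.
final case class ToySeq2Seq[S](padToken: Long)(val stateToKey: S => Vec)

object CurriedCaseClassDemo {
  // Hypothetical LSTM-like state: (hidden, cell).
  type LstmState = (Vec, Vec)

  def main(args: Array[String]): Unit = {
    val useHidden = ToySeq2Seq[LstmState](padToken = 0L)(_._1)
    val useCell   = ToySeq2Seq[LstmState](padToken = 0L)(_._2)

    val state: LstmState = (Vec(Vector(1.0, 2.0)), Vec(Vector(3.0, 4.0)))

    // The projection decides which part of the state becomes the key.
    assert(useHidden.stateToKey(state) == Vec(Vector(1.0, 2.0)))
    assert(useCell.stateToKey(state) == Vec(Vector(3.0, 4.0)))

    // Generated equals/hashCode ignore the second parameter list,
    // so these two instances compare equal despite different functions.
    assert(useHidden == useCell)
  }
}
```

One practical consequence, if the real class behaves like any other Scala case class: two instances that differ only in `stateToKey` compare as equal, so equality should not be relied on to tell such configurations apart.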
Value members
Concrete methods
def attentionDecoder(keyValue: Variable, source: Variable): AttentionDecoder[S1, M2 & StatefulModule[Variable, Variable, S1], M0 & Module]
Inherited methods
Computes the gradient of loss with respect to the parameters.
- Inherited from: GenericModule
Returns the total number of optimizable parameters.
- Inherited from: GenericModule
Returns the state variables which need gradient computation.
- Inherited from: GenericModule