TransformerEncoderBlock

case class TransformerEncoderBlock(attention: MultiheadAttention, layerNorm1: LayerNorm, layerNorm2: LayerNorm, w1: Constant, b1: Constant, w2: Constant, b2: Constant, dropout: Double, train: Boolean) extends GenericModule[(Variable, STen), Variable]

A single block of the transformer encoder, as defined in Fig. 10.7.1 of Dive into Deep Learning (d2l) v0.16.
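
The data flow of such a block can be sketched in plain Scala, without this library's tensor types. This is a rough illustration under stated assumptions, not the implementation of this class: it follows the d2l figure (attention, add and norm, position-wise feed-forward network, add and norm), uses ReLU as d2l does, and leaves out dropout, the attention mask, and the learned scale and shift of layer normalization. The object `EncoderBlockSketch` and all of its members are hypothetical names, and the attention step is passed in as a function rather than implemented.

```scala
object EncoderBlockSketch {
  type Vec = Vector[Double]

  // Layer normalization over the feature dimension (no learned scale or shift).
  def layerNorm(x: Vec, eps: Double = 1e-5): Vec = {
    val mean = x.sum / x.size
    val variance = x.map(v => (v - mean) * (v - mean)).sum / x.size
    x.map(v => (v - mean) / math.sqrt(variance + eps))
  }

  // Residual connection followed by layer normalization ("add and norm").
  def addNorm(x: Vec, y: Vec): Vec =
    layerNorm(x.zip(y).map { case (a, b) => a + b })

  // Affine map v * w + b, with w stored as rows of shape (inputs x outputs).
  def affine(v: Vec, w: Vector[Vec], b: Vec): Vec =
    b.indices.map(j => v.indices.map(i => v(i) * w(i)(j)).sum + b(j)).toVector

  // Position-wise feed-forward network: relu(x * w1 + b1) * w2 + b2.
  // ReLU is an assumption taken from d2l, not from this class.
  def ffn(x: Vec, w1: Vector[Vec], b1: Vec, w2: Vector[Vec], b2: Vec): Vec =
    affine(affine(x, w1, b1).map(h => math.max(0.0, h)), w2, b2)

  // One encoder block applied to a sequence of token vectors.
  // `attention` is a caller-supplied stand-in for multi-head self-attention.
  def block(
      tokens: Vector[Vec],
      attention: Vector[Vec] => Vector[Vec],
      w1: Vector[Vec],
      b1: Vec,
      w2: Vector[Vec],
      b2: Vec
  ): Vector[Vec] = {
    val attended = attention(tokens)
    val normed = tokens.zip(attended).map { case (x, a) => addNorm(x, a) }
    normed.map(x => addNorm(x, ffn(x, w1, b1, w2, b2)))
  }
}
```

In this sketch `w1`, `b1`, `w2` and `b2` play the role of the feed-forward weights of the same names in the signature above, and the two `addNorm` calls correspond to `layerNorm1` and `layerNorm2`.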

Companion:
object
Supertypes:
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Value members

Concrete methods

def forward[S : Sc](x: (Variable, STen)): Variable
def state: Seq[(Constant, PTag)]

Inherited methods

def apply[S : Sc](a: (Variable, STen)): Variable

Alias of forward

Inherited from:
GenericModule
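
The aliasing can be illustrated with a small stand-in trait. This is only a sketch of the pattern: `Module` and `Doubler` are hypothetical names, and the `Sc` scope parameter that the real signatures carry is left out for brevity.

```scala
object ApplyAliasSketch {
  // Hypothetical stand-in for a module with a forward pass.
  trait Module[A, B] {
    def forward(a: A): B
    // apply delegates to forward, so module(x) and module.forward(x) agree.
    def apply(a: A): B = forward(a)
  }

  // A toy module that doubles its input.
  object Doubler extends Module[Int, Int] {
    def forward(a: Int): Int = a * 2
  }
}
```

With this pattern, `ApplyAliasSketch.Doubler(21)` and `ApplyAliasSketch.Doubler.forward(21)` evaluate to the same value.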
final def gradients(loss: Variable, zeroGrad: Boolean): Seq[Option[STen]]

Computes the gradient of loss with respect to the parameters.

Inherited from:
GenericModule
final def learnableParameters: Long

Returns the total number of optimizable parameters.

Inherited from:
GenericModule
final def parameters: Seq[(Constant, PTag)]

Returns the state variables which need gradient computation.

Inherited from:
GenericModule
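
How `state`, `parameters` and `learnableParameters` relate can be sketched with plain Scala stand-ins. The sketch assumes that `parameters` is the subset of `state` that needs gradients and that `learnableParameters` is the total element count of that subset; the page does not spell this out. `Entry`, its fields and the example shapes are hypothetical.

```scala
object ParametersSketch {
  // Hypothetical stand-in for a state entry: a named tensor shape plus a flag
  // saying whether it takes part in gradient computation.
  final case class Entry(name: String, shape: List[Long], needsGrad: Boolean) {
    def numel: Long = shape.product
  }

  // Assumed behaviour of `parameters`: keep the entries that need gradients.
  def parameters(state: Seq[Entry]): Seq[Entry] = state.filter(_.needsGrad)

  // Assumed behaviour of `learnableParameters`: total element count of that subset.
  def learnableParameters(state: Seq[Entry]): Long =
    parameters(state).map(_.numel).sum

  // Example state with made-up shapes for the feed-forward weights,
  // plus one frozen entry that is excluded from the count.
  val exampleState: Seq[Entry] = Seq(
    Entry("w1", List(4L, 8L), needsGrad = true),
    Entry("b1", List(1L, 8L), needsGrad = true),
    Entry("w2", List(8L, 4L), needsGrad = true),
    Entry("b2", List(1L, 4L), needsGrad = true),
    Entry("frozen", List(10L), needsGrad = false)
  )
}
```

For `exampleState` the count is 32 + 8 + 32 + 4 = 76, since the frozen entry is filtered out before summing.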
def productElementNames: Iterator[String]
Inherited from:
Product
def productIterator: Iterator[Any]
Inherited from:
Product
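
Because the class is a case class, these two `Product` methods can be combined to list field names next to their values. The sketch below uses a hypothetical stand-in case class, `BlockConfig`, with two of the fields from the signature above, so that it runs without any tensors. It needs Scala 2.13 or later, where `productElementNames` is available.

```scala
object ProductSketch {
  // Hypothetical stand-in carrying two of the fields named in the signature.
  final case class BlockConfig(dropout: Double, train: Boolean)

  // Pair every field name with its value, in declaration order.
  def describe(p: Product): List[(String, Any)] =
    p.productElementNames.zip(p.productIterator).toList
}
```

Calling `ProductSketch.describe(ProductSketch.BlockConfig(0.1, true))` yields the pairs `("dropout", 0.1)` and `("train", true)`, which can be handy for logging a block's configuration.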
final def zeroGrad(): Unit
Inherited from:
GenericModule