TransformerEncoder

lamp.nn.TransformerEncoder
See the TransformerEncoder companion object
case class TransformerEncoder(blocks: Seq[TransformerEncoderBlock]) extends GenericModule[(Variable, Option[STen]), Variable]

TransformerEncoder module

Does not include initial embedding or position encoding.

Input is (data, maxLength), where data is a double tensor of shape (batch, sequence, input dimension) and maxLength is a 1D or 2D long tensor used for attention masking.

Attention masking is implemented similarly to chapter 11.3.2.1 in d2l.ai v1.0.0-beta0. It supports unmasked attention, attention on variable length input, and left-to-right attention.
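
The three masking modes named above can be illustrated with a small plain-Scala sketch. It does not use lamp's tensor types, and every name in it (AttentionMaskSketch, maskFromLengths, and so on) is hypothetical. It assumes the d2l.ai convention that a 1D length tensor holds one valid length per sequence, while a 2D one holds one valid length per query position; lamp's actual implementation may differ in detail.

```scala
// Plain-Scala sketch of length-based attention masking (no lamp dependency).
// All names are hypothetical and only illustrate the idea.
object AttentionMaskSketch {

  // mask(q)(k) == true means query position q may attend to key position k.
  type Mask = Vector[Vector[Boolean]]

  // Builds a (queries x keys) mask from one valid length per query row.
  def maskFromLengths(validLengths: Seq[Long], numKeys: Int): Mask =
    validLengths.toVector.map(len => Vector.tabulate(numKeys)(k => k < len))

  // Unmasked attention: every query sees every key.
  def unmasked(seqLen: Int): Mask =
    maskFromLengths(Seq.fill(seqLen)(seqLen.toLong), seqLen)

  // Variable-length input (assumed 1D case): one valid length per sequence,
  // shared by all query positions of that sequence.
  def variableLength(validLength: Long, seqLen: Int): Mask =
    maskFromLengths(Seq.fill(seqLen)(validLength), seqLen)

  // Left-to-right attention (assumed 2D case): query q sees keys 0..q,
  // so its valid length is q + 1.
  def leftToRight(seqLen: Int): Mask =
    maskFromLengths((1 to seqLen).map(_.toLong), seqLen)

  // Masked softmax over one row of scores. Masked keys are replaced by a
  // large negative value (as d2l.ai does), so their weight underflows to 0.
  def maskedSoftmax(scores: Vector[Double], allowed: Vector[Boolean]): Vector[Double] = {
    val masked = scores.zip(allowed).map { case (s, ok) => if (ok) s else -1e6 }
    val max = masked.max
    val exps = masked.map(s => math.exp(s - max))
    val sum = exps.sum
    exps.map(_ / sum)
  }
}
```

For example, AttentionMaskSketch.leftToRight(3) yields a lower-triangular mask, and maskedSoftmax then distributes weight only over the allowed keys of each row.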

Output is (batch, sequence, output dimension).

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
trait GenericModule[(Variable, Option[STen]), Variable]
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def forward[S : Sc](x: (Variable, Option[STen])): Variable

The implementation of the function.

In addition to x, it can also use all of the state to compute its value.

Attributes

def state: Seq[(Constant, PTag)]

List of optimizable, or non-optimizable, but stateful parameters

Stateful means that the state is carried over the repeated forward calls.

Attributes

Inherited methods

def apply[S : Sc](a: (Variable, Option[STen])): B

Alias of forward

Attributes

Inherited from:
GenericModule
final def gradients(loss: Variable, zeroGrad: Boolean): Seq[Option[STen]]

Computes the gradient of loss with respect to the parameters.

Attributes

Inherited from:
GenericModule
final def learnableParameters: Long

Returns the total number of optimizable parameters.

Attributes

Inherited from:
GenericModule
final def parameters: Seq[(Constant, PTag)]

Returns the state variables which need gradient computation.

Attributes

Inherited from:
GenericModule
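
The relationship between state, parameters and learnableParameters described above can be modelled with a toy sketch. The types below (Entry, ToyModule, ToyBlock, ToyEncoder) are hypothetical stand-ins and not lamp's Constant, PTag or GenericModule. The sketch also assumes that a composite module exposes the concatenated state of its blocks, which this page does not state.

```scala
// Toy model of the state / parameters relationship (not lamp's real types).
object ModuleStateSketch {

  // Hypothetical stand-in for a (Constant, PTag) entry.
  final case class Entry(name: String, numel: Long, needsGrad: Boolean)

  trait ToyModule {
    // All stateful entries, optimizable or not.
    def state: Seq[Entry]

    // The subset of state that needs gradient computation.
    final def parameters: Seq[Entry] = state.filter(_.needsGrad)

    // Total number of elements across the optimizable entries.
    final def learnableParameters: Long = parameters.map(_.numel).sum
  }

  final case class ToyBlock(dim: Int) extends ToyModule {
    def state: Seq[Entry] = Seq(
      Entry("weight", dim.toLong * dim, needsGrad = true),
      Entry("bias", dim.toLong, needsGrad = true),
      // Stateful but not optimized, e.g. a running statistic.
      Entry("runningStat", dim.toLong, needsGrad = false)
    )
  }

  // Assumption: a composite module concatenates the state of its blocks.
  final case class ToyEncoder(blocks: Seq[ToyModule]) extends ToyModule {
    def state: Seq[Entry] = blocks.flatMap(_.state)
  }
}
```

With two ToyBlock(4) instances, the toy encoder reports six state entries, of which four are parameters, so the non-optimizable entries appear in state but do not count towards learnableParameters.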
def productElementNames: Iterator[String]

Attributes

Inherited from:
Product
def productIterator: Iterator[Any]

Attributes

Inherited from:
Product
final def zeroGrad(): Unit

Attributes

Inherited from:
GenericModule