
TransformerEncoder

lamp.nn.TransformerEncoder

See the TransformerEncoder companion object

case class TransformerEncoder(blocks: Seq[TransformerEncoderBlock]) extends GenericModule[(Variable, Option[STen]), Variable]

TransformerEncoder module

Does not include initial embedding or position encoding.

Input is (data, maxLength), where data is a double tensor of shape (batch, sequence, input dimension) and maxLength is an optional 1D or 2D long tensor used for attention masking.

Attention masking is implemented similarly to chapter 11.3.2.1 in d2l.ai v1.0.0-beta0. It supports unmasked attention, attention on variable length input, and left-to-right attention.

Output is (batch, sequence, output dimension).
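The masking scheme described above can be illustrated with a minimal sketch in plain Scala, written in the spirit of the d2l.ai masked softmax. It is not lamp's implementation: the object `MaskedAttentionSketch`, all of its method names, the use of nested arrays instead of tensors, and the mask value of -1e6 are assumptions made for illustration. A 1D length vector gives one valid length per batch element (variable length input), a 2D length matrix gives one valid length per (batch, query) pair, and left-to-right attention is the 2D case where query i may attend to keys 0 to i. Unmasked attention corresponds to applying the plain softmax.

```scala
object MaskedAttentionSketch {

  // Large negative score assigned to masked keys before the softmax
  // (assumed value, following the d2l.ai recipe).
  val MaskValue: Double = -1e6

  /** Numerically stable softmax over one row of attention scores. */
  def softmax(row: Array[Double]): Array[Double] = {
    val max = row.max
    val exps = row.map(x => math.exp(x - max))
    val sum = exps.sum
    exps.map(_ / sum)
  }

  /** Softmax over the keys of a single query, keeping only the first
    * `validLen` keys; the remaining keys receive (numerically) zero weight.
    */
  def maskedSoftmax(scores: Array[Double], validLen: Int): Array[Double] = {
    val masked = scores.zipWithIndex.map { case (s, k) =>
      if (k < validLen) s else MaskValue
    }
    softmax(masked)
  }

  /** 1D case: one valid length per batch element, shared by all queries.
    * `scores` has shape (batch, query, key).
    */
  def maskPerSequence(
      scores: Array[Array[Array[Double]]],
      validLens: Array[Int]
  ): Array[Array[Array[Double]]] =
    scores.zip(validLens).map { case (queries, len) =>
      queries.map(q => maskedSoftmax(q, len))
    }

  /** 2D case: one valid length per (batch, query) pair. */
  def maskPerQuery(
      scores: Array[Array[Array[Double]]],
      validLens: Array[Array[Int]]
  ): Array[Array[Array[Double]]] =
    scores.zip(validLens).map { case (queries, lens) =>
      queries.zip(lens).map { case (q, len) => maskedSoftmax(q, len) }
    }

  /** Valid lengths for left-to-right attention: query i sees keys 0..i. */
  def causalLengths(batch: Int, seq: Int): Array[Array[Int]] =
    Array.fill(batch)(Array.tabulate(seq)(_ + 1))
}
```

For example, `MaskedAttentionSketch.maskPerQuery(scores, MaskedAttentionSketch.causalLengths(batch, seq))` would produce attention weights in which each position ignores all later positions.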

Attributes

Companion: object
Supertypes: trait Serializable, trait Product, trait Equals, trait GenericModule[(Variable, Option[STen]), Variable], class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

The implementation of the function.

In addition to x, it can also use all of the `state` to compute its value.


List of optimizable parameters, or non-optimizable but stateful parameters.

Stateful means that the state is carried over across repeated forward calls.


Inherited methods

Alias of forward

Attributes

Inherited from: GenericModule

Computes the gradient of loss with respect to the parameters.

Attributes

Inherited from: GenericModule

Returns the total number of optimizable parameters.

Attributes

Inherited from: GenericModule

Returns the state variables which need gradient computation.

Attributes

Inherited from: GenericModule

Attributes

Inherited from: Product

Attributes

Inherited from: Product

Attributes

Inherited from: GenericModule
