lamp.nn.languagemodel.LanguageModelModule
See the LanguageModelModule companion object
case class LanguageModelModule(tokenEmbedding: Embedding, positionEmbedding: Embedding, encoder: TransformerEncoder, finalNorm: LayerNorm) extends GenericModule[LanguageModelInput, LanguageModelOutput]
Transformer-based language model module
The initial embedding is the sum of the token and position embeddings. Both are learned: the token embedding is a learned lookup table, and the position embedding is likewise learned rather than fixed (e.g. sinusoidal).
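As an illustration only, a minimal sketch of this summation using plain Scala arrays rather than lamp's STen tensors; initialEmbedding, tokenEmb, and posEmb are hypothetical names, with tokenEmb of shape (vocabSize, dim) and posEmb of shape (maxLen, dim):

// Sketch: combined initial embedding, not lamp's actual API.
def initialEmbedding(
    tokens: Array[Int],             // token ids of one sequence
    tokenEmb: Array[Array[Double]], // learned token lookup table
    posEmb: Array[Array[Double]]    // learned position lookup table
): Array[Array[Double]] =
  tokens.zipWithIndex.map { case (tok, pos) =>
    // elementwise sum of token embedding and position embedding
    tokenEmb(tok).zip(posEmb(pos)).map { case (t, p) => t + p }
  }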
The initial embeddings are fed into a stack of transformer blocks. Attention masking is governed by the input, similarly to the scheme described in chapter 11.3.2.1 of d2l v1.0.0-beta0.
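A hedged sketch of the masking idea from that d2l chapter: in an autoregressive mask, a query position may attend only to key positions at or before it. causalMask is a hypothetical helper returning a plain Boolean matrix, not lamp's tensor-based mask construction:

// Sketch: mask(query)(key) is true when attention is permitted.
def causalMask(seqLen: Int): Array[Array[Boolean]] =
  Array.tabulate(seqLen, seqLen)((query, key) => key <= query)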
Selected sequence positions in the output of the transformer stack are linearly mapped back to the vocabulary size, producing one logit per vocabulary entry at each selected position.
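Again as an illustrative sketch with plain arrays, not the module's real implementation: gather the hidden states at the selected positions, then apply a linear map to vocabulary-sized logits. logitsAt and its weight parameter are hypothetical stand-ins for the final linear mapping.

// Sketch: select positions, then project (dim) -> (vocabSize).
def logitsAt(
    hidden: Array[Array[Double]], // (seqLen, dim) transformer output
    positions: Array[Int],        // positions to predict at
    weight: Array[Array[Double]]  // (dim, vocabSize) projection matrix
): Array[Array[Double]] =
  positions.map { pos =>
    val h = hidden(pos)
    Array.tabulate(weight.head.length) { v =>
      h.indices.map(d => h(d) * weight(d)(v)).sum
    }
  }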
Attributes
- Companion: object
- Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any