LanguageModelLoss

lamp.nn.languagemodel.LanguageModelLoss$

See the LanguageModelLoss companion class

Attributes

Companion: class
Supertypes: trait Product, trait Mirror, class Object, trait Matchable, class Any
Self type: LanguageModelLoss.type

MirroredElemLabels: The names of the product elements

MirroredLabel: The name of the type

Allocates a language model module with a negative log-likelihood loss.

attentionHiddenPerHeadDim: Per-head hidden dimension in the multi-head attention.
attentionNumHeads: Number of attention heads in the multi-head attention.
embeddingDim: Width of the initial embedding, and also the output width of each transformer block.
encoderMlpHiddenDim: Hidden dimension of the MLP within each transformer block.
linearized: Whether to use linearized self-attention.
maxLength: Total sequence length, including padding if used. Sometimes called block length or context length.
numBlocks: Number of transformer blocks (layers).
padToken: Token ignored during loss computation. Not used otherwise.
tOpt: TensorOption to set device and data type.
vocabularySize: Total vocabulary size.
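
The role of padToken in the loss can be illustrated with a small, self-contained sketch. It does not use lamp at all: it works on plain arrays rather than tensors, and the names `MaskedNllSketch`, `logSoftmax`, and `maskedNll` are hypothetical. It also assumes the loss is averaged over the non-padded positions, which may differ from the reduction lamp actually applies.

```scala
// Illustrative sketch only: plain-array negative log-likelihood that skips
// padded positions. Not lamp's implementation; all names are hypothetical.
object MaskedNllSketch {

  /** Numerically stable log-softmax over one row of logits. */
  def logSoftmax(logits: Array[Double]): Array[Double] = {
    val max = logits.max
    val logSumExp = max + math.log(logits.map(l => math.exp(l - max)).sum)
    logits.map(_ - logSumExp)
  }

  /** Mean negative log-likelihood over positions whose target is not padToken.
    *
    * @param logits   one row of vocabularySize scores per sequence position
    * @param targets  target token id per sequence position
    * @param padToken positions with this target contribute nothing to the loss
    */
  def maskedNll(
      logits: Array[Array[Double]],
      targets: Array[Int],
      padToken: Int
  ): Double = {
    require(logits.length == targets.length, "one logit row per target")
    val perPosition = logits.zip(targets).collect {
      case (row, target) if target != padToken =>
        val logProbs = logSoftmax(row)
        -logProbs(target)
    }
    // Assumption: average over non-padded positions; 0.0 if everything is padding.
    if (perPosition.isEmpty) 0.0 else perPosition.sum / perPosition.length
  }
}
```

With uniform logits over a vocabulary of three tokens, each non-padded position contributes ln 3 to the sum, and changing the logits at a padded position leaves the result unchanged.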
