lamp.nn.languagemodel
Members list
Type members
Classlikes
Input to language model
Value parameters
- maxLength: batch x sequence OR batch; see maskedSoftmax. Defines the masking of the attention matrix. Use cases:
- Left-to-right (causal) attention with uniform sequence length. In this case use a batch x sequence 2D matrix with arange(0,sequence) in each row.
- Variable-length sequences with bidirectional attention. In this case use a 1D [batch] vector with the real length of each sequence (the remaining positions are padding).
- If empty, the attention matrix is not masked.
- positions: batch x sequence, type long in [0,sequence]; selects positions. Final LM logits are computed only on the selected positions. If empty, all positions are selected.
- tokens: batch x sequence, type long
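The two maxLength encodings and the role of positions can be illustrated with plain Scala collections (a sketch only; in lamp these would be long tensors, and the helper names here are made up):

```scala
// Sketch of the maxLength encodings and position selection, using plain
// Scala collections instead of lamp's long tensors. Helper names are
// illustrative, not part of lamp's API.

// Causal (left-to-right) attention with uniform sequence length:
// a batch x sequence matrix whose every row is arange(0, sequence).
def causalMaxLength(batch: Int, sequence: Int): Seq[Seq[Long]] =
  Seq.fill(batch)((0L until sequence.toLong).toSeq)

// Variable-length sequences with bidirectional attention:
// a 1D [batch] vector with the real (unpadded) length of each sequence.
def variableMaxLength(realLengths: Seq[Long]): Seq[Long] = realLengths

// positions selects, per batch element, the sequence positions on which
// the final LM logits are computed.
def selectPositions[A](rows: Seq[Seq[A]], positions: Seq[Seq[Int]]): Seq[Seq[A]] =
  rows.zip(positions).map { case (row, ps) => ps.map(row) }
```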
Attributes
- Companion
- object
- Supertypes
- trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Attributes
- Companion
- class
- Supertypes
- trait Product, trait Mirror, class Object, trait Matchable, class Any
- Self type
- LanguageModelInput.type
Module with the language model and a loss
Main training entry point of the language model.
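As a rough illustration of what such a loss computes (this is the general idea only, not lamp's actual implementation; names and shapes are assumed): the mean negative log-likelihood of the target token at each position.

```scala
// Sketch of a language-model loss: mean negative log-likelihood of the
// target token at each position. Illustrative only, not lamp's code.
// logProbs: (sequence, vocabulary) log-probabilities; targets: token ids.
def nllLoss(logProbs: Seq[Seq[Double]], targets: Seq[Int]): Double = {
  val perPosition = logProbs.zip(targets).map { case (lp, t) => -lp(t) }
  perPosition.sum / perPosition.size // mean reduction over positions
}
```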
Attributes
- Companion
- object
- Supertypes
- trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Attributes
- Companion
- class
- Supertypes
- trait Product, trait Mirror, class Object, trait Matchable, class Any
- Self type
- LanguageModelLoss.type
Transformer based language model module
The initial embedding is the sum of the token embedding and the position embedding. The token embedding is learned; the position embedding is also a learned embedding (not sinusoidal etc.).
The initial embeddings are fed into layers of transformer blocks. Attention masking is governed by the input, similarly to what is described in chapter 11.3.2.1 of d2l v1.0.0-beta0.
Selected sequence positions in the output of the transformer chain are linearly mapped back into the desired vocabulary size.
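A minimal sketch of the initial embedding step described above, using plain matrices instead of lamp's trainable parameters (names and shapes here are assumptions for illustration):

```scala
// Sketch of the initial embedding: elementwise sum of a learned token
// embedding and a learned position embedding. The tables are plain
// matrices here; in lamp they would be trainable tensors.
def initialEmbedding(
    tokens: Seq[Int],               // token ids for one sequence
    tokenTable: Seq[Seq[Double]],   // (vocabulary, embeddingDim)
    positionTable: Seq[Seq[Double]] // (maxSequence, embeddingDim)
): Seq[Seq[Double]] =
  tokens.zipWithIndex.map { case (tok, pos) =>
    tokenTable(tok).zip(positionTable(pos)).map { case (t, p) => t + p }
  }
```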
Attributes
- Companion
- object
- Supertypes
- trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Attributes
- Companion
- class
- Supertypes
- trait Product, trait Mirror, class Object, trait Matchable, class Any
- Self type
- LanguageModelModule.type
Output of LM
Value parameters
- encoded: float tensor of size (batch, sequence length, embedding dimension); holds the per-token embeddings
- languageModelLogits: float tensor of size (batch, sequence length, vocabulary size); holds the per-token logits. Use logSoftMax(dim=2) to get log-probabilities.
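The logSoftMax(dim=2) step corresponds to the following per-token computation along the vocabulary dimension, sketched here in plain Scala over a single logit vector:

```scala
// Sketch: numerically stable log-softmax over one vocabulary-sized logit
// vector, i.e. what logSoftMax(dim=2) applies to each (batch, position)
// slice of the logits.
def logSoftmax(logits: Seq[Double]): Seq[Double] = {
  val m = logits.max // subtract the max for numerical stability
  val logSumExp = m + math.log(logits.map(x => math.exp(x - m)).sum)
  logits.map(_ - logSumExp)
}
```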
Attributes
- Supertypes
- trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Attributes
- Companion
- object
- Supertypes
- trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Attributes
- Companion
- class
- Supertypes
- trait Product, trait Mirror, class Object, trait Matchable, class Any
- Self type
Language model input and target for loss calculation
Value parameters
- languageModelTarget: batch x sequence
Attributes
- Companion
- object
- Supertypes
- trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any