Factory for the encoder module of Bert
Input is (tokens, segments), where tokens and segments are both (batch, num tokens) long tensors.
- Value parameters:
  - attentionHiddenPerHeadDim: size of the hidden attention dimension of each attention head
  - attentionNumHeads: number of attention heads
  - dropout: dropout rate
  - embeddingDim: input embedding dimension
  - maxLength: maximum number of tokens
  - mlpHiddenDim: size of the hidden dimension of the two-layer perceptron
  - numBlocks: number of transformer blocks to create
  - out: output dimension
  - padToken: pad token; (batch, seq) positions where tokens == padToken are ignored. Padding is not the same as masking.
  - positionEmbedding: optional float tensor of size (sequence length, embedding dimension). If missing, the absolute positional embeddings from Vaswani et al. 2017 are used. Following the Bert paper, the position embeddings are summed with the token embeddings.
  - tOpt: tensor options
  - vocabularySize: vocabulary size
- Returns:
a module implementing the Bert encoder
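The default positional embedding mentioned for positionEmbedding is the fixed sinusoidal table of Vaswani et al. 2017. A minimal sketch in plain Scala of how such a (maxLength, embeddingDim) table is built (the object and method names here are illustrative, not part of the library API):

```scala
object SinusoidalSketch {
  // Absolute sinusoidal positional embedding (Vaswani et al. 2017).
  // Even columns use sin, odd columns use cos; row `pos` is the
  // embedding for token position `pos`. Following the Bert paper,
  // these rows would be summed with the token embeddings.
  def sinusoidal(maxLength: Int, embeddingDim: Int): Array[Array[Double]] =
    Array.tabulate(maxLength, embeddingDim) { (pos, j) =>
      // Pairs of columns (2i, 2i+1) share the same frequency.
      val exponent = (2 * (j / 2)).toDouble / embeddingDim.toDouble
      val angle = pos.toDouble / math.pow(10000.0, exponent)
      if (j % 2 == 0) math.sin(angle) else math.cos(angle)
    }
}
```

When positionEmbedding is supplied instead, a float tensor of the same (sequence length, embedding dimension) shape replaces this table.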