lamp.nn

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: AdamW.type

Attributes

See also: https://arxiv.org/pdf/1711.05101.pdf Algorithm 2
Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait Optimizer

class Object

trait Matchable

class Any

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait LossCalculation[Variable]

class Object

trait Matchable

class Any

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: BatchNorm.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: BatchNorm2D.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Conv1D.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Conv2D.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Conv2DTransposed.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Debug.type

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait OptimizerHyperparameter

class Object

trait Matchable

class Any

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Dropout.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[A, B]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: EitherModule.type

Learnable mapping from classes to dense vectors. Equivalent to L * W, where L is the n x C one-hot encoded matrix of the classes, * is matrix multiplication, and W is the C x dim dense matrix. W is learnable. L is never computed directly. C is the number of classes and n is the size of the batch.

Input is a long tensor with values in [0, C-1]. Input shape is arbitrary, (*). Output shape is (* x D), where D is the embedding dimension.
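The equivalence between the one-hot product L * W and a plain row lookup can be checked with a small sketch. This uses only the Scala standard library, not lamp tensors; `EmbeddingSketch`, `oneHot`, `matmul` and `lookup` are hypothetical names introduced for illustration.

```scala
object EmbeddingSketch {
  type Matrix = Vector[Vector[Double]]

  // Build the n x C one-hot matrix L from class ids in [0, C-1].
  def oneHot(ids: Vector[Int], numClasses: Int): Matrix =
    ids.map(i => Vector.tabulate(numClasses)(c => if (c == i) 1.0 else 0.0))

  // Plain dense matrix multiplication.
  def matmul(a: Matrix, b: Matrix): Matrix = {
    val bt = b.transpose
    a.map(row => bt.map(col => row.zip(col).map { case (x, y) => x * y }.sum))
  }

  // The lookup form: select row i of W for each id, without ever building L.
  def lookup(ids: Vector[Int], w: Matrix): Matrix =
    ids.map(i => w(i))
}
```

Both forms give the same result, which is why the one-hot matrix never needs to be materialized.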

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Embedding.type

Wraps a (sequence x batch) long -> (sequence x batch x dim) double stateful module and runs it in greedy (argmax) generation mode over timeSteps steps.
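As a rough illustration of greedy (argmax) generation, the sketch below feeds the highest-scoring token back in as the next input for a fixed number of steps. It is not the lamp implementation: `GreedySketch`, `generate` and the shape of `step` are assumptions, and batching is left out.

```scala
object GreedySketch {
  // step is a hypothetical stateful model: (token, state) => (scores per class, next state).
  def generate[S](
      step: (Int, S) => (Vector[Double], S),
      start: Int,
      init: S,
      timeSteps: Int
  ): Vector[Int] = {
    val (tokens, _, _) =
      (0 until timeSteps).foldLeft((Vector.empty[Int], start, init)) {
        case ((acc, token, state), _) =>
          val (scores, next) = step(token, state)
          // Greedy choice: index of the maximum score.
          val best = scores.indexOf(scores.max)
          (acc :+ best, best, next)
      }
    tokens
  }
}
```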

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, T), (Variable, T)]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: FreeRunningRNN.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Fun.type

Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * hidden dim).

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Option[Variable]), (Variable, Option[Variable])]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: GRU.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[A, B]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: GenericFun.type

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: GenericModule.type

Base type of modules

Modules are functions of type (Seq[lamp.autograd.Constant],A) => B, where the Seq[lamp.autograd.Constant] arguments are optimizable parameters and A is a non-optimizable input.

Modules provide a way to build composite functions while also keeping track of the parameter list of the composite function.

Example

case object Weights extends LeafTag
case object Bias extends LeafTag
case class Linear(weights: Constant, bias: Option[Constant]) extends Module {

 override val state = List(
   weights -> Weights
 ) ++ bias.toList.map(b => (b, Bias))

 def forward[S: Sc](x: Variable): Variable = {
   val v = x.mm(weights)
   bias.map(_ + v).getOrElse(v)

 }
}

Some other attributes of modules are attached by type classes, e.g. the nn.TrainingMode and nn.Load type classes.
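The idea of a module as a function of (parameters, input) whose composite keeps track of the combined parameter list can be sketched without lamp. `ModuleSketch` and `Mod` are hypothetical stand-ins, with plain `Double` values in place of `lamp.autograd.Constant`.

```scala
object ModuleSketch {
  // A toy module: a parameter list plus a function of (parameters, input).
  final case class Mod[A, B](params: Seq[Double], run: (Seq[Double], A) => B) {
    def apply(a: A): B = run(params, a)

    // Composition: the composite owns the concatenated parameter list
    // and hands each part its own slice.
    def andThen[C](next: Mod[B, C]): Mod[A, C] = {
      val n = params.size
      Mod[A, C](
        params ++ next.params,
        (ps: Seq[Double], a: A) => next.run(ps.drop(n), run(ps.take(n), a))
      )
    }
  }
}
```

For example, composing a scaling module with parameter 2.0 and a shifting module with parameter 3.0 yields a module whose parameter list is Seq(2.0, 3.0), which an optimizer could then update as one unit.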

Type parameters

A: the argument type of the module
B: the value type of the module

Attributes

See also: nn.Module is an alias for simple Variable => Variable modules
Companion: object
Supertypes: class Object

trait Matchable

class Any
Known subtypes: class BertEncoder

class BertLoss

class BertPretrainModule

class MaskedLanguageModelModule

class GraphAttention

class LanguageModelLoss

class LanguageModelModule

class EitherModule[A, B, M1, M2]

class GenericFun[A, B]

class MultiheadAttention

class Recursive[A, M]

class Seq2[T1, T2, T3, M1, M2]

class Seq3[T1, T2, T3, T4, M1, M2, M3]

class Seq4[T1, T2, T3, T4, T5, M1, M2, M3, M4]

class Seq5[T1, T2, T3, T4, T5, T6, M1, M2, M3, M4, M5]

class Seq6[T1, T2, T3, T4, T5, T6, T7, M1, M2, M3, M4, M5, M6]

class Sequential[A, M]

class Transformer

class TransformerDecoder

class TransformerDecoderBlock

class TransformerEmbedding

class TransformerEncoder

class TransformerEncoderBlock

class UnliftedModule[A, B, C, D, M]

class WrapFun[A, B, M, O]

Type class about how to initialize recurrent neural networks

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: InitState.type

Attributes

Supertypes: class Object

trait Matchable

class Any

Inputs of size (sequence length * batch * vocab). Outputs of size (sequence length * batch * output dim).

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Option[(Variable, Variable)]), (Variable, Option[(Variable, Variable)])]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: LSTM.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: LayerNorm.type

Attributes

Supertypes: trait PTag

class Object

trait Matchable

class Any
Known subtypes: object PositionalEmbeddingWeight.type

object Weights.type

object Bias.type

object RunningMean.type

object RunningVar.type

object Weights.type

object Bias.type

object Weights.type

object Bias.type

object Weights.type

object Bias.type

object Weights.type

object Bias.type

object Weights.type

object Weights.type

object BiasH.type

object BiasR.type

object BiasZ.type

object WeightHh.type

object WeightHr.type

object WeightHz.type

object WeightXh.type

object WeightXr.type

object WeightXz.type

object BiasC.type

object BiasF.type

object BiasI.type

object BiasO.type

object WeightHc.type

object WeightHf.type

object WeightHi.type

object WeightHo.type

object WeightXc.type

object WeightXf.type

object WeightXi.type

object WeightXo.type

object Bias.type

object Scale.type

object Bias.type

object Weights.type

object WeightsK.type

object WeightsO.type

object WeightsQ.type

object WeightsV.type

object NoTag.type

object BiasH.type

object WeightHh.type

object WeightXh.type

object Bias.type

object Weight.type

object Bias1.type

object Bias2.type

object Weights1.type

object Weights2.type

object Embedding.type

object Bias1.type

object Bias2.type

object Scale1.type

object Scale2.type

object Weights1.type

object Weights2.type

object Bias.type

object WeightsG.type

object WeightsV.type

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: LearningRateSchedule.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Unit), (Variable, Unit)]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: LiftedModule.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Linear.type

Type class about how to load the contents of the state of modules from external tensors

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: Load.type

Attributes

Supertypes: class Object

trait Matchable

class Any

Loss and Gradient calculation

Takes samples, target, module, and loss function, and computes the loss and the gradients.

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class AdversarialTraining

class PerturbedLossCalculation[I]

class SimpleLossCalculation[I]

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class BCEWithLogits

object Identity.type

object MSE.type

class NLL

class SequenceNLL

class SmoothL1Loss

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: LossFunctions.type

Factory for multilayer fully connected feed forward networks

Returned network has the following repeated structure: [linear -> batchnorm -> nonlinearity -> dropout]*

The last block does not include the nonlinearity and the dropout.

Value parameters

dropout: dropout applied to each block
hidden: list of hidden dimensions
in: input dimensions
out: output dimensions
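Reading the description literally, the layer layout for given dimensions can be sketched as below. This only lists layer names and does not build lamp modules; `MlpSketch` and `layout` are hypothetical, and keeping batchnorm in the last block is an assumption based on the statement that only the nonlinearity and dropout are dropped there.

```scala
object MlpSketch {
  // Lists the layers of [linear -> batchnorm -> nonlinearity -> dropout]*,
  // with the last block stopping after batchnorm (assumed from the description).
  def layout(in: Int, out: Int, hidden: Seq[Int], dropout: Double): Seq[String] = {
    val dims = (in +: hidden) :+ out
    val blocks = dims.zip(dims.tail)
    blocks.zipWithIndex.flatMap { case ((i, o), idx) =>
      val core = Seq(s"linear($i->$o)", s"batchnorm($o)")
      if (idx == blocks.size - 1) core
      else core ++ Seq("nonlinearity", s"dropout($dropout)")
    }
  }
}
```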

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: MLP.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(A, C), (B, D)]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: MappedState.type

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

class Object

trait Matchable

class Any

Multi-head scaled dot product attention module

Input: (query,key,value,maxLength) where

query: batch x num queries x query dim
key: batch x num k-v x key dim
value: batch x num k-v x value dim
maxLength: 1D or 2D long tensor for attention masking
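For orientation, scaled dot product attention for a single head and a single batch element can be sketched as follows. This is a plain standard-library illustration, not the lamp module: `AttentionSketch` and `attend` are hypothetical names, and the per-head projections, batching and masking are omitted.

```scala
object AttentionSketch {
  type Matrix = Vector[Vector[Double]]

  // Numerically stable softmax over one row of scores.
  def softmax(xs: Vector[Double]): Vector[Double] = {
    val m = xs.max
    val e = xs.map(x => math.exp(x - m))
    val total = e.sum
    e.map(_ / total)
  }

  // query: numQueries x d, key: numKv x d, value: numKv x dv.
  // Returns numQueries x dv.
  def attend(query: Matrix, key: Matrix, value: Matrix): Matrix = {
    val scale = math.sqrt(query.head.size.toDouble)
    val valueColumns = value.transpose
    query.map { q =>
      // Dot product of the query with every key, scaled by sqrt(d).
      val scores = key.map(k => q.zip(k).map { case (a, b) => a * b }.sum / scale)
      val weights = softmax(scores)
      // Weighted average of the value rows.
      valueColumns.map(col => col.zip(weights).map { case (v, w) => v * w }.sum)
    }
  }
}
```

A multi-head variant would typically run this on several learned projections of query, key and value and concatenate the results.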

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Variable, Variable, Option[STen]), Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: MultiheadAttention.type

Attributes

Supertypes: trait Singleton

trait Product

trait Mirror

trait Serializable

trait Product

trait Equals

trait LeafTag

trait PTag

class Object

trait Matchable

class Any
Self type: NoTag.type

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class AdamW

class RAdam

class SGDW

class Shampoo

class Yogi

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class DependentHyperparameter

class simple

A small trait to mark parameters for unique identification

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any
Known subtypes: class Tag[T]

trait LeafTag

object PositionalEmbeddingWeight.type

object Weights.type

object Bias.type

object RunningMean.type

object RunningVar.type

object Weights.type

object Bias.type

object Weights.type

object Bias.type

object Weights.type

object Bias.type

object Weights.type

object Bias.type

object Weights.type

object Weights.type

object BiasH.type

object BiasR.type

object BiasZ.type

object WeightHh.type

object WeightHr.type

object WeightHz.type

object WeightXh.type

object WeightXr.type

object WeightXz.type

object BiasC.type

object BiasF.type

object BiasI.type

object BiasO.type

object WeightHc.type

object WeightHf.type

object WeightHi.type

object WeightHo.type

object WeightXc.type

object WeightXf.type

object WeightXi.type

object WeightXo.type

object Bias.type

object Scale.type

object Bias.type

object Weights.type

object WeightsK.type

object WeightsO.type

object WeightsQ.type

object WeightsV.type

object NoTag.type

object BiasH.type

object WeightHh.type

object WeightXh.type

object Bias.type

object Weight.type

object Bias1.type

object Bias2.type

object Weights1.type

object Weights2.type

object Embedding.type

object Bias1.type

object Bias2.type

object Scale1.type

object Scale2.type

object Weights1.type

object Weights2.type

object Bias.type

object WeightsG.type

object WeightsV.type

class Tag[T]

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: PTag.type

Evaluates the gradient at current point + eps where eps is I * N(0,noiseLevel)
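A minimal sketch of evaluating a gradient at a noise-perturbed point is shown below. It is not the lamp implementation: `PerturbSketch` and `perturbedGradient` are hypothetical, and treating noiseLevel as the standard deviation of the Gaussian (rather than its variance) is an assumption.

```scala
import scala.util.Random

object PerturbSketch {
  // Evaluates grad at params + eps, with eps drawn independently per coordinate.
  // noiseLevel is used as the standard deviation here (an assumption).
  def perturbedGradient(
      params: Vector[Double],
      grad: Vector[Double] => Vector[Double],
      noiseLevel: Double,
      rng: Random
  ): Vector[Double] = {
    val shifted = params.map(p => p + noiseLevel * rng.nextGaussian())
    grad(shifted)
  }
}
```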

Attributes

Supertypes: trait LossCalculation[I]

class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: PositionalEmbedding.type

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: RAdam.type

Rectified Adam optimizer algorithm

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait Optimizer

class Object

trait Matchable

class Any

Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * hidden dim).

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Option[Variable]), (Variable, Option[Variable])]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: RNN.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[A, A]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Recursive.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: ResidualModule.type

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: SGDW.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait Optimizer

class Object

trait Matchable

class Any

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[T1, T3]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Seq2.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[((Variable, Variable), S0), (Variable, S1)]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Seq2Seq.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[T1, T4]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Seq3.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[T1, T5]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Seq4.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[T1, T6]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Seq5.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[T1, T7]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Seq6.type

Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * output dim). Applies a linear function to each time step.
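Applying one linear function independently at every time step can be sketched with nested vectors standing in for a (sequence length * batch * in dim) tensor. `SeqLinearSketch`, `linear` and `seqLinear` are hypothetical names, and the weight layout (in dim x output dim) is an assumption for this illustration.

```scala
object SeqLinearSketch {
  type Matrix = Vector[Vector[Double]]

  // x: in dim, w: in dim x output dim, b: output dim.
  def linear(x: Vector[Double], w: Matrix, b: Vector[Double]): Vector[Double] =
    w.transpose.zip(b).map { case (col, bias) =>
      x.zip(col).map { case (a, c) => a * c }.sum + bias
    }

  // input: sequence length x batch x in dim. The same w and b are reused at each step.
  def seqLinear(input: Vector[Matrix], w: Matrix, b: Vector[Double]): Vector[Matrix] =
    input.map(step => step.map(row => linear(row, w, b)))
}
```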

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: SeqLinear.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[A, A]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Sequential.type

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Shampoo.type

Attributes

See also: https://arxiv.org/pdf/1802.09568.pdf Algorithm 1
Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait Optimizer

class Object

trait Matchable

class Any

Attributes

Supertypes: trait LossCalculation[I]

class Object

trait Matchable

class Any

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(T1, (S1, S2)), (T3, (S1, S2))]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: StatefulSeq2.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(T1, (S1, S2, S3)), (T4, (S1, S2, S3))]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: StatefulSeq3.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(T1, (S1, S2, S3, S4)), (T5, (S1, S2, S3, S4))]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: StatefulSeq4.type

case class StatefulSeq5[T1, T2, T3, T4, T5, T6, S1, S2, S3, S4, S5, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3], M4 <: StatefulModule[T4, T5, S4], M5 <: StatefulModule[T5, T6, S5]](m1: M1 & StatefulModule[T1, T2, S1], m2: M2 & StatefulModule[T2, T3, S2], m3: M3 & StatefulModule[T3, T4, S3], m4: M4 & StatefulModule[T4, T5, S4], m5: M5 & StatefulModule[T5, T6, S5]) extends StatefulModule[T1, T6, (S1, S2, S3, S4, S5)]

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(T1, (S1, S2, S3, S4, S5)), (T6, (S1, S2, S3, S4, S5))]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: StatefulSeq5.type

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any

Type class about how to switch a module into training or evaluation mode

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: TrainingMode.type

Attributes

Supertypes: class Object

trait Matchable

class Any

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Variable, Option[STen], Option[STen]), Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Transformer.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Variable, Option[STen]), Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: TransformerDecoder.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Variable, Option[STen]), Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: TransformerDecoderBlock.type

A module with positional and token embeddings

Token embeddings are lookup embeddings. Positional embeddings are supplied as a constant. They are supposed to come from a fixed unlearned derivation of the positions.

Token and positional embeddings are summed.

Gradients are not computed for positionalEmbedding.
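The summation of token and positional embeddings can be sketched for a single sequence as below. `TransformerEmbeddingSketch` and `embed` are hypothetical; the positional table is passed in as a fixed constant, and batching and gradient handling are left out.

```scala
object TransformerEmbeddingSketch {
  // tokenTable: vocabulary x dim (learned lookup table).
  // positional: sequence length x dim (fixed, supplied as a constant).
  def embed(
      tokens: Vector[Int],
      tokenTable: Vector[Vector[Double]],
      positional: Vector[Vector[Double]]
  ): Vector[Vector[Double]] =
    tokens.zipWithIndex.map { case (token, position) =>
      // Look up the token row and add the row for this position.
      tokenTable(token).zip(positional(position)).map { case (a, b) => a + b }
    }
}
```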

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: TransformerEmbedding.type

TransformerEncoder module

Does not include initial embedding or position encoding.

Input is (data, maxLength), where data is a (batch, sequence, input dimension) double tensor and maxLength is a 1D or 2D long tensor used for attention masking.

Attention masking is implemented similarly to chapter 11.3.2.1 in d2l.ai v1.0.0-beta0. It supports unmasked attention, attention on variable length input, and left-to-right attention.
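The length-based masking described in d2l.ai can be sketched as a masked softmax: score positions at or beyond the valid length are replaced with a large negative number before the softmax. This is a standard-library illustration, not the lamp code; `MaskSketch`, `maskedSoftmax` and `leftToRight` are hypothetical names, and the fill value -1e6 follows the d2l.ai text.

```scala
object MaskSketch {
  // Softmax over one row of attention scores, ignoring positions >= validLength.
  def maskedSoftmax(scores: Vector[Double], validLength: Int): Vector[Double] = {
    val masked = scores.zipWithIndex.map { case (s, i) => if (i < validLength) s else -1e6 }
    val m = masked.max
    val e = masked.map(x => math.exp(x - m))
    val total = e.sum
    e.map(_ / total)
  }

  // Left-to-right attention: query i may only attend to keys 0..i,
  // which corresponds to a per-query valid length of i + 1.
  def leftToRight(scores: Vector[Vector[Double]]): Vector[Vector[Double]] =
    scores.zipWithIndex.map { case (row, i) => maskedSoftmax(row, i + 1) }
}
```

In this picture a 1D maxLength gives one valid length per sequence, while a 2D maxLength gives one per query, which is enough to express left-to-right attention.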

Output is (batch, sequence, output dimension)

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Option[STen]), Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: TransformerEncoder.type

A single block of the transformer self attention encoder using GELU

Input is (data, maxLength), where data is a (batch, sequence, input dimension) double tensor and maxLength is a 1D or 2D long tensor used for attention masking.

The order of operations depends on the gptOrder parameter. If gptOrder is true then:

y = attention(norm(input))+input
result = mlp(norm(y))+y
Note that in this case there is no normalization at the end of the transformer. One may want to add one separately. This is how GPT2 is defined in Hugging Face or nanoGPT.
Note that the residual connection has a path which does not flow through the normalization.
- dimension-wise learnable scale parameter in each residual path

If gptOrder is false then:

y = norm(attention(input)+input)
result = norm(mlp(y)+y)
This follows chapter 11.7 in d2l.ai v1.0.0-beta0 (same as in https://arxiv.org/pdf/1706.03762.pdf).
Note that the residual connection has a path which flows through the normalization.
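The two orderings can be written side by side with the sublayers passed in as plain functions. This is a sketch of the formulas only: `BlockOrderSketch` and `block` are hypothetical, the learnable per-dimension scale is omitted, and a single `norm` function stands in for what would be separate normalization layers.

```scala
object BlockOrderSketch {
  type T = Vector[Double]

  def add(a: T, b: T): T = a.zip(b).map { case (x, y) => x + y }

  def block(input: T, gptOrder: Boolean)(attention: T => T, mlp: T => T, norm: T => T): T =
    if (gptOrder) {
      // Normalize before each sublayer; the residual path bypasses norm.
      val y = add(attention(norm(input)), input)
      add(mlp(norm(y)), y)
    } else {
      // Normalize after each residual sum; the residual path goes through norm.
      val y = norm(add(attention(input), input))
      norm(add(mlp(y), y))
    }
}
```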

Output is (batch, sequence, output dimension)

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(Variable, Option[STen]), Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: TransformerEncoderBlock.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[A, B]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: UnliftedModule.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[Variable, Variable]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: WeightNormLinear.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[(A, C), (B, C)]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: WithInit.type

Attributes

Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait GenericModule[A, (B, O)]

class Object

trait Matchable

class Any

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: WrapFun.type

Attributes

Companion: class
Supertypes: trait Product

trait Mirror

class Object

trait Matchable

class Any
Self type: Yogi.type

The Yogi optimizer algorithm. I added the decoupled weight decay term following https://arxiv.org/pdf/1711.05101.pdf
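A scalar sketch of one Yogi step with a decoupled weight decay term is given below. It follows the update rule of the Yogi paper as commonly stated, not the lamp source: `YogiSketch`, `State` and `step` are hypothetical names, and scaling the decay term by the learning rate is an assumption.

```scala
object YogiSketch {
  final case class State(m: Double, v: Double)

  // One update of a single parameter x given its gradient g.
  def step(
      x: Double,
      g: Double,
      s: State,
      lr: Double,
      beta1: Double,
      beta2: Double,
      eps: Double,
      weightDecay: Double
  ): (Double, State) = {
    val g2 = g * g
    // First moment: exponential moving average of the gradient.
    val m = beta1 * s.m + (1 - beta1) * g
    // Yogi second moment: additive update controlled by sign(v - g^2).
    val v = s.v - (1 - beta2) * math.signum(s.v - g2) * g2
    // Decoupled weight decay: applied to x directly, not mixed into the gradient.
    val updated = x - lr * m / (math.sqrt(v) + eps) - lr * weightDecay * x
    (updated, State(m, v))
  }
}
```

The difference from Adam is confined to the second moment line, where the change in v is bounded by the size of the squared gradient.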

Attributes

See also: https://papers.nips.cc/paper/2018/file/90365351ccc7437a1309dc64e4db32a3-Paper.pdf Algorithm 2
Companion: object
Supertypes: trait Serializable

trait Product

trait Equals

trait Optimizer

class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: sequence.type

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait OptimizerHyperparameter

class Object

trait Matchable

class Any

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: statefulSequence.type
