nn

package nn

Provides building blocks for neural networks

Notable types:

nn.GenericModule is an abstraction over parametric functions
nn.Optimizer is an abstraction of gradient based optimizers
nn.LossFunction is an abstraction of loss functions, see the companion object for the implemented losses
nn.SupervisedModel combines a module with a loss

Optimizers:

nn.AdamW
nn.RAdam
nn.SGDW
nn.Yogi

Modules facilitating composing other modules:

nn.Sequential composes a homogeneous list of modules (analogous to List)
nn.sequence composes a heterogeneous list of modules (analogous to tuples)
nn.EitherModule composes two modules in a scala.Either
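
The difference between the homogeneous and the heterogeneous combinators can be illustrated without the library. The following is a minimal sketch in plain Scala. `ToyModule`, `ToySequential` and `ToySeq2` are hypothetical stand-ins, not the lamp types, and they leave out parameters and automatic differentiation.

```scala
object CompositionSketch {
  // Hypothetical stand-in for a module: a plain function from A to B.
  trait ToyModule[A, B] { def forward(x: A): B }

  // Homogeneous composition: every member maps A to A, so any number of
  // members can be kept in one sequence and folded over (analogous to List).
  final case class ToySequential[A](members: ToyModule[A, A]*) extends ToyModule[A, A] {
    def forward(x: A): A = members.foldLeft(x)((acc, m) => m.forward(acc))
  }

  // Heterogeneous composition: the members have different types, so each
  // one is a separately typed field (analogous to a tuple).
  final case class ToySeq2[T1, T2, T3](m1: ToyModule[T1, T2], m2: ToyModule[T2, T3])
      extends ToyModule[T1, T3] {
    def forward(x: T1): T3 = m2.forward(m1.forward(x))
  }

  val addOne: ToyModule[Int, Int] = new ToyModule[Int, Int] { def forward(x: Int): Int = x + 1 }
  val timesTwo: ToyModule[Int, Int] = new ToyModule[Int, Int] { def forward(x: Int): Int = x * 2 }
  val show: ToyModule[Int, String] = new ToyModule[Int, String] {
    def forward(x: Int): String = s"value=$x"
  }

  // Computes ((x + 1) * 2) + 1
  val homogeneous = ToySequential(addOne, timesTwo, addOne)
  // Int => Int followed by Int => String
  val heterogeneous = ToySeq2(homogeneous, show)
}
```

The trade-off shown here is that the sequence form accepts any number of members but forces a single input and output type, while the tuple form allows the types to change between members at the cost of a fixed arity.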

Examples of neural network building blocks, layers etc:

nn.Linear implements W X + b with parameters W and b and input X
nn.BatchNorm, nn.LayerNorm implement batch and layer normalization
nn.MLP is a factory of a multilayer perceptron architecture

Type Members

case class AdamW(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter = simple(0.001), beta1: OptimizerHyperparameter = simple(0.9), beta2: OptimizerHyperparameter = simple(0.999), eps: Double = 1e-8, clip: Option[Double] = None, debias: Boolean = true) extends Optimizer with Product with Serializable

See also
https://arxiv.org/pdf/1711.05101.pdf Algorithm 2
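
As an illustration of the referenced algorithm, the sketch below applies one Adam step with decoupled weight decay to a single scalar parameter. It is not the lamp implementation: `State` and `step` are hypothetical names, the schedule multiplier of the paper is assumed to be 1, and gradient clipping is omitted.

```scala
object AdamWSketch {
  // Optimizer state for a single scalar parameter (hypothetical type).
  final case class State(theta: Double, m: Double, v: Double, t: Int)

  // One Adam step with decoupled weight decay on a scalar parameter.
  def step(
      s: State,
      grad: Double,
      lr: Double = 0.001,
      weightDecay: Double = 0.0,
      beta1: Double = 0.9,
      beta2: Double = 0.999,
      eps: Double = 1e-8
  ): State = {
    val t = s.t + 1
    // Exponential moving averages of the gradient and the squared gradient
    val m = beta1 * s.m + (1 - beta1) * grad
    val v = beta2 * s.v + (1 - beta2) * grad * grad
    // Bias correction of both moving averages
    val mHat = m / (1 - math.pow(beta1, t))
    val vHat = v / (1 - math.pow(beta2, t))
    // The decay acts on the parameter directly and is not added to the gradient
    val theta = s.theta - lr * mHat / (math.sqrt(vHat) + eps) - weightDecay * s.theta
    State(theta, m, v, t)
  }
}
```

For example, `AdamWSketch.step(AdamWSketch.State(1.0, 0.0, 0.0, 0), grad = 1.0)` moves the parameter by roughly the learning rate, because the bias-corrected ratio is close to 1 on the first step.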
case class AdversarialTraining(eps: Double) extends LossCalculation[Variable] with Product with Serializable
case class AttentionDecoder[T, M <: StatefulModule[Variable, Variable, T], M0 <: Module](decoder: M with StatefulModule[Variable, Variable, T], embedding: M0 with Module, stateToKey: (T) ⇒ Variable, keyValue: Variable, tokens: Variable, padToken: Long) extends StatefulModule[Variable, Variable, T] with Product with Serializable
case class BatchNorm(weight: Constant, bias: Constant, runningMean: Constant, runningVar: Constant, training: Boolean, momentum: Double, eps: Double) extends Module with Product with Serializable
case class BatchNorm2D(weight: Constant, bias: Constant, runningMean: Constant, runningVar: Constant, training: Boolean, momentum: Double, eps: Double) extends Module with Product with Serializable
case class Conv1D(weights: Constant, bias: Constant, stride: Long, padding: Long, dilation: Long, groups: Long) extends Module with Product with Serializable
case class Conv2D(weights: Constant, bias: Constant, stride: Long, padding: Long, dilation: Long, groups: Long) extends Module with Product with Serializable
case class Conv2DTransposed(weights: Constant, bias: Constant, stride: Long, padding: Long, dilation: Long) extends Module with Product with Serializable
case class DependentHyperparameter(default: Double)(pf: PartialFunction[PTag, Double]) extends OptimizerHyperparameter with Product with Serializable
case class Dropout(prob: Double, training: Boolean) extends Module with Product with Serializable
case class EitherModule[A, B, M1 <: GenericModule[A, B], M2 <: GenericModule[A, B]](members: Either[M1 with GenericModule[A, B], M2 with GenericModule[A, B]]) extends GenericModule[A, B] with Product with Serializable
case class Embedding(weights: Constant) extends Module with Product with Serializable
Learnable mapping from classes to dense vectors. Equivalent to L * W, where L is the n x C one-hot encoded matrix of the classes, * is matrix multiplication, and W is the learnable C x dim dense matrix. L is never computed directly. C is the number of classes and n is the size of the batch.
Input is a long tensor with values in [0, C-1]. Input shape is arbitrary, (*). Output shape is (* x D), where D is the embedding dimension.
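
The equivalence between the lookup and the product L * W can be checked on a small example. This is a minimal sketch on plain Scala vectors, not on lamp tensors; `matMul`, `oneHot` and `lookup` are hypothetical helpers written for the illustration.

```scala
object EmbeddingSketch {
  type Matrix = Vector[Vector[Double]]

  // Plain matrix multiplication, used only to spell out the L * W definition.
  def matMul(a: Matrix, b: Matrix): Matrix = {
    val columns = b.transpose
    a.map(row => columns.map(col => row.zip(col).map { case (x, y) => x * y }.sum))
  }

  // The n x C one-hot matrix L for a batch of class indices.
  def oneHot(classes: Vector[Int], numClasses: Int): Matrix =
    classes.map(c => Vector.tabulate(numClasses)(j => if (j == c) 1.0 else 0.0))

  // The lookup never builds L: it selects row c of W for each class c.
  def lookup(classes: Vector[Int], w: Matrix): Matrix = classes.map(c => w(c))

  // C = 3 classes, embedding dimension 2
  val w: Matrix = Vector(Vector(0.1, 0.2), Vector(0.3, 0.4), Vector(0.5, 0.6))
  // n = 3 class indices in [0, C-1]
  val batch: Vector[Int] = Vector(2, 0, 2)
}
```

Both `EmbeddingSketch.lookup(batch, w)` and `matMul(oneHot(batch, 3), w)` select rows 2, 0 and 2 of `w`, but the lookup does so without allocating the n x C matrix.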
case class FreeRunningRNN[T, M <: StatefulModule[Variable, Variable, T]](module: M with StatefulModule[Variable, Variable, T], timeSteps: Int) extends StatefulModule[Variable, Variable, T] with Product with Serializable
Wraps a (sequence x batch) long -> (sequence x batch x dim) double stateful module and runs it in greedy (argmax) generation mode over timeSteps steps.
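
Greedy generation feeds the highest scoring token of each step back as the input of the next step. The sketch below shows that loop on a toy model in plain Scala. `Step`, `generate` and `cyclic` are hypothetical names, and the batch and tensor handling of the actual module is left out.

```scala
object GreedyGenerationSketch {
  // Hypothetical single-step model:
  // (token, state) => (scores over the vocabulary, next state)
  type Step[S] = (Int, S) => (Vector[Double], S)

  // Runs the model for timeSteps steps, feeding the argmax of each step
  // back in as the input token of the following step.
  def generate[S](step: Step[S], firstToken: Int, init: S, timeSteps: Int): Vector[Int] = {
    val (tokens, _, _) =
      (0 until timeSteps).foldLeft((Vector.empty[Int], firstToken, init)) {
        case ((acc, token, state), _) =>
          val (scores, nextState) = step(token, state)
          val next = scores.indexOf(scores.max) // greedy (argmax) choice
          (acc :+ next, next, nextState)
      }
    tokens
  }

  // Toy stateless model over 4 tokens that always prefers (token + 1) mod 4.
  val cyclic: Step[Unit] = (token, _) =>
    (Vector.tabulate(4)(i => if (i == (token + 1) % 4) 1.0 else 0.0), ())
}
```

With the toy model, `GreedyGenerationSketch.generate(GreedyGenerationSketch.cyclic, 0, (), 5)` walks through the vocabulary in order and wraps around after the last token.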
case class Fun(fun: (Scope) ⇒ (Variable) ⇒ Variable) extends Module with Product with Serializable
case class GCN[M <: Module](transform: M with Module) extends GenericModule[(Variable, Variable), (Variable, Variable)] with Product with Serializable
case class GRU(weightXh: Constant, weightHh: Constant, weightXr: Constant, weightXz: Constant, weightHr: Constant, weightHz: Constant, biasR: Constant, biasZ: Constant, biasH: Constant) extends StatefulModule[Variable, Variable, Option[Variable]] with Product with Serializable
Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * hidden dim).
case class GenericFun[A, B](fun: (Scope) ⇒ (A) ⇒ B) extends GenericModule[A, B] with Product with Serializable
trait GenericModule[A, B] extends AnyRef
Base type of modules.
Modules are functions of type (Seq[lamp.autograd.Constant],A) => B, where the Seq[lamp.autograd.Constant] arguments are optimizable parameters and A is a non-optimizable input.
Modules provide a way to build composite functions while also keeping track of the parameter list of the composite function.
Example
```
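// Tags that identify each parameter of the module
case object Weights extends LeafTag
case object Bias extends LeafTag

case class Linear(weights: Constant, bias: Option[Constant]) extends Module {

  // Optimizable parameters paired with their tags; the bias is optional
  override val state = List(
    weights -> Weights
  ) ++ bias.toList.map(b => (b, Bias))

  // x * weights, plus the bias when it is defined
  def forward[S: Sc](x: Variable): Variable = {
    val v = x.mm(weights)
    bias.map(_ + v).getOrElse(v)
  }
}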
```
Other attributes of modules are attached through type classes, e.g. the nn.TrainingMode and nn.Load type classes.
A: the argument type of the module
B: the value type of the module

See also
nn.Module is an alias for simple Variable => Variable modules
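
How a composite keeps track of its parameter list can be shown with a small model of the idea. This is a minimal sketch in plain Scala that uses doubles instead of tensors. `Param`, `ToyModule`, `Scale`, `Shift` and `Compose` are hypothetical and only mirror the shape of the abstraction described above.

```scala
object ModuleSketch {
  // Hypothetical stand-in for an optimizable parameter.
  final case class Param(name: String, value: Double)

  // A module exposes its parameters and a forward function.
  trait ToyModule[A, B] {
    def state: Seq[Param]
    def forward(x: A): B
  }

  // Multiplies the input by a single weight.
  final case class Scale(w: Param) extends ToyModule[Double, Double] {
    val state: Seq[Param] = Seq(w)
    def forward(x: Double): Double = w.value * x
  }

  // Adds a single bias to the input.
  final case class Shift(b: Param) extends ToyModule[Double, Double] {
    val state: Seq[Param] = Seq(b)
    def forward(x: Double): Double = x + b.value
  }

  // Composition chains the forward functions and concatenates the
  // parameter lists, so the composite still knows all of its parameters.
  final case class Compose[A, B, C](m1: ToyModule[A, B], m2: ToyModule[B, C])
      extends ToyModule[A, C] {
    def state: Seq[Param] = m1.state ++ m2.state
    def forward(x: A): C = m2.forward(m1.forward(x))
  }

  // Computes 2 * x + 1 and exposes both parameters.
  val affine = Compose(Scale(Param("w", 2.0)), Shift(Param("b", 1.0)))
}
```

An optimizer only needs `affine.state` to find every parameter of the composite, regardless of how deeply the members are nested.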
type GraphModule = GenericModule[(Variable, Variable), (Variable, Variable)]
case class GraphReadout[M <: GraphModule](m: M with GraphModule, pooling: PoolType) extends GenericModule[(Variable, Variable, Variable), Variable] with Product with Serializable
trait InitState[M, C] extends AnyRef
Type class about how to initialize recurrent neural networks
implicit class InitStateSyntax[M, C] extends AnyRef
case class LSTM(weightXi: Constant, weightXf: Constant, weightXo: Constant, weightHi: Constant, weightHf: Constant, weightHo: Constant, weightXc: Constant, weightHc: Constant, biasI: Constant, biasF: Constant, biasO: Constant, biasC: Constant) extends StatefulModule[Variable, Variable, Option[(Variable, Variable)]] with Product with Serializable
Inputs of size (sequence length * batch * vocab). Outputs of size (sequence length * batch * output dim).
case class LayerNorm(scale: Constant, bias: Constant, eps: Double, normalizedDim: List[Int]) extends Module with Product with Serializable
trait LeafTag extends PTag
trait LearningRateSchedule[State] extends AnyRef
case class LiftedModule[M <: Module](mod: M with Module) extends StatefulModule[Variable, Variable, Unit] with Product with Serializable
case class Linear(weights: Constant, bias: Option[Constant]) extends Module with Product with Serializable
trait Load[M] extends AnyRef
Type class about how to load the contents of the state of modules from external tensors
implicit class LoadSyntax[M] extends AnyRef
trait LossCalculation[I] extends AnyRef
trait LossFunction extends AnyRef
case class MappedState[A, B, C, D, M <: StatefulModule[A, B, C]](statefulModule: M with StatefulModule[A, B, C], map: (C) ⇒ D) extends StatefulModule2[A, B, C, D] with Product with Serializable
case class ModelWithOptimizer[I, M <: GenericModule[I, Variable]](model: SupervisedModel[I, M], optimizer: Optimizer) extends Product with Serializable
type Module = GenericModule[Variable, Variable]
case class MultiheadAttention(wQ: Constant, wK: Constant, wV: Constant, wO: Constant, dropout: Double, train: Boolean, numHeads: Int, padToken: Long, linearized: Boolean) extends GenericModule[(Variable, Variable, Variable, STen), Variable] with Product with Serializable
Multi-head scaled dot product attention module.
Input: (query, key, value, tokens), where
query: batch x num queries x query dim
key: batch x num k-v x key dim
value: batch x num k-v x value dim
tokens: batch x num queries, long type
tokens carries the padding information so that padded positions are ignored.
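
The role of the padding information can be illustrated with a single attention head on one batch element. This is a minimal sketch in plain Scala and not the lamp implementation. It assumes the self-attention case, where `tokens` has one entry per key position, and it assumes that padded positions are masked with a score of negative infinity. `attend`, `dot` and `softmax` are hypothetical helpers.

```scala
object AttentionSketch {
  type Matrix = Vector[Vector[Double]]

  def dot(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Numerically stable softmax; assumes at least one finite score.
  def softmax(xs: Vector[Double]): Vector[Double] = {
    val max = xs.max
    val exps = xs.map(x => math.exp(x - max))
    val total = exps.sum
    exps.map(_ / total)
  }

  // Single-head scaled dot product attention for one batch element.
  // Positions whose token equals padToken get a score of negative
  // infinity, so they receive zero weight after the softmax.
  // Assumes at least one position is not padding.
  def attend(
      query: Matrix,
      key: Matrix,
      value: Matrix,
      tokens: Vector[Long],
      padToken: Long
  ): Matrix = {
    val scale = math.sqrt(key.head.size.toDouble)
    val valueColumns = value.transpose
    query.map { q =>
      val scores = key.zip(tokens).map { case (k, token) =>
        if (token == padToken) Double.NegativeInfinity else dot(q, k) / scale
      }
      val weights = softmax(scores)
      // Weighted average of the value rows
      valueColumns.map(column => dot(weights, column))
    }
  }
}
```

When every position but one is padding, the output of `attend` is exactly the value row of the remaining position, which is a quick way to check that the mask works.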
case class NGCN[M <: Module](transforms: Seq[M with Module], weightFc: Constant, K: Int, includeZeroOrder: Boolean) extends GenericModule[(Variable, Variable), (Variable, Variable)] with Product with Serializable
trait Optimizer extends AnyRef
trait OptimizerHyperparameter extends AnyRef
trait PTag extends AnyRef
A small trait to mark parameters for unique identification
case class Passthrough[M <: Module](m: M with Module) extends GenericModule[(Variable, Variable), (Variable, Variable)] with Product with Serializable
case class RAdam(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter = simple(0.001), beta1: OptimizerHyperparameter = simple(0.9), beta2: OptimizerHyperparameter = simple(0.999), eps: Double = 1e-8, clip: Option[Double] = None) extends Optimizer with Product with Serializable
Rectified Adam optimizer algorithm
case class RNN(weightXh: Constant, weightHh: Constant, biasH: Constant) extends StatefulModule[Variable, Variable, Option[Variable]] with Product with Serializable
Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * hidden dim).
case class ResidualModule[M <: Module](transform: M with Module) extends Module with Product with Serializable
case class SGDW(parameters: Seq[(STen, PTag)], learningRate: OptimizerHyperparameter, weightDecay: OptimizerHyperparameter, momentum: Option[OptimizerHyperparameter] = None, clip: Option[Double] = None) extends Optimizer with Product with Serializable
case class Seq2[T1, T2, T3, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3]) extends GenericModule[T1, T3] with Product with Serializable
case class Seq2Seq[S0, S1, M1 <: StatefulModule2[Variable, Variable, S0, S1], M2 <: StatefulModule[Variable, Variable, S1]](encoder: M1 with StatefulModule2[Variable, Variable, S0, S1], decoder: M2 with StatefulModule[Variable, Variable, S1]) extends StatefulModule2[(Variable, Variable), Variable, S0, S1] with Product with Serializable
case class Seq2SeqWithAttention[S0, S1, M0 <: Module, M1 <: StatefulModule2[Variable, Variable, S0, S1], M2 <: StatefulModule[Variable, Variable, S1]](destinationEmbedding: M0 with Module, encoder: M1 with StatefulModule2[Variable, Variable, S0, S1], decoder: M2 with StatefulModule[Variable, Variable, S1], padToken: Long)(stateToKey: (S1) ⇒ Variable) extends StatefulModule2[(Variable, Variable), Variable, S0, S1] with Product with Serializable
case class Seq3[T1, T2, T3, T4, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4]) extends GenericModule[T1, T4] with Product with Serializable
case class Seq4[T1, T2, T3, T4, T5, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4], M4 <: GenericModule[T4, T5]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4], m4: M4 with GenericModule[T4, T5]) extends GenericModule[T1, T5] with Product with Serializable
case class Seq5[T1, T2, T3, T4, T5, T6, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4], M4 <: GenericModule[T4, T5], M5 <: GenericModule[T5, T6]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4], m4: M4 with GenericModule[T4, T5], m5: M5 with GenericModule[T5, T6]) extends GenericModule[T1, T6] with Product with Serializable
case class Seq6[T1, T2, T3, T4, T5, T6, T7, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4], M4 <: GenericModule[T4, T5], M5 <: GenericModule[T5, T6], M6 <: GenericModule[T6, T7]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4], m4: M4 with GenericModule[T4, T5], m5: M5 with GenericModule[T5, T6], m6: M6 with GenericModule[T6, T7]) extends GenericModule[T1, T7] with Product with Serializable
case class SeqLinear(weight: Constant, bias: Constant) extends Module with Product with Serializable
Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * output dim). Applies a linear function to each time step.
case class Sequential[A, M <: GenericModule[A, A]](members: M with GenericModule[A, A]*) extends GenericModule[A, A] with Product with Serializable
class SimpleLossCalculation[I] extends LossCalculation[I]
type StatefulModule[A, B, C] = GenericModule[(A, C), (B, C)]
type StatefulModule2[A, B, C, D] = GenericModule[(A, C), (B, D)]
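
The two aliases describe modules that thread a state through each call. The sketch below models that shape with plain functions and shows a composition whose state is a tuple of the member states, in the style of StatefulSeq2. `ToyStateful`, `runningSum`, `counter` and `seq2` are hypothetical and work on plain values instead of lamp modules.

```scala
object StatefulSketch {
  // Mirrors the shape of the alias: (input, state) => (output, state).
  type ToyStateful[A, B, C] = ((A, C)) => (B, C)

  // Running sum: the state is the total so far, the output is the new total.
  val runningSum: ToyStateful[Int, Int, Int] = { case (x, total) =>
    (x + total, x + total)
  }

  // Labels the input with a call number and keeps the call count as state.
  val counter: ToyStateful[Int, String, Int] = { case (x, calls) =>
    (s"#${calls + 1}: $x", calls + 1)
  }

  // Composition of two stateful members: the output of the first feeds the
  // second, and the combined state is the tuple (S1, S2).
  def seq2[T1, T2, T3, S1, S2](
      m1: ToyStateful[T1, T2, S1],
      m2: ToyStateful[T2, T3, S2]
  ): ToyStateful[T1, T3, (S1, S2)] = { case (x, (s1, s2)) =>
    val (y, s1Next) = m1((x, s1))
    val (z, s2Next) = m2((y, s2))
    (z, (s1Next, s2Next))
  }

  val combined: ToyStateful[Int, String, (Int, Int)] = seq2(runningSum, counter)
}
```

Each call returns the next state alongside the output, so the caller decides whether to carry the state into the following call or to start again from an initial state.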
case class StatefulSeq2[T1, T2, T3, S1, S2, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2]) extends StatefulModule[T1, T3, (S1, S2)] with Product with Serializable
case class StatefulSeq3[T1, T2, T3, T4, S1, S2, S3, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2], m3: M3 with StatefulModule[T3, T4, S3]) extends StatefulModule[T1, T4, (S1, S2, S3)] with Product with Serializable
case class StatefulSeq4[T1, T2, T3, T4, T5, S1, S2, S3, S4, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3], M4 <: StatefulModule[T4, T5, S4]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2], m3: M3 with StatefulModule[T3, T4, S3], m4: M4 with StatefulModule[T4, T5, S4]) extends StatefulModule[T1, T5, (S1, S2, S3, S4)] with Product with Serializable
case class StatefulSeq5[T1, T2, T3, T4, T5, T6, S1, S2, S3, S4, S5, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3], M4 <: StatefulModule[T4, T5, S4], M5 <: StatefulModule[T5, T6, S5]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2], m3: M3 with StatefulModule[T3, T4, S3], m4: M4 with StatefulModule[T4, T5, S4], m5: M5 with StatefulModule[T5, T6, S5]) extends StatefulModule[T1, T6, (S1, S2, S3, S4, S5)] with Product with Serializable
case class SupervisedModel[I, M <: GenericModule[I, Variable]](module: M with GenericModule[I, Variable], lossFunction: LossFunction, lossCalculation: LossCalculation[I] = new SimpleLossCalculation[I], printMemoryAllocations: Boolean = false)(implicit tm: TrainingMode[M]) extends Product with Serializable
implicit class ToLift[M <: Module] extends AnyRef
implicit class ToMappedState[A, B, C, M <: StatefulModule[A, B, C]] extends AnyRef
implicit class ToUnlift[A, B, C, D, M <: StatefulModule2[A, B, C, D]] extends AnyRef
implicit class ToWithInit[A, B, C, M <: StatefulModule[A, B, C]] extends AnyRef
trait TrainingMode[M] extends AnyRef
Type class about how to switch a module into training or evaluation mode
implicit class TrainingModeSyntax[M] extends AnyRef
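
A type class of this kind attaches the mode switch to a module type without changing the module itself. The sketch below shows the pattern in plain Scala. `ToyTrainingMode`, `ToyDropout`, `ToyTrainingModeSyntax` and the method names `asTraining` and `asEval` are assumptions made for the illustration, not the signatures of the lamp type class.

```scala
object TrainingModeSketch {
  // Hypothetical type class: how to obtain a training or evaluation
  // copy of a module of type M.
  trait ToyTrainingMode[M] {
    def asTraining(m: M): M
    def asEval(m: M): M
  }

  // Toy module with a mode flag, loosely shaped like Dropout(prob, training).
  final case class ToyDropout(prob: Double, training: Boolean)

  // Instance for the toy module: switching the mode copies the module
  // with the flag changed and leaves the original untouched.
  implicit val dropoutMode: ToyTrainingMode[ToyDropout] =
    new ToyTrainingMode[ToyDropout] {
      def asTraining(m: ToyDropout): ToyDropout = m.copy(training = true)
      def asEval(m: ToyDropout): ToyDropout = m.copy(training = false)
    }

  // Syntax enrichment that makes the type class methods available
  // directly on any value with an instance in scope.
  implicit class ToyTrainingModeSyntax[M](m: M)(implicit tm: ToyTrainingMode[M]) {
    def asTraining: M = tm.asTraining(m)
    def asEval: M = tm.asEval(m)
  }

  val evalDropout: ToyDropout = ToyDropout(0.5, training = true).asEval
}
```

Keeping the switch in a type class means that composite modules can derive their own instance from the instances of their members.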
case class TransformerEmbedding(embedding: Embedding, addPositionalEmbedding: Boolean, positionalEmbedding: Constant) extends GenericModule[Variable, (Variable, STen)] with Product with Serializable
Gradients are not computed for positionalEmbedding
case class TransformerEncoder(blocks: Seq[TransformerEncoderBlock]) extends GenericModule[(Variable, STen), Variable] with Product with Serializable
TransformerEncoder module.
Input is (data, tokens), where data is a (batch, num tokens, in dimension) double tensor and tokens is a (batch, num tokens) long tensor.
Output is (batch, num tokens, out dimension).
The sole purpose of tokens is to carry over the padding.
case class TransformerEncoderBlock(attention: MultiheadAttention, layerNorm1: LayerNorm, layerNorm2: LayerNorm, w1: Constant, b1: Constant, w2: Constant, b2: Constant, dropout: Double, train: Boolean) extends GenericModule[(Variable, STen), Variable] with Product with Serializable
A single block of the transformer encoder as defined in Fig 10.7.1 in d2l v0.16
case class UnliftedModule[A, B, C, D, M <: StatefulModule2[A, B, C, D]](statefulModule: M with StatefulModule2[A, B, C, D])(implicit init: InitState[M, C]) extends GenericModule[A, B] with Product with Serializable
case class WeightNormLinear(weightsV: Constant, weightsG: Constant, bias: Option[Constant]) extends Module with Product with Serializable
case class WithInit[A, B, C, M <: StatefulModule[A, B, C]](module: M with StatefulModule[A, B, C], init: C) extends StatefulModule[A, B, C] with Product with Serializable
case class Yogi(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter = simple(0.01), beta1: OptimizerHyperparameter = simple(0.9), beta2: OptimizerHyperparameter = simple(0.999), eps: Double = 1e-3, clip: Option[Double] = None, debias: Boolean = true) extends Optimizer with Product with Serializable
The Yogi optimizer algorithm, with a decoupled weight decay term added following https://arxiv.org/pdf/1711.05101.pdf

See also
https://papers.nips.cc/paper/2018/file/90365351ccc7437a1309dc64e4db32a3-Paper.pdf Algorithm 2
case class simple(v: Double) extends OptimizerHyperparameter with Product with Serializable

Value Members

def gradientClippingInPlace(gradients: Seq[Option[STen]], theta: Double): Unit
def initLinear[S](in: Int, out: Int, tOpt: STenOptions)(implicit arg0: Sc[S]): Constant
object AdamW extends Serializable
object Attention
object BatchNorm extends Serializable
object BatchNorm2D extends Serializable
object Conv1D extends Serializable
object Conv2D extends Serializable
object Conv2DTransposed extends Serializable
object Dropout extends Serializable
object EitherModule extends Serializable
object Embedding extends Serializable
object FreeRunningRNN extends Serializable
object Fun extends Serializable
object GCN extends Serializable
object GRU extends Serializable
object GenericFun extends Serializable
object GenericModule
object GraphReadout extends Serializable
object InitState
object LSTM extends Serializable
object LayerNorm extends Serializable
object LearningRateSchedule
object LiftedModule extends Serializable
object Linear extends Serializable
object Load
object LossFunctions
object MLP
Factory for multilayer fully connected feed forward networks.
The returned network has the following repeated structure: [linear -> batchnorm -> nonlinearity -> dropout]*
The last block does not include the nonlinearity and the dropout.
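
The repeated structure can be spelled out for a concrete list of layer sizes. The sketch below only produces the names of the layers in order. `layout` is a hypothetical helper, not the signature of the factory, and it assumes that the last block keeps the linear and batchnorm layers, as the description above implies.

```scala
object MlpLayoutSketch {
  // Lists the layers for the given sizes, e.g. Seq(in, hidden, out).
  // Each consecutive pair of sizes forms one block.
  def layout(sizes: Seq[Int]): Seq[String] = {
    val blocks = sizes.zip(sizes.drop(1))
    blocks.zipWithIndex.flatMap { case ((in, out), i) =>
      val base = Seq(s"linear(${in}->${out})", s"batchnorm(${out})")
      val isLast = i == blocks.size - 1
      // The last block stops before the nonlinearity and the dropout
      if (isLast) base else base ++ Seq("nonlinearity", "dropout")
    }
  }
}
```

For the sizes 4, 8 and 2 this gives one full block from 4 to 8 followed by a final block from 8 to 2 without nonlinearity and dropout.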
object MappedState extends Serializable
object MultiheadAttention extends Serializable
object NGCN extends Serializable
object NoTag extends LeafTag with Product with Serializable
object PTag
object Passthrough extends Serializable
object PositionalEmbedding
object RAdam extends Serializable
object RNN extends Serializable
object ResidualModule extends Serializable
object SGDW extends Serializable
object Seq2 extends Serializable
object Seq2Seq extends Serializable
object Seq2SeqWithAttention extends Serializable
object Seq3 extends Serializable
object Seq4 extends Serializable
object Seq5 extends Serializable
object Seq6 extends Serializable
object SeqLinear extends Serializable
object Sequential extends Serializable
object StatefulSeq2 extends Serializable
object StatefulSeq3 extends Serializable
object StatefulSeq4 extends Serializable
object StatefulSeq5 extends Serializable
object TrainingMode
object TransformerEmbedding extends Serializable
object TransformerEncoder extends Serializable
object TransformerEncoderBlock extends Serializable
object UnliftedModule extends Serializable
object WeightNormLinear extends Serializable
object WithInit extends Serializable
object Yogi extends Serializable
object sequence
object statefulSequence
