lamp.autograd

Implements reverse mode automatic differentiation

The main types in this package are lamp.autograd.Variable and lamp.autograd.Op. The computational graph built by this package consists of vertices representing values (as lamp.autograd.Variable) and vertices representing operations (as lamp.autograd.Op).

Variables contain the value of an R^n^ => R^m^ function. Variables may also hold their partial derivative with respect to a single downstream scalar. A Variable whose value is a scalar (m=1) can trigger the computation of the partial derivatives of all the intermediate upstream Variables. Computing partial derivatives with respect to non-scalar variables is not supported.

A constant Variable may be created with the const or param factory methods in this package. const may be used for constants which do not need their partial derivatives computed. param, on the other hand, creates Variables which will fill in their partial derivatives. Further Variables may be created by the methods defined on Variable, eventually expressing more complex R^n^ => R^m^ functions.

===Example===

lamp.Scope.root { implicit scope =>
  // x is constant (depends on no other variables) and won't compute a partial derivative
  val x = lamp.autograd.const(STen.eye(3, STenOptions.d))
  // y is constant but will compute a partial derivative
  val y = lamp.autograd.param(STen.ones(List(3, 3), STenOptions.d))

  // z is a Variable with x and y dependencies
  val z = x + y

  // w is a Variable with z as a direct and x, y as transitive dependencies
  val w = z.sum
  // w is a scalar (number of elements is 1), thus we can call backprop() on it.
  // calling backprop will fill out the partial derivatives of the upstream variables
  w.backprop()

  // partialDerivative is empty since we created `x` with `const`
  assert(x.partialDerivative.isEmpty)

  // `y`'s partial derivative is defined and is computed
  // it holds `y`'s partial derivative with respect to `w`, the scalar on which we called backprop()
  assert(y.partialDerivative.isDefined)

}

This package may be used to compute the derivative of any function, provided the function can be composed out of the provided methods. A particular use case is gradient based optimization.
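
For example, the following sketch computes the gradient that a gradient based optimizer would consume. It is a minimal sketch, assuming Variable exposes elementwise multiplication via *, backed by the Mult op listed below.

lamp.Scope.root { implicit scope =>
  // f(x) = sum(x * x); its derivative with respect to x is 2 * x
  val x = lamp.autograd.param(STen.ones(List(3, 3), STenOptions.d))
  val f = (x * x).sum
  f.backprop()
  // x.partialDerivative now holds df/dx, numerically 2 * x;
  // a gradient based optimizer would use this tensor to update x.value
  assert(x.partialDerivative.isDefined)
}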

See also:

https://arxiv.org/pdf/1811.05031.pdf for a review of the algorithm

lamp.autograd.Op for how to implement a new operation

Type members

Classlikes

case class Add(scope: Scope, a: Variable, b: Variable) extends Op
case class ArcTan(scope: Scope, a: Variable) extends Op
case class ArgMax(scope: Scope, a: Variable, dim: Long, keepDim: Boolean) extends Op
case class Assign(scope: Scope, abandon: Variable, keep: Variable) extends Op
object Autograd
case class AvgPool2D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long) extends Op

2D avg pooling

Value parameters:
input

batch x in_channels x h x w

case class BatchNorm(scope: Scope, input: Variable, weight: Variable, bias: Variable, runningMean: STen, runningVar: STen, training: Boolean, momentum: Double, eps: Double) extends Op
case class BatchNorm2D(scope: Scope, input: Variable, weight: Variable, bias: Variable, runningMean: STen, runningVar: STen, training: Boolean, momentum: Double, eps: Double) extends Op

Batch Norm 2D. The 0th dimension holds samples, the 1st dimension holds features; everything else is averaged out.

case class BatchedMatMul(scope: Scope, a: Variable, b: Variable) extends Op
case class BinaryCrossEntropyWithLogitsLoss(scope: Scope, input: Variable, target: STen, posWeights: Option[STen], reduction: Reduction) extends Op

input: (N,T) where T>=1 are multiple independent tasks. target: same shape as input, floats in [0,1]. posWeights: shape (T).
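
A minimal, hedged sketch constructing this op directly, following the shapes above (the forward pass runs in the Op constructor and the result is its value member, per the Op documentation below). That the Mean reduction yields a scalar loss is an assumption here.

lamp.Scope.root { implicit scope =>
  // hypothetical shapes: N = 8 samples, T = 3 independent tasks
  val input = lamp.autograd.param(STen.ones(List(8, 3), STenOptions.d)) // logits, (N,T)
  val target = STen.ones(List(8, 3), STenOptions.d)                     // same shape, values in [0,1]
  val loss = BinaryCrossEntropyWithLogitsLoss(scope, input, target, posWeights = None, reduction = Mean).value
  // assuming the Mean reduction produces a scalar, backprop can be called on it
  loss.backprop()
}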

case class CappedShiftedNegativeExponential(scope: Scope, a: Variable, shift: Double) extends Op
case class CastToPrecision(scope: Scope, a: Variable, precision: FloatingPointPrecision) extends Op
case class Cholesky(scope: Scope, input: Variable) extends Op
case class CholeskySolve(scope: Scope, b: Variable, factor: Variable, upper: Boolean) extends Op
case class Concatenate(scope: Scope, a: Seq[Variable], dim: Long) extends Op
case class ConstAdd(scope: Scope, a: Variable, b: Double) extends Op
case class ConstMult(scope: Scope, a: Variable, b: Double) extends Op
sealed trait Constant extends Variable

A variable whose parent is empty

Companion:
object
object Constant
Companion:
class
case class ConstantWithGrad(value: STen, pd: STen) extends Constant

A variable whose parent is empty but whose partial derivative is defined

case class ConstantWithoutGrad(value: STen) extends Constant

A variable whose parent and partial derivatives are empty

case class Convolution(scope: Scope, input: Variable, weight: Variable, bias: Variable, stride: Array[Long], padding: Array[Long], dilation: Array[Long], transposed: Boolean, outputPadding: Array[Long], groups: Long) extends Op

1D/2D/3D convolution

Value parameters:
bias

out_channels

input

batch x in_channels x height x width

weight

out_channels x in_channels x kernel_size x kernel_size

Returns:

Variable with Tensor of size batch x out_channels x L' (length depends on stride/padding/dilation)
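
A hedged sketch of a 2D convolution with the shapes documented above; that stride, padding, dilation and outputPadding take one array element per spatial dimension is an assumption here.

lamp.Scope.root { implicit scope =>
  // hypothetical shapes: batch = 2, in_channels = 3, 8 x 8 input; out_channels = 4, 3 x 3 kernel
  val input = lamp.autograd.param(STen.ones(List(2, 3, 8, 8), STenOptions.d))
  val weight = lamp.autograd.param(STen.ones(List(4, 3, 3, 3), STenOptions.d))
  val bias = lamp.autograd.param(STen.ones(List(4), STenOptions.d))
  val out = Convolution(
    scope,
    input,
    weight,
    bias,
    stride = Array(1L, 1L),
    padding = Array(0L, 0L),
    dilation = Array(1L, 1L),
    transposed = false,
    outputPadding = Array(0L, 0L),
    groups = 1L
  ).value
  // with these settings the output value should be batch x out_channels x 6 x 6
}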

case class Cos(scope: Scope, a: Variable) extends Op
case class Cross(scope: Scope, a: Variable, b: Variable, dim: Int) extends Op
case class Debug(scope: Scope, a: Variable, callback: (STen, Boolean, Boolean) => Unit) extends Op
case class Diag(scope: Scope, a: Variable, diagonal: Long) extends Op
case class Div(scope: Scope, a: Variable, b: Variable) extends Op
case class Dropout(scope: Scope, a: Variable, prob: Double, train: Boolean) extends Op
case class ElementWiseMaximum(scope: Scope, a: Variable, b: Variable) extends Op
case class ElementWiseMinimum(scope: Scope, a: Variable, b: Variable) extends Op
case class Embedding(scope: Scope, input: Variable, weight: Variable) extends Op
case class EqWhere(scope: Scope, a: Variable, b: Long) extends Op
case class EuclideanDistance(scope: Scope, a: Variable, b: Variable, dim: Int) extends Op
case class Exp(scope: Scope, a: Variable) extends Op
case class Expand(scope: Scope, a: Variable, shape: List[Long]) extends Op
case class ExpandAs(scope: Scope, a: Variable, as: STen) extends Op
case class Flatten(scope: Scope, input: Variable, startDim: Int, endDim: Int) extends Op
case class Gelu(scope: Scope, a: Variable) extends Op
case class GraphMemoryAllocationReport(parameterTensorCount: Long, parameterTensorStorage: Long, constantTensorCount: Long, constantTensorStorage: Long, intermediateTensorCount: Long, intermediateTensorStorage: Long)
case class HardSwish(scope: Scope, a: Variable) extends Op
case class IndexAdd(scope: Scope, src: Variable, index: Variable, dim: Int, maxIndex: Long) extends Op
case class IndexAddToTarget(scope: Scope, target: Variable, src: Variable, index: Variable, dim: Int) extends Op
case class IndexFill(scope: Scope, input: Variable, dim: Long, index: Variable, fill: Double) extends Op
case class IndexSelect(scope: Scope, input: Variable, dim: Long, index: Variable) extends Op
case class Inv(scope: Scope, a: Variable) extends Op
case class L1Loss(scope: Scope, input: Variable, target: STen, reduction: Reduction) extends Op
case class LayerNormOp(scope: Scope, input: Variable, weight: Variable, bias: Variable, normalizedShape: List[Long], eps: Double) extends Op
case class LeakyRelu(scope: Scope, a: Variable, slope: Double) extends Op
case class Log(scope: Scope, a: Variable) extends Op
case class Log1p(scope: Scope, a: Variable) extends Op
case class LogDet(scope: Scope, a: Variable) extends Op
case class LogSoftMax(scope: Scope, a: Variable, dim: Int) extends Op
case class MaskFill(scope: Scope, input: Variable, mask: Variable, fill: Double) extends Op
case class MaskSelect(scope: Scope, input: Variable, mask: Variable) extends Op
case class MatMul(scope: Scope, a: Variable, b: Variable) extends Op
case class MaxPool1D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long, dilation: Long) extends Op

1D max pooling

Value parameters:
input

batch x in_channels x L

case class MaxPool2D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long, dilation: Long) extends Op

2D max pooling

Value parameters:
input

batch x in_channels x h x w
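
A hedged sketch, constructing the op directly on an input of the shape above; the usual pooling output-size arithmetic is an assumption here.

lamp.Scope.root { implicit scope =>
  // hypothetical input: batch = 2, in_channels = 3, 8 x 8 spatial extent
  val input = lamp.autograd.param(STen.ones(List(2, 3, 8, 8), STenOptions.d))
  val pooled = MaxPool2D(scope, input, kernelSize = 2, stride = 2, padding = 0, dilation = 1).value
  // with these settings the pooled value should be batch x in_channels x 4 x 4
}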

case class Mean(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op
Companion:
object
case object Mean extends Reduction
Companion:
class
case class Minus(scope: Scope, a: Variable, b: Variable) extends Op
case class MseLoss(scope: Scope, input: Variable, target: STen, reduction: Reduction) extends Op
case class Mult(scope: Scope, a: Variable, b: Variable) extends Op
case class NllLoss(scope: Scope, input: Variable, target: STen, weights: STen, reduction: Reduction, ignore: Long) extends Op
case object NoReduction extends Reduction
case class Norm2(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op
case class OneHot(scope: Scope, a: Variable, numClasses: Int) extends Op
trait Op

Represents an operation in the computational graph

===Short outline of reverse autograd from scalar values===

y = f1 o f2 o .. o fn

One of these subexpressions (f_i) has value w2 and arguments w1. We can write dy/dw1 = dy/dw2 * dw2/dw1. dw2/dw1 is the Jacobian of f_i at the current value of w1. dy/dw2 is the Jacobian of y with respect to w2 at the current value of w2.

The current values of w1 and w2 are computed in a forward pass. The value dy/dy is 1, and from this dy/dw2 is recursed in the backward pass. The Jacobian function dw2/dw1 is computed symbolically and hard coded.

The anonymous function which Ops must implement is dy/dw2 => dy/dw2 * dw2/dw1. The argument of that function (dy/dw2) comes down from the backward pass. The Op must implement the product dy/dw2 * dw2/dw1.

The shape of dy/dw2 is the shape of the value of the operation (w2). The shape of dy/dw2 * dw2/dw1 is the shape of the parameter variable with respect to which the derivative is taken, i.e. w1, since we are computing dy/dw1.

===How to implement an operation===

// Each concrete realization of the operation corresponds to an instance of an Op.
// The Op instance holds handles to the input variables (here a, b), to be used in the backward pass.
// The forward pass is effectively done in the constructor of the Op.
// The backward pass is triggered and orchestrated by [[lamp.autograd.Variable.backprop]].
case class Mult(scope: Scope, a: Variable, b: Variable) extends Op {

  // List all parameters which support partial derivatives, here both a and b
  val params = List(
    // partial derivative of the first argument
    a.zipBackward { (p, out) =>
      // p is the incoming partial derivative, out is where the result is accumulated into
      // Intermediate tensors are released due to the enclosing Scope.root
      Scope.root { implicit scope => out += (p * b.value).unbroadcast(a.sizes) }
    },
    // partial derivative of the second argument
    b.zipBackward { (p, out) =>
      Scope.root { implicit scope => out += (p * a.value).unbroadcast(b.sizes) }
    }
  )

  // The value of this operation, i.e. the forward pass
  val value = Variable(this, a.value.*(b.value)(scope))(scope)

}
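
A small usage check of the op above, assuming it behaves as sketched: since d(sum(a*b))/da = b, a's partial derivative should numerically equal b's value after backprop.

lamp.Scope.root { implicit scope =>
  val a = lamp.autograd.param(STen.eye(3, STenOptions.d))
  val b = lamp.autograd.param(STen.ones(List(3, 3), STenOptions.d))
  // constructing the Op runs the forward pass; its value member is the resulting Variable
  val c = Mult(scope, a, b).value
  c.sum.backprop()
  // a.partialDerivative should now hold b.value (and b's should hold a.value)
  assert(a.partialDerivative.isDefined)
}
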
case class PInv(scope: Scope, a: Variable, rcond: Double) extends Op
case class Pow(scope: Scope, a: Variable, exponent: Variable) extends Op
case class PowConst(scope: Scope, a: Variable, exponent: Double) extends Op
sealed trait Reduction
case class Relu(scope: Scope, a: Variable) extends Op
case class RepeatInterleave(scope: Scope, self: Variable, repeats: Variable, dim: Int) extends Op
case class Reshape(scope: Scope, a: Variable, shape: Array[Long]) extends Op
case class ScatterAdd(scope: Scope, src: Variable, index: Variable, dim: Int, maxIndex: Long) extends Op
case class Select(scope: Scope, a: Variable, dim: Long, index: Long) extends Op
case class Sigmoid(scope: Scope, a: Variable) extends Op
case class Sin(scope: Scope, a: Variable) extends Op
case class Slice(scope: Scope, a: Variable, dim: Long, start: Long, end: Long, step: Long) extends Op
case class Softplus(scope: Scope, a: Variable, beta: Double, threshold: Double) extends Op
case class SparseFromValueAndIndex(scope: Scope, values: Variable, indices: STen, dim: Seq[Long]) extends Op
case class SquaredFrobeniusMatrixNorm(scope: Scope, a: Variable) extends Op
case class Stack(scope: Scope, a: Seq[Variable], dim: Long) extends Op
case class Sum(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op
Companion:
object
case object Sum extends Reduction
Companion:
class
case class Tan(scope: Scope, a: Variable) extends Op
case class Tanh(scope: Scope, a: Variable) extends Op
case class ToDense(scope: Scope, sparse: Variable) extends Op
case class Transpose(scope: Scope, a: Variable, dim1: Int, dim2: Int) extends Op
object Variable
Companion:
class
sealed trait Variable

A value of a tensor valued function, a vertex in the computational graph.

A Variable may be constant, i.e. it depends on no other Variables. Constant variables may or may not need their partial derivatives computed.

Companion:
object
case class VariableNonConstant(op1: Op, value: STen, pd: STen) extends Variable

A variable with a non-empty parent and a defined partial derivative

Companion:
object
Companion:
class
case class Variance(scope: Scope, a: Variable, dim: List[Int]) extends Op
case class View(scope: Scope, a: Variable, shape: Array[Long]) extends Op
case class WeightNorm(scope: Scope, v: Variable, g: Variable, dim: Long) extends Op
case class Where(scope: Scope, condition: STen, trueBranch: Variable, falseBranch: Variable) extends Op

Value members

Concrete methods

def const(m: STen): Constant
def const(m: Double, tOpt: STenOptions)(implicit scope: Scope): Constant
def param(m: STen)(implicit scope: Scope): ConstantWithGrad
def param(m: Double, tOpt: STenOptions)(implicit scope: Scope): ConstantWithGrad