Op

trait Op

Represents an operation in the computational graph

===Short outline of reverse autograd from scalar values===

y = f1 o f2 o .. o fn

One of these subexpressions (f_i) has value w2 and argument w1. By the chain rule, dy/dw1 = dy/dw2 * dw2/dw1. Here dw2/dw1 is the Jacobian of f_i at the current value of w1, and dy/dw2 is the Jacobian of y with respect to w2 at the current value of w2.

The current values of w1 and w2 are computed in the forward pass. The backward pass starts from dy/dy = 1 and recursively propagates dy/dw2 back through the graph. The Jacobian function dw2/dw1 is derived symbolically and hard coded in each operation.

The anonymous function which each Op must implement is dy/dw2 => dy/dw2 * dw2/dw1. Its argument (dy/dw2) comes down from the backward pass, and the Op supplies the product dy/dw2 * dw2/dw1.

The shape of dy/dw2 is the shape of the value of the operation (w2). The shape of dy/dw2 * dw2/dw1 is the shape of the variable with respect to which the derivative is taken, i.e. w1, since we are computing dy/dw1.
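The outline above can be sketched for plain scalars as follows. This is a toy illustration, not lamp's implementation: `Node`, `leaf`, `mult`, `sin` and `backward` are hypothetical names, and gradients are plain `Double`s rather than tensors. Each node stores, for every input, a closure of the form dy/dw2 => dy/dw2 * dw2/dw1.

```scala
// Toy scalar reverse-mode autograd; all names here are hypothetical and not part of lamp.
object ScalarAutogradSketch {

  // A node holds its forward value and, for each input, a closure
  // dy/dw2 => dy/dw2 * dw2/dw1 (the hard-coded local derivative).
  final class Node(val value: Double, val params: List[(Node, Double => Double)]) {
    var grad: Double = 0.0 // accumulates dy/d(this node)
  }

  def leaf(v: Double): Node = new Node(v, Nil)

  // w2 = a * b; dw2/da = b, dw2/db = a
  def mult(a: Node, b: Node): Node =
    new Node(
      a.value * b.value,
      List(
        (a, (p: Double) => p * b.value),
        (b, (p: Double) => p * a.value)
      )
    )

  // w2 = sin(a); dw2/da = cos(a)
  def sin(a: Node): Node =
    new Node(math.sin(a.value), List((a, (p: Double) => p * math.cos(a.value))))

  // Backward pass: seed with dy/dy = 1 and push partial derivatives toward the leaves.
  // This naive recursion follows every path separately, which is correct but
  // inefficient; a practical implementation would visit nodes in topological order.
  def backward(y: Node): Unit = {
    def go(n: Node, incoming: Double): Unit = {
      n.grad += incoming
      n.params.foreach { case (input, f) => go(input, f(incoming)) }
    }
    go(y, 1.0)
  }

  def main(args: Array[String]): Unit = {
    val x = leaf(2.0)
    val w = leaf(3.0)
    val y = sin(mult(x, w)) // forward pass: y = sin(x * w)
    backward(y)
    println(s"dy/dx = ${x.grad}") // analytically w * cos(x * w)
    println(s"dy/dw = ${w.grad}") // analytically x * cos(x * w)
  }
}
```

The forward values are computed when the nodes are constructed, mirroring the remark that the forward pass effectively happens in the constructor of an Op.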

===How to implement an operation===

// Each concrete realization of the operation corresponds to an instance of an Op
// The Op instance holds handles to the input variables (here a, b), to be used in the backward pass
// The forward pass is effectively done in the constructor of the Op
// The backward pass is triggered and orchestrated by [[lamp.autograd.Variable.backward]]
case class Mult(scope: Scope, a: Variable, b: Variable) extends Op {

  // List all parameters which support partial derivatives, here both a and b
  val params = List(
    // partial derivative of the first argument
    a.zipBackward { (p, out) =>
      // p is the incoming partial derivative, out is where the result is accumulated into
      // Intermediate tensors are released due to the enclosing Scope.root
      Scope.root { implicit scope => out += (p * b.value).unbroadcast(a.sizes) }
    },
    // partial derivative of the second argument ..
    b.zipBackward { (p, out) =>
      Scope.root { implicit scope => out += (p * a.value).unbroadcast(b.sizes) }
    }
  )

  // The value of this operation, i.e. the forward pass
  val value = Variable(this, a.value.*(b.value)(scope))(scope)

}

See also: https://en.wikipedia.org/wiki/Automatic_differentiation#Reverse_accumulation

http://www.cs.cmu.edu/~wcohen/10-605/notes/autodiff.pdf

class Object

trait Matchable

class Any

class Add

class ArcTan

class ArgMax

class Assign

class AvgPool2D

class BatchNorm

class BatchNorm2D

class BatchedMatMul

class BinaryCrossEntropyWithLogitsLoss

class CappedShiftedNegativeExponential

class CastToPrecision

class Cholesky

class CholeskySolve

class Concatenate

class ConstAdd

class ConstMult

class Convolution

class Cos

class Cross

class Debug

class Diag

class Div

class Dropout

class ElementWiseMaximum

class ElementWiseMinimum

class Embedding

class EqWhere

class EuclideanDistance

class Exp

class Expand

class ExpandAs

class Flatten

class Gelu

class HardSwish

class IndexAdd

class IndexAddToTarget

class IndexFill

class IndexSelect

class Inv

class L1Loss

class LayerNormOp

class LeakyRelu

class Log

class Log1p

class LogDet

class LogSoftMax

class MaskFill

class MaskSelect

class MatMul

class MaxPool1D

class MaxPool2D

class Mean

class Minus

class MseLoss

class Mult

class NllLoss

class Norm2

class OneHot

class PInv

class Pow

class PowConst

class Relu

class RepeatInterleave

class Reshape

class ScatterAdd

class Select

class Sigmoid

class Sin

class Slice

class Softplus

class SparseFromValueAndIndex

class SquaredFrobeniusMatrixNorm

class Stack

class Sum

class Tan

class Tanh

class ToDense

class Transpose

class Variance

class View

class WeightNorm

class Where

Value members

Abstract fields

Implementation of the backward pass

A list of input variables, each paired with an anonymous function computing the respective partial derivative. In the notation of the documentation of the trait lamp.autograd.Op, this function is dy/dw2 => dy/dw2 * dw2/dw1. The first argument of the anonymous function is the incoming partial derivative (dy/dw2); the second argument is the output tensor into which the result (dy/dw2 * dw2/dw1) is accumulated (added).

If the operation does not support computing the partial derivative for some of its arguments, then do not include that argument in this list.
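As a rough illustration of the two points above (results are accumulated into the output, and non-differentiable arguments are simply left out of the list), here is a self-contained toy sketch. It does not use lamp's types: `Acc`, `Var`, `Backward` and `ConstMultToy` are hypothetical stand-ins, with a mutable `Double` playing the role of the output tensor.

```scala
// Toy illustration only; these types are hypothetical stand-ins, not lamp's API.
object ParamsSketch {

  // Mutable accumulator standing in for the output tensor.
  final class Acc { var value: Double = 0.0 }

  final case class Var(value: Double)

  // (incoming partial derivative dy/dw2, accumulator for dy/dw1) => Unit
  type Backward = (Double, Acc) => Unit

  // w2 = a * c where c is a plain constant. Only `a` supports a partial
  // derivative, so `c` does not appear in params at all.
  final case class ConstMultToy(a: Var, c: Double) {
    val value: Double = a.value * c
    val params: List[(Var, Backward)] = List(
      (a, (p: Double, out: Acc) => out.value += p * c) // dw2/da = c
    )
  }

  def main(args: Array[String]): Unit = {
    val op = ConstMultToy(Var(2.0), 5.0)
    val out = new Acc
    // Invoking the closure twice shows that results are added, not overwritten.
    op.params.foreach { case (_, f) => f(1.0, out); f(1.0, out) }
    println(out.value)
  }
}
```

Accumulating rather than assigning matters when the same variable feeds several operations, since its total derivative is the sum of the contributions from each of them.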

See also: The documentation of the trait lamp.autograd.Op for more details and an example.

The value of this operation