case class MultiheadAttention(wQ: Constant, wK: Constant, wV: Constant, wO: Constant, dropout: Double, train: Boolean, numHeads: Int, padToken: Long, linearized: Boolean) extends GenericModule[(Variable, Variable, Variable, STen), Variable] with Product with Serializable
Multi-head scaled dot product attention module
Input: (query, key, value, tokens) where
- query: batch x num queries x query dim
- key: batch x num k-v x key dim
- value: batch x num k-v x value dim
- tokens: batch x num queries, long type

tokens is used to carry over padding information so that attention to padded positions is ignored.
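A minimal usage sketch (not generated from the source): it constructs the module directly from its case class constructor and runs one forward pass. The import paths, the STen/STenOptions factory calls, and the square dModel x dModel weight shapes are assumptions and should be checked against the companion factory before use.

    // Hedged sketch: factory signatures and weight shapes are assumed, not documented here.
    import lamp._
    import lamp.autograd.{Variable, param}
    import lamp.nn.MultiheadAttention

    Scope.root { implicit scope =>
      val dModel = 16L
      val (batch, numQueries, numKV) = (2L, 5L, 7L)

      // assumption: each projection weight is dModel x dModel
      def weight() = param(STen.rand(List(dModel, dModel), STenOptions.d))

      val attention = MultiheadAttention(
        wQ = weight(), wK = weight(), wV = weight(), wO = weight(),
        dropout = 0.1,
        train = true,
        numHeads = 4,        // dModel must be divisible by numHeads
        padToken = -1L,      // token value that marks padded positions
        linearized = false   // false: exact softmax attention
      )

      // query: batch x num queries x query dim
      val query  = param(STen.rand(List(batch, numQueries, dModel), STenOptions.d))
      // key, value: batch x num k-v x key/value dim
      val key    = param(STen.rand(List(batch, numKV, dModel), STenOptions.d))
      val value  = param(STen.rand(List(batch, numKV, dModel), STenOptions.d))
      // tokens: batch x num queries, long type; entries equal to padToken are ignored
      val tokens = STen.zeros(List(batch, numQueries), STenOptions.l)

      val output: Variable = attention.forward((query, key, value, tokens))
      println(output.shape) // batch x num queries x output dim
    }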
Linear Supertypes: Serializable, Product, Equals, GenericModule[(Variable, Variable, Variable, STen), Variable], AnyRef, Any
Instance Constructors
- new MultiheadAttention(wQ: Constant, wK: Constant, wV: Constant, wO: Constant, dropout: Double, train: Boolean, numHeads: Int, padToken: Long, linearized: Boolean)
Value Members
- def apply[S](a: (Variable, Variable, Variable, STen))(implicit arg0: Sc[S]): Variable
  Alias of forward.
  - Definition Classes: GenericModule
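Since apply delegates to forward, both call forms below run the same code path; this is a continuation of the sketch above (same Scope, same values).

    // continuation of the sketch above, inside the same Scope
    val input      = (query, key, value, tokens)
    val viaForward = attention.forward(input)
    val viaApply   = attention(input) // same code path; outputs can still differ
                                      // between calls because dropout is active in train mode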
- val dropout: Double
- def forward[S](x: (Variable, Variable, Variable, STen))(implicit arg0: Sc[S]): Variable
  The implementation of the function. In addition to x, it can also use all the state to compute its value.
  - Definition Classes: MultiheadAttention → GenericModule
- final def gradients(loss: Variable, zeroGrad: Boolean = true): Seq[Option[STen]]
  Computes the gradient of loss with respect to the parameters.
  - Definition Classes: GenericModule
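Continuing the same sketch, a hedged example of gradient computation: Variable.sum is assumed to be available for reducing the output to a scalar loss, and the returned sequence is assumed to line up with parameters.

    // continuation of the sketch above, inside the same Scope
    val loss = output.sum                       // assumption: any scalar-valued Variable works
    val grads: Seq[Option[STen]] = attention.gradients(loss, zeroGrad = true)
    grads.zip(attention.parameters).foreach { case (grad, (_, tag)) =>
      println(s"$tag -> ${grad.map(_.shape)}")  // assumed to be in parameters order
    }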
- final def learnableParameters: Long
  Returns the total number of optimizable parameters.
  - Definition Classes: GenericModule
- val linearized: Boolean
- val numHeads: Int
- val padToken: Long
- final def parameters: Seq[(Constant, PTag)]
  Returns the state variables which need gradient computation.
  - Definition Classes: GenericModule
- val state: List[(Constant, LeafTag with Product with Serializable)]
  List of optimizable, or non-optimizable but stateful, parameters. Stateful means that the state is carried over across repeated forward calls.
  - Definition Classes: MultiheadAttention → GenericModule
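Continuing the same sketch: for this module, state should pair the four projection weights (wQ, wK, wV, wO) with their tags, parameters narrows that list to the entries needing gradients, and learnableParameters counts their scalar elements.

    // continuation of the sketch above
    attention.state.foreach { case (w, tag) =>
      println(s"$tag: ${w.shape}")              // the four projection weights with their tags
    }
    val trainable = attention.parameters        // Seq[(Constant, PTag)]
    println(attention.learnableParameters)      // total number of optimizable scalars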
- val train: Boolean
- val wK: Constant
- val wO: Constant
- val wQ: Constant
- val wV: Constant
- final def zeroGrad(): Unit
  - Definition Classes: GenericModule
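Continuing the same sketch (reusing loss from the gradients example): gradients(loss, zeroGrad = true) clears old gradients before backpropagating; with zeroGrad = false the new gradients are accumulated instead, and zeroGrad() resets the buffers manually.

    // continuation of the sketch above
    val accumulated = attention.gradients(loss, zeroGrad = false) // adds to existing gradients
    attention.zeroGrad()                                          // clear before the next step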