lamp.nn.bert.MaskedLanguageModelModule
Masked Language Model.
Input is a pair (embedding, positions). Embedding has size (batch, num tokens, embedding dim). Positions is a long tensor of size (batch, max num tokens) indicating which positions to make predictions on.
Output has size (batch, len(Positions), vocabulary size).
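The shape contract above can be traced with a small plain-Scala sketch. It does not use lamp's tensor types or API. The names `gatherPositions`, `project` and `mlmShapes` are hypothetical, positions are `Int` rather than long values, and the single linear projection is an assumption made only to reach the vocabulary dimension. The real module may apply further layers.

```scala
// Plain-Scala sketch of the shape contract only; no lamp types are used.
// All names below are hypothetical and chosen for illustration.

type Matrix = Array[Array[Double]] // (rows, cols)

// Select the embedding rows at the requested token positions of one sequence.
// embedding: (numTokens, embeddingDim); positions: indices into numTokens.
def gatherPositions(embedding: Matrix, positions: Array[Int]): Matrix =
  positions.map(p => embedding(p))

// Project each selected row to vocabulary scores (assumed single linear map).
// weights: (embeddingDim, vocabSize).
def project(rows: Matrix, weights: Matrix): Matrix = {
  val vocabSize = weights.head.length
  rows.map { row =>
    Array.tabulate(vocabSize)(v => row.indices.map(d => row(d) * weights(d)(v)).sum)
  }
}

// Apply both steps to every sequence in the batch.
// Result: (batch, number of positions, vocabSize).
def mlmShapes(
    batch: Array[Matrix],
    positions: Array[Array[Int]],
    weights: Matrix
): Array[Matrix] =
  batch.zip(positions).map { case (emb, pos) =>
    project(gatherPositions(emb, pos), weights)
  }
```

With a batch of 2 sequences, 4 tokens each, embedding dim 3, vocabulary size 5 and 2 positions per sequence, `mlmShapes` returns an array of shape (2, 2, 5), matching (batch, len(Positions), vocabulary size).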
Attributes
Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
Members list
The implementation of the function.
In addition to x, it can also use all the state to compute its value.
List of optimizable, or non-optimizable, but stateful parameters
Stateful means that the state is carried over across repeated forward calls.
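To illustrate the difference between optimizable and non-optimizable but stateful entries, here is a minimal sketch in plain Scala. It is not lamp's API: `StateEntry` and `RunningMeanModule` are hypothetical stand-ins, and the running mean is just one assumed example of state that changes on every forward call without being trained.

```scala
// Hypothetical stand-in for a stateful module; this is not lamp's API.
final case class StateEntry(name: String, values: Array[Double], optimizable: Boolean)

final class RunningMeanModule {
  // Optimizable: an optimizer would update this from gradients.
  private val weight = StateEntry("weight", Array(2.0), optimizable = true)
  // Non-optimizable but stateful: updated by forward itself.
  private val runningMean = StateEntry("runningMean", Array(0.0), optimizable = false)
  private var calls = 0

  // The list of state entries, optimizable or not.
  def state: List[StateEntry] = List(weight, runningMean)

  // Uses x and the state; also carries the running mean over to the next call.
  def forward(x: Double): Double = {
    calls += 1
    runningMean.values(0) += (x - runningMean.values(0)) / calls
    weight.values(0) * x
  }
}
```

Calling `forward` twice on the same instance leaves `runningMean` holding the mean of both inputs, which is the sense in which the state is carried over between calls.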
Computes the gradient of loss with respect to the parameters.
Attributes
Inherited from:
GenericModule
Returns the total number of optimizable parameters.
Attributes
Inherited from:
GenericModule
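One plausible reading of this count is the sum of element counts over the optimizable state tensors only. The sketch below assumes that reading and works on shapes alone. `TensorInfo`, `numElements` and `countOptimizable` are hypothetical names, not part of lamp.

```scala
// Hypothetical description of one state tensor: its shape and whether it is optimized.
final case class TensorInfo(shape: List[Long], optimizable: Boolean)

// Number of elements in a tensor of the given shape (an empty shape is a scalar).
def numElements(shape: List[Long]): Long = shape.product

// Sum the element counts of the optimizable tensors, skipping the rest.
def countOptimizable(state: List[TensorInfo]): Long =
  state.filter(_.optimizable).map(t => numElements(t.shape)).sum
```

For example, a (3, 5) weight and a (5) bias that are both optimizable, plus a (5) non-optimizable buffer, give a count of 20.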
Returns the state variables which need gradient computation.
Attributes
Inherited from:
GenericModule