Learnable mapping from classes to dense vectors. Equivalent to L * W, where L is the n x C one-hot encoded matrix of the classes, * is matrix multiplication, and W is the C x dim dense matrix. W is learnable. L is never computed directly. C is the number of classes and n is the size of the batch.
Input is a long tensor with values in [0, C-1]. Input shape is arbitrary, (*). Output shape is (* x D), where D is the embedding dimension (the dim of W above).
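The equivalence between the one-hot product and a direct row lookup can be illustrated with a minimal sketch in plain Scala collections. It does not use this library's tensor API; the object `EmbeddingSketch`, its helper names, and the weight values are all hypothetical and chosen only for illustration.

```scala
object EmbeddingSketch {
  type Matrix = Vector[Vector[Double]]

  // One-hot encode a batch of class indices into an n x C matrix L.
  def oneHot(indices: Vector[Int], numClasses: Int): Matrix =
    indices.map(i => Vector.tabulate(numClasses)(c => if (c == i) 1.0 else 0.0))

  // Plain matrix multiplication: (n x C) * (C x dim) = (n x dim).
  def matmul(a: Matrix, b: Matrix): Matrix = {
    val columns = b.transpose
    a.map(row => columns.map(col => row.zip(col).map { case (x, y) => x * y }.sum))
  }

  // Row lookup: picks row i of W for every index i, without ever building L.
  def lookup(indices: Vector[Int], w: Matrix): Matrix =
    indices.map(i => w(i))

  // Hypothetical weights with C = 3 classes and dim = 2.
  val w: Matrix = Vector(Vector(0.1, 0.2), Vector(0.3, 0.4), Vector(0.5, 0.6))

  // Hypothetical batch of n = 3 class indices, each in [0, C-1].
  val batch: Vector[Int] = Vector(2, 0, 2)
}
```

In this sketch, `EmbeddingSketch.matmul(EmbeddingSketch.oneHot(batch, 3), w)` and `EmbeddingSketch.lookup(batch, w)` produce the same n x dim result, which is why the one-hot matrix never needs to be materialized.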
- Companion:
- object
Value members
Inherited methods
Computes the gradient of loss with respect to the parameters.
- Inherited from:
- GenericModule
Returns the total number of optimizable parameters.
- Inherited from:
- GenericModule
Returns the state variables which need gradient computation.
- Inherited from:
- GenericModule