class TensorflowElmo extends Serializable
Embeddings from a language model trained on the 1 Billion Word Benchmark.
Note that this is a very computationally expensive module compared to word embedding modules that only perform embedding lookups. The use of an accelerator is recommended.
The module exposes four output layers:
- word_emb: the character-based word representations with shape [batch_size, max_length, 512]
- lstm_outputs1: the first LSTM hidden state with shape [batch_size, max_length, 1024]
- lstm_outputs2: the second LSTM hidden state with shape [batch_size, max_length, 1024]
- elmo: the weighted sum of the 3 layers, where the weights are trainable; this tensor has shape [batch_size, max_length, 1024]

See https://github.com/JohnSnowLabs/spark-nlp/blob/master/src/test/scala/com/johnsnowlabs/nlp/embeddings/ElmoEmbeddingsTestSpec.scala for further reference on how to use this API.
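In practice this class is driven by the ElmoEmbeddings annotator rather than used directly. A minimal pipeline sketch, assuming Spark NLP is on the classpath and a SparkSession is active (column names here are illustrative):

```scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotator.{ElmoEmbeddings, Tokenizer}
import org.apache.spark.ml.Pipeline

// Wrap raw text into a DOCUMENT annotation
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Tokenize the document
val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

// Download a pretrained ELMo model; "elmo" selects the trainable
// weighted sum of the 3 layers (1024 dimensions)
val embeddings = ElmoEmbeddings.pretrained()
  .setInputCols("document", "token")
  .setOutputCol("embeddings")
  .setPoolingLayer("elmo")

val pipeline = new Pipeline()
  .setStages(Array(documentAssembler, tokenizer, embeddings))
```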
Linear Supertypes: Serializable, Serializable, AnyRef, Any
Instance Constructors
- new TensorflowElmo(tensorflow: TensorflowWrapper, batchSize: Int, configProtoBytes: Option[Array[Byte]] = None)
  - tensorflow: ELMo model wrapper with a TensorFlow wrapper
  - batchSize: size of each batch
  - configProtoBytes: configuration for the TensorFlow session

Sources:
- https://tfhub.dev/google/elmo/3
- https://arxiv.org/abs/1802.05365
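For illustration only, a hedged construction sketch; the TensorflowWrapper (called `wrapper` below) is assumed to have been loaded elsewhere from a saved ELMo model:

```scala
// Hypothetical sketch: `wrapper` is assumed to be a TensorflowWrapper
// already loaded from a saved ELMo model bundle
val elmoModel = new TensorflowElmo(
  tensorflow = wrapper,
  batchSize = 32,          // sentences per TensorFlow session call
  configProtoBytes = None  // default session configuration
)
```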
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def getDimensions: (String) ⇒ Int
  Returns the embedding dimension for the chosen output layer:
  - word_emb: the character-based word representations with shape [batch_size, max_length, 512] ⇒ 512
  - lstm_outputs1: the first LSTM hidden state with shape [batch_size, max_length, 1024] ⇒ 1024
  - lstm_outputs2: the second LSTM hidden state with shape [batch_size, max_length, 1024] ⇒ 1024
  - elmo: the weighted sum of the 3 layers, where the weights are trainable; shape [batch_size, max_length, 1024] ⇒ 1024
  - returns: The dimension of the chosen layer
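A small sketch of the mapping, assuming an initialized instance named `elmoModel` (a hypothetical name):

```scala
// getDimensions returns a String => Int function
val dimOf: String => Int = elmoModel.getDimensions

dimOf("word_emb")      // 512: character-based word representations
dimOf("lstm_outputs1") // 1024: first LSTM hidden state
dimOf("elmo")          // 1024: trainable weighted sum of the 3 layers
```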
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- def predict(sentences: Seq[TokenizedSentence], poolingLayer: String): Seq[WordpieceEmbeddingsSentence]
  Calculates the embeddings for a sequence of tokens and creates WordpieceEmbeddingsSentence objects from them.
  - sentences: A sequence of tokenized sentences for which embeddings will be calculated
  - poolingLayer: The output layer you want from the model: word_emb, lstm_outputs1, lstm_outputs2, or elmo. See https://tfhub.dev/google/elmo/3 for reference.
  - returns: A Seq of WordpieceEmbeddingsSentence, one element for each input sentence
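A sketch of a direct call, assuming the `elmoModel` instance from the construction example; the TokenizedSentence and IndexedToken constructor arguments shown are assumptions based on Spark NLP's common annotator types:

```scala
import com.johnsnowlabs.nlp.annotators.common.{IndexedToken, TokenizedSentence}

// Assumed construction of a tokenized sentence: each token carries its
// character offsets, and the sentence carries its index
val sentence = TokenizedSentence(
  Array(IndexedToken("Hello", 0, 4), IndexedToken("world", 6, 10)),
  sentenceIndex = 0
)

// One WordpieceEmbeddingsSentence per input sentence, embedded with
// the trainable "elmo" layer (1024 dimensions)
val embedded = elmoModel.predict(Seq(sentence), poolingLayer = "elmo")
```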
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def tag(batch: Seq[TokenizedSentence], embeddingsKey: String, dimension: Int): Seq[Array[Array[Float]]]
  Tags a sequence of TokenizedSentences, retrieving the embeddings that correspond to the given key.
  - batch: The tokens for which embeddings are calculated
  - embeddingsKey: Specification of the output embedding for ELMo (word_emb, lstm_outputs1, lstm_outputs2, or elmo)
  - dimension: ELMo's embedding dimension: either 512 or 1024
  - returns: The embedding vectors: one outer element per sentence, containing one Array[Float] per word that holds that word's embedding
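A sketch of this lower-level call, reusing the assumed `elmoModel` and `sentence` from the predict example; embeddingsKey and dimension must agree as listed under getDimensions:

```scala
// One entry per sentence; for each sentence, one Array[Float] per word
val vectors: Seq[Array[Array[Float]]] =
  elmoModel.tag(Seq(sentence), embeddingsKey = "lstm_outputs1", dimension = 1024)

val firstWordVector: Array[Float] = vectors.head.head  // 1024 floats
```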
- val tensorflow: TensorflowWrapper
- def toString(): String
  - Definition Classes: AnyRef → Any
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()