Package

com.johnsnowlabs.nlp

embeddings


package embeddings


Type Members

  1. class AlbertEmbeddings extends AnnotatorModel[AlbertEmbeddings] with WriteTensorflowModel with WriteSentencePieceModel with HasEmbeddingsProperties with HasStorageRef with HasCaseSensitiveProperties


    ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (Google Research, Toyota Technological Institute at Chicago). These embeddings represent the outputs generated by the ALBERT model. All official ALBERT releases by Google on TF-HUB are supported by this ALBERT wrapper:

    TF-HUB Models:

      albert_base    = https://tfhub.dev/google/albert_base/3    | 768-embed-dim,  12-layer, 12-heads, 12M parameters
      albert_large   = https://tfhub.dev/google/albert_large/3   | 1024-embed-dim, 24-layer, 16-heads, 18M parameters
      albert_xlarge  = https://tfhub.dev/google/albert_xlarge/3  | 2048-embed-dim, 24-layer, 32-heads, 60M parameters
      albert_xxlarge = https://tfhub.dev/google/albert_xxlarge/3 | 4096-embed-dim, 12-layer, 64-heads, 235M parameters

    This model requires input tokenization with a SentencePiece model, which is provided by Spark NLP (see the tokenizers package).

    For additional information see:

      https://arxiv.org/pdf/1909.11942.pdf
      https://github.com/google-research/ALBERT
      https://tfhub.dev/s?q=albert

    Tips:

    ALBERT uses repeating layers, which results in a small memory footprint. However, the computational cost remains similar to that of a BERT-like architecture with the same number of hidden layers, because it still has to iterate through the same number of (repeating) layers.
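
    The trade-off in the tip above can be illustrated with a small, self-contained sketch. It is not Spark NLP or ALBERT code: `EncoderSpec`, `LayerSharingSketch`, and the per-layer parameter count are hypothetical names and numbers chosen only for illustration. It shows that sharing one set of layer weights shrinks what must be stored, while the number of layer passes per forward run stays the same.

    ```scala
    // Hypothetical, simplified model of cross-layer weight sharing.
    // The names and the per-layer parameter count are assumptions for illustration only.
    final case class EncoderSpec(numLayers: Int, paramsPerLayer: Long, shareLayers: Boolean) {
      // Parameters that must be kept in memory: a single layer's worth when weights are shared.
      def storedParams: Long =
        if (shareLayers) paramsPerLayer else numLayers.toLong * paramsPerLayer

      // Layer applications per forward pass: sharing weights does not reduce this.
      def layerApplications: Int = numLayers
    }

    object LayerSharingSketch {
      // A BERT-like stack with independent weights in each of its 12 layers.
      val unshared: EncoderSpec =
        EncoderSpec(numLayers = 12, paramsPerLayer = 7000000L, shareLayers = false)

      // The same stack with one set of weights reused by every layer.
      val shared: EncoderSpec = unshared.copy(shareLayers = true)

      def main(args: Array[String]): Unit = {
        println(s"unshared: ${unshared.storedParams} params, ${unshared.layerApplications} layer passes")
        println(s"shared:   ${shared.storedParams} params, ${shared.layerApplications} layer passes")
      }
    }
    ```

    In this sketch the stored parameters drop by a factor equal to the number of layers, but both configurations perform the same number of layer passes, which is why inference time is not expected to improve from sharing alone.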

  2. class BertEmbeddings extends AnnotatorModel[BertEmbeddings] with WriteTensorflowModel with HasEmbeddingsProperties with HasStorageRef with HasCaseSensitiveProperties

  3. class ChunkEmbeddings extends AnnotatorModel[ChunkEmbeddings]

  4. class ElmoEmbeddings extends AnnotatorModel[ElmoEmbeddings] with WriteTensorflowModel with HasEmbeddingsProperties with HasStorageRef with HasCaseSensitiveProperties


    Embeddings from a language model trained on the 1 Billion Word Benchmark.

    Note that this is a very computationally expensive module compared to word embedding modules that only perform embedding lookups. The use of an accelerator is recommended.

  5. trait EmbeddingsCoverage extends AnyRef

  6. trait HasEmbeddingsProperties extends Params

  7. trait ReadAlbertTensorflowModel extends ReadTensorflowModel with ReadSentencePieceModel

  8. trait ReadBertTensorflowModel extends ReadTensorflowModel

  9. trait ReadElmoTensorflowModel extends ReadTensorflowModel

  10. trait ReadUSETensorflowModel extends ReadTensorflowModel

  11. trait ReadXlnetTensorflowModel extends ReadTensorflowModel with ReadSentencePieceModel

  12. trait ReadablePretrainedAlbertModel extends ParamsAndFeaturesReadable[AlbertEmbeddings] with HasPretrained[AlbertEmbeddings]

  13. trait ReadablePretrainedBertModel extends ParamsAndFeaturesReadable[BertEmbeddings] with HasPretrained[BertEmbeddings]

  14. trait ReadablePretrainedElmoModel extends ParamsAndFeaturesReadable[ElmoEmbeddings] with HasPretrained[ElmoEmbeddings]

  15. trait ReadablePretrainedUSEModel extends ParamsAndFeaturesReadable[UniversalSentenceEncoder] with HasPretrained[UniversalSentenceEncoder]

  16. trait ReadablePretrainedWordEmbeddings extends StorageReadable[WordEmbeddingsModel] with HasPretrained[WordEmbeddingsModel]

  17. trait ReadablePretrainedXlnetModel extends ParamsAndFeaturesReadable[XlnetEmbeddings] with HasPretrained[XlnetEmbeddings]

  18. trait ReadsFromBytes extends AnyRef

  19. class SentenceEmbeddings extends AnnotatorModel[SentenceEmbeddings] with HasEmbeddingsProperties with HasStorageRef

  20. class UniversalSentenceEncoder extends AnnotatorModel[UniversalSentenceEncoder] with HasEmbeddingsProperties with HasStorageRef with WriteTensorflowModel

  21. class WordEmbeddings extends AnnotatorApproach[WordEmbeddingsModel] with HasStorage with HasEmbeddingsProperties

  22. class WordEmbeddingsModel extends AnnotatorModel[WordEmbeddingsModel] with HasEmbeddingsProperties with HasStorageModel with ParamsAndFeaturesWritable

  23. class WordEmbeddingsReader extends StorageReader[Array[Float]] with ReadsFromBytes

  24. class WordEmbeddingsWriter extends StorageBatchWriter[Array[Float]] with ReadsFromBytes

  25. class XlnetEmbeddings extends AnnotatorModel[XlnetEmbeddings] with WriteTensorflowModel with WriteSentencePieceModel with HasEmbeddingsProperties with HasStorageRef with HasCaseSensitiveProperties


    XlnetEmbeddings (XLNet): Generalized Autoregressive Pretraining for Language Understanding

    Note that this is a very computationally expensive module compared to word embedding modules that only perform embedding lookups. The use of an accelerator is recommended.

Value Members

  1. object AlbertEmbeddings extends ReadablePretrainedAlbertModel with ReadAlbertTensorflowModel with ReadSentencePieceModel with Serializable

  2. object BertEmbeddings extends ReadablePretrainedBertModel with ReadBertTensorflowModel with Serializable

  3. object ChunkEmbeddings extends DefaultParamsReadable[ChunkEmbeddings] with Serializable

  4. object ElmoEmbeddings extends ReadablePretrainedElmoModel with ReadElmoTensorflowModel with Serializable

  5. object PoolingStrategy

  6. object SentenceEmbeddings extends DefaultParamsReadable[SentenceEmbeddings] with Serializable

  7. object UniversalSentenceEncoder extends ReadablePretrainedUSEModel with ReadUSETensorflowModel with Serializable

  8. object WordEmbeddings extends DefaultParamsReadable[WordEmbeddings] with Serializable

  9. object WordEmbeddingsBinaryIndexer

  10. object WordEmbeddingsModel extends ReadablePretrainedWordEmbeddings with EmbeddingsCoverage with Serializable

  11. object WordEmbeddingsTextIndexer

  12. object XlnetEmbeddings extends ReadablePretrainedXlnetModel with ReadXlnetTensorflowModel with ReadSentencePieceModel with Serializable
