Class FastText

    • Constructor Detail

      • FastText

        public FastText(File modelPath)
      • FastText

        public FastText()
    • Method Detail

      • fit

        public void fit()
      • loadIterator

        public void loadIterator()
      • loadPretrainedVectors

        public void loadPretrainedVectors(File vectorsFile)
      • loadBinaryModel

        public void loadBinaryModel(String modelPath)
      • unloadBinaryModel

        public void unloadBinaryModel()
      • test

        public void test(File testFile)
      • predictProbability

        public org.nd4j.common.primitives.Pair<String,Float> predictProbability(String text)
      • vocabSize

        public long vocabSize()
        Specified by:
        vocabSize in interface org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
      • getWordVector

        public double[] getWordVector(String word)
        Description copied from interface: WordVectors
        Get the word vector for a given word
        Specified by:
        getWordVector in interface WordVectors
        Parameters:
        word - the word to get the vector for
        Returns:
        the vector for this word, as a double array
      • getWordVectorMatrixNormalized

        public org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrixNormalized(String word)
        Description copied from interface: WordVectors
        Returns the word vector divided by its L2 norm (norm2)
        Specified by:
        getWordVectorMatrixNormalized in interface WordVectors
        Parameters:
        word - the word to get the vector for
        Returns:
        the normalized vector for this word
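        As an illustration of the normalization described above, here is a minimal sketch that divides a vector by its L2 norm. It uses plain double arrays rather than INDArray, and the names NormalizeSketch and l2Normalize are hypothetical, not part of this class. Returning a zero vector unchanged is an assumption of the sketch, not documented library behavior.

```java
import java.util.Arrays;

public class NormalizeSketch {
    /**
     * Returns a copy of v divided by its L2 norm.
     * Assumption of this sketch: a zero vector is returned unchanged.
     */
    public static double[] l2Normalize(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        double[] out = Arrays.copyOf(v, v.length);
        if (norm == 0.0) {
            return out; // avoid division by zero
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= norm;
        }
        return out;
    }

    public static void main(String[] args) {
        // The result has unit length: its squared components sum to 1.
        System.out.println(Arrays.toString(l2Normalize(new double[] {3.0, 4.0})));
    }
}
```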
      • getWordVectorMatrix

        public org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrix(String word)
        Description copied from interface: WordVectors
        Get the word vector for a given word
        Specified by:
        getWordVectorMatrix in interface WordVectors
        Parameters:
        word - the word to get the vector for
        Returns:
        the ndarray for this word
      • getWordVectors

        public org.nd4j.linalg.api.ndarray.INDArray getWordVectors(Collection<String> labels)
        Description copied from interface: WordVectors
        Returns a 2D array in which each row is the vector for the corresponding word/label
        Specified by:
        getWordVectors in interface WordVectors
        Returns:
        a 2D array with one row per word/label
      • getWordVectorsMean

        public org.nd4j.linalg.api.ndarray.INDArray getWordVectorsMean(Collection<String> labels)
        Description copied from interface: WordVectors
        Returns the mean vector computed from the words/labels passed in
        Specified by:
        getWordVectorsMean in interface WordVectors
        Returns:
        the mean vector of the given words/labels
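        The mean vector described above can be illustrated with an element-wise average. This is a minimal sketch over plain double arrays; the names MeanVectorSketch and mean are hypothetical, and how the library treats labels missing from the vocabulary is not covered here.

```java
import java.util.Arrays;
import java.util.List;

public class MeanVectorSketch {
    /** Element-wise mean of a non-empty list of equal-length vectors. */
    public static double[] mean(List<double[]> vectors) {
        if (vectors.isEmpty()) {
            throw new IllegalArgumentException("need at least one vector");
        }
        int dim = vectors.get(0).length;
        double[] result = new double[dim];
        for (double[] v : vectors) {
            if (v.length != dim) {
                throw new IllegalArgumentException("vectors must have the same length");
            }
            for (int i = 0; i < dim; i++) {
                result[i] += v[i];
            }
        }
        for (int i = 0; i < dim; i++) {
            result[i] /= vectors.size();
        }
        return result;
    }

    public static void main(String[] args) {
        double[] m = mean(Arrays.asList(new double[] {1.0, 2.0}, new double[] {3.0, 6.0}));
        System.out.println(Arrays.toString(m));
    }
}
```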
      • hasWord

        public boolean hasWord(String word)
        Description copied from interface: WordVectors
        Returns true if the model has this word in the vocab
        Specified by:
        hasWord in interface WordVectors
        Parameters:
        word - the word to test for
        Returns:
        true if the model has the word in the vocab
      • wordsNearestSum

        public Collection<String> wordsNearestSum(String word,
                                                  int n)
        Description copied from interface: WordVectors
        Get the top n words most similar to the given word
        Specified by:
        wordsNearestSum in interface WordVectors
        Parameters:
        word - the word to compare
        n - the number of words to return
        Returns:
        the top n words
      • wordsNearestSum

        public Collection<String> wordsNearestSum(Collection<String> positive,
                                                  Collection<String> negative,
                                                  int top)
        Description copied from interface: WordVectors
        Words nearest based on positive and negative words
        Specified by:
        wordsNearestSum in interface WordVectors
        Parameters:
        positive - the positive words
        negative - the negative words
        top - the number of words to return
        Returns:
        the words nearest the mean of the words
      • accuracy

        public Map<String,Double> accuracy(List<String> questions)
        Description copied from interface: WordVectors
        Accuracy based on questions, each a space-separated list of strings in which the first word is the query word, the next 2 words are negative, and the last word is the word predicted to be nearest
        Specified by:
        accuracy in interface WordVectors
        Parameters:
        questions - the questions to ask
        Returns:
        the accuracy based on these questions
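        The question format described above (query word, two negative words, expected word) can be illustrated with a small parser. This is only a sketch of the format as stated here; the names AccuracyQuestionSketch, Question and parse are hypothetical and not part of the library.

```java
import java.util.Arrays;
import java.util.List;

public class AccuracyQuestionSketch {
    /** One parsed question: a query word, two negative words, and the expected word. */
    public static final class Question {
        public final String query;
        public final List<String> negatives;
        public final String expected;

        Question(String query, List<String> negatives, String expected) {
            this.query = query;
            this.negatives = negatives;
            this.expected = expected;
        }
    }

    /** Splits a space-separated question into its four parts. */
    public static Question parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 words, got " + parts.length);
        }
        return new Question(parts[0], Arrays.asList(parts[1], parts[2]), parts[3]);
    }

    public static void main(String[] args) {
        Question q = parse("king man woman queen");
        System.out.println(q.query + " " + q.negatives + " " + q.expected);
    }
}
```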
      • similarWordsInVocabTo

        public List<String> similarWordsInVocabTo(String word,
                                                  double accuracy)
        Description copied from interface: WordVectors
        Find all words in the vocab with similar characters
        Specified by:
        similarWordsInVocabTo in interface WordVectors
        Parameters:
        word - the word to compare
        accuracy - the required accuracy, from 0 to 1
        Returns:
        the list of words that are similar in the vocab
      • wordsNearest

        public Collection<String> wordsNearest(Collection<String> positive,
                                               Collection<String> negative,
                                               int top)
        Description copied from interface: WordVectors
        Words nearest based on positive and negative words
        Specified by:
        wordsNearest in interface WordVectors
        Parameters:
        positive - the positive words
        negative - the negative words
        top - the number of words to return
        Returns:
        the words nearest the mean of the words
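        To illustrate a positive/negative nearest-word query, here is a minimal sketch over a toy in-memory vocabulary. It assumes the query vector is the sum of the positive vectors minus the sum of the negative vectors, ranks the remaining words by cosine similarity, and skips unknown words; the library's exact combination rule is not stated here and may differ. NearestSketch, toyVocab and the toy vectors are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NearestSketch {
    /** Cosine similarity; returns 0 if either vector has zero norm. */
    public static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) return 0.0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Toy vectors with dimensions [royal, male, female]; purely illustrative. */
    public static Map<String, double[]> toyVocab() {
        Map<String, double[]> vocab = new LinkedHashMap<>();
        vocab.put("king", new double[] {1, 1, 0});
        vocab.put("queen", new double[] {1, 0, 1});
        vocab.put("man", new double[] {0, 1, 0});
        vocab.put("woman", new double[] {0, 0, 1});
        vocab.put("apple", new double[] {0.2, 0.1, 0.1});
        return vocab;
    }

    /** Assumed rule: query = sum(positive) - sum(negative); rank other words by cosine. */
    public static List<String> wordsNearest(Map<String, double[]> vocab,
                                            Collection<String> positive,
                                            Collection<String> negative,
                                            int top) {
        int dim = vocab.values().iterator().next().length;
        double[] query = new double[dim];
        for (String w : positive) {
            double[] v = vocab.get(w);
            if (v == null) continue; // skip unknown words
            for (int i = 0; i < dim; i++) query[i] += v[i];
        }
        for (String w : negative) {
            double[] v = vocab.get(w);
            if (v == null) continue; // skip unknown words
            for (int i = 0; i < dim; i++) query[i] -= v[i];
        }
        List<String> candidates = new ArrayList<>();
        for (String w : vocab.keySet()) {
            if (!positive.contains(w) && !negative.contains(w)) candidates.add(w);
        }
        // Sort by descending similarity to the query vector.
        candidates.sort((x, y) -> Double.compare(
                cosine(query, vocab.get(y)), cosine(query, vocab.get(x))));
        return new ArrayList<>(candidates.subList(0, Math.min(top, candidates.size())));
    }

    public static void main(String[] args) {
        System.out.println(wordsNearest(toyVocab(),
                List.of("king", "woman"), List.of("man"), 1));
    }
}
```

        With these toy vectors, king + woman - man equals the vector for queen, so queen ranks first.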
      • wordsNearest

        public Collection<String> wordsNearest(String word,
                                               int n)
        Description copied from interface: WordVectors
        Get the top n words most similar to the given word
        Specified by:
        wordsNearest in interface WordVectors
        Parameters:
        word - the word to compare
        n - the number of words to return
        Returns:
        the top n words
      • similarity

        public double similarity(String word,
                                 String word2)
        Description copied from interface: WordVectors
        Returns the similarity of two words
        Specified by:
        similarity in interface WordVectors
        Parameters:
        word - the first word
        word2 - the second word
        Returns:
        a normalized similarity (cosine similarity)
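        For reference, cosine similarity is the dot product of two vectors divided by the product of their L2 norms. The following is a minimal sketch over plain double arrays; CosineSketch and cosine are hypothetical names, and returning 0 for a zero vector is an assumption of the sketch rather than documented behavior of this class.

```java
public class CosineSketch {
    /** Cosine similarity of two equal-length vectors; returns 0 if either has zero norm. */
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // assumption: undefined similarity is reported as 0
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Parallel vectors score close to 1, orthogonal vectors score 0.
        System.out.println(cosine(new double[] {1, 2}, new double[] {2, 4}));
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
    }
}
```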
      • loadWeightsInto

        public void loadWeightsInto(org.nd4j.linalg.api.ndarray.INDArray array)
        Specified by:
        loadWeightsInto in interface org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
      • vectorSize

        public int vectorSize()
        Specified by:
        vectorSize in interface org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
      • jsonSerializable

        public boolean jsonSerializable()
        Specified by:
        jsonSerializable in interface org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
      • getLearningRate

        public double getLearningRate()
      • getDimension

        public int getDimension()
      • getContextWindowSize

        public int getContextWindowSize()
      • getEpoch

        public int getEpoch()
      • getNegativesNumber

        public int getNegativesNumber()
      • getWordNgrams

        public int getWordNgrams()
      • getLossName

        public String getLossName()
      • getModelName

        public String getModelName()
      • getNumberOfBuckets

        public int getNumberOfBuckets()
      • getLabelPrefix

        public String getLabelPrefix()
      • outOfVocabularySupported

        public boolean outOfVocabularySupported()
        Description copied from interface: WordVectors
        Indicates whether the implementation can produce vectors for words absent from the vocabulary
        Specified by:
        outOfVocabularySupported in interface WordVectors
        Returns:
        true if out-of-vocabulary words can be vectorized