Interface WordVectors

    • Method Detail

      • setUNK

        void setUNK(String newUNK)
      • hasWord

        boolean hasWord(String word)
        Returns true if the model has this word in the vocab
        Parameters:
        word - the word to test for
        Returns:
        true if the model has the word in the vocab
      • wordsNearest

        Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words,
                                        int top)
      • wordsNearestSum

        Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words,
                                           int top)
      • wordsNearestSum

        Collection<String> wordsNearestSum(String word,
                                           int n)
        Get the top n words most similar to the given word
        Parameters:
        word - the word to compare
        n - the number of words to return
        Returns:
        the top n words
      • wordsNearestSum

        Collection<String> wordsNearestSum(Collection<String> positive,
                                           Collection<String> negative,
                                           int top)
        Words nearest based on positive and negative words
        Parameters:
        positive - the positive words
        negative - the negative words
        top - the top n words
        Returns:
        the words nearest the mean of the words
      • accuracy

        Map<String,Double> accuracy(List<String> questions)
        Accuracy over a list of questions, where each question is a space-separated string whose first word is the query word, whose next 2 words are negative words, and whose last word is the word predicted to be nearest
        Parameters:
        questions - the questions to ask
        Returns:
        the accuracy based on these questions
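        A minimal sketch of how one question string could be split into its parts, assuming exactly four space-separated words as described above. The class QuestionSketch and its parse method are hypothetical helpers for illustration and are not part of the WordVectors API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the WordVectors API.
class QuestionSketch {
    final String query;           // first word of the question
    final List<String> negative;  // second and third words
    final String expected;        // last word, predicted to be nearest

    QuestionSketch(String query, List<String> negative, String expected) {
        this.query = query;
        this.negative = negative;
        this.expected = expected;
    }

    // Splits "query neg1 neg2 expected" on whitespace into its parts.
    static QuestionSketch parse(String question) {
        String[] parts = question.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 words, got " + parts.length);
        }
        return new QuestionSketch(parts[0], Arrays.asList(parts[1], parts[2]), parts[3]);
    }

    public static void main(String[] args) {
        QuestionSketch q = parse("q n1 n2 p");
        System.out.println(q.query + " " + q.negative + " " + q.expected);
    }
}
```

        Rejecting malformed lines early is a design choice of this sketch; how an implementation treats them is not stated here.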
      • indexOf

        int indexOf(String word)
      • similarWordsInVocabTo

        List<String> similarWordsInVocabTo(String word,
                                           double accuracy)
        Find all words in the vocab with similar characters
        Parameters:
        word - the word to compare
        accuracy - the accuracy threshold, from 0 to 1
        Returns:
        the list of words that are similar in the vocab
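        The interface does not say which character-level measure is used. As an illustration only, the sketch below assumes a Dice coefficient over character bigrams as a stand-in and keeps the words whose score is at least the given accuracy. CharSimilaritySketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in measure; the measure actually used by an
// implementation of WordVectors may differ.
class CharSimilaritySketch {
    // Character bigrams of a word, e.g. "night" -> [ni, ig, gh, ht].
    static List<String> bigrams(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < s.length(); i++) {
            out.add(s.substring(i, i + 2));
        }
        return out;
    }

    // Dice coefficient over bigrams: 2 * shared / (count(a) + count(b)), in [0, 1].
    static double similarity(String a, String b) {
        if (a.equals(b)) return 1.0;
        List<String> x = bigrams(a);
        List<String> y = bigrams(b);
        if (x.isEmpty() || y.isEmpty()) return 0.0;
        int total = x.size() + y.size();
        int shared = 0;
        for (String g : x) {
            if (y.remove(g)) shared++;  // remove so each bigram is matched once
        }
        return 2.0 * shared / total;
    }

    // Keeps every vocab word whose similarity to `word` reaches the threshold.
    static List<String> similarWords(List<String> vocab, String word, double accuracy) {
        List<String> out = new ArrayList<>();
        for (String v : vocab) {
            if (similarity(word, v) >= accuracy) out.add(v);
        }
        return out;
    }
}
```

        With this measure, an accuracy of 1 keeps only exact matches, and lower values admit words sharing fewer bigrams.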
      • getWordVector

        double[] getWordVector(String word)
        Get the word vector for a given word
        Parameters:
        word - the word to get the vector for
        Returns:
        the vector for this word as a double array
      • getWordVectorMatrixNormalized

        org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrixNormalized(String word)
        Returns the word vector divided by the norm2 of the array
        Parameters:
        word - the word to get the matrix for
        Returns:
        the looked up matrix
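        Dividing a vector by its norm2 (Euclidean length) can be sketched on a plain double[] as follows. This is not the library code: NormalizeSketch is a hypothetical name, and returning zeros for a zero vector is an assumption of the sketch.

```java
// Hypothetical illustration on plain arrays; not the library implementation.
class NormalizeSketch {
    // Returns v / ||v||_2, leaving the input array unchanged.
    static double[] normalize(double[] v) {
        double sumSq = 0.0;
        for (double x : v) {
            sumSq += x * x;
        }
        double norm = Math.sqrt(sumSq);
        double[] out = new double[v.length];
        if (norm == 0.0) {
            return out;  // assumption: a zero vector maps to zeros
        }
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] unit = normalize(new double[] {3.0, 4.0});
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```

        After normalization the vector has length 1, so a dot product between two normalized vectors equals their cosine similarity.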
      • getWordVectorMatrix

        org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrix(String word)
        Get the word vector for a given word
        Parameters:
        word - the word to get the matrix for
        Returns:
        the ndarray for this word
      • getWordVectors

        org.nd4j.linalg.api.ndarray.INDArray getWordVectors(Collection<String> labels)
        Returns a 2D array in which each row is the vector of the corresponding word/label
        Parameters:
        labels - the words/labels to look up
        Returns:
        the 2D array of word vectors
      • getWordVectorsMean

        org.nd4j.linalg.api.ndarray.INDArray getWordVectorsMean(Collection<String> labels)
        Returns the mean vector built from the words/labels passed in
        Parameters:
        labels - the words/labels to average
        Returns:
        the mean vector
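        A mean vector over several words is an element-wise average of their vectors. The sketch below shows this with a plain Map standing in for the lookup table. MeanVectorSketch is a hypothetical name, and skipping unknown labels and rejecting an empty result are assumptions of the sketch rather than documented behavior.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; a Map stands in for the model's lookup table.
class MeanVectorSketch {
    static double[] mean(Map<String, double[]> table, Collection<String> labels) {
        double[] sum = null;
        int count = 0;
        for (String label : labels) {
            double[] v = table.get(label);
            if (v == null) continue;  // assumption: unknown labels are skipped
            if (sum == null) sum = new double[v.length];
            for (int i = 0; i < v.length; i++) {
                sum[i] += v[i];
            }
            count++;
        }
        if (count == 0) {
            throw new IllegalArgumentException("No known labels to average");
        }
        for (int i = 0; i < sum.length; i++) {
            sum[i] /= count;  // element-wise average
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, double[]> table = new HashMap<>();
        table.put("a", new double[] {1.0, 2.0});
        table.put("b", new double[] {3.0, 4.0});
        System.out.println(Arrays.toString(mean(table, Arrays.asList("a", "b"))));
    }
}
```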
      • wordsNearest

        Collection<String> wordsNearest(Collection<String> positive,
                                        Collection<String> negative,
                                        int top)
        Words nearest based on positive and negative words
        Parameters:
        positive - the positive words
        negative - the negative words
        top - the top n words
        Returns:
        the words nearest the mean of the words
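        One way to read "nearest the mean of the words" is sketched below: average the positive vectors together with the negated negative vectors, then rank the remaining vocabulary by cosine similarity to that average. This is an assumption about the computation, not a statement of what any implementation does. AnalogySketch, the Map-based table, and the toy vectors are hypothetical, and every input word is assumed to be present in the table.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; a Map stands in for the model's lookup table.
class AnalogySketch {
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, na = 0.0, nb = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        if (na == 0.0 || nb == 0.0) return 0.0;  // assumption for zero vectors
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<String> wordsNearest(Map<String, double[]> table,
                                     Collection<String> positive,
                                     Collection<String> negative,
                                     int top) {
        int dim = table.values().iterator().next().length;
        double[] target = new double[dim];
        int count = 0;
        for (String w : positive) {               // add positive vectors
            double[] v = table.get(w);
            for (int i = 0; i < dim; i++) target[i] += v[i];
            count++;
        }
        for (String w : negative) {               // subtract negative vectors
            double[] v = table.get(w);
            for (int i = 0; i < dim; i++) target[i] -= v[i];
            count++;
        }
        for (int i = 0; i < dim; i++) target[i] /= count;  // mean

        final Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, double[]> e : table.entrySet()) {
            String w = e.getKey();
            if (positive.contains(w) || negative.contains(w)) continue;  // skip inputs
            scores.put(w, cosine(target, e.getValue()));
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));  // descending
        return new ArrayList<>(ranked.subList(0, Math.min(top, ranked.size())));
    }
}
```

        Dividing by the count does not change the cosine ranking; it is kept only to mirror the word "mean" in the description.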
      • wordsNearest

        Collection<String> wordsNearest(String word,
                                        int n)
        Get the top n words most similar to the given word
        Parameters:
        word - the word to compare
        n - the number of words to return
        Returns:
        the top n words
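        A top-n lookup of this kind can be sketched with a bounded min-heap, which keeps only the n best candidates while scanning the vocabulary once. The sketch assumes cosine similarity as the ranking measure and excludes the query word itself; TopNSketch and the Map-based table are hypothetical names, not the library implementation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical illustration; a Map stands in for the model's lookup table.
class TopNSketch {
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, na = 0.0, nb = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        if (na == 0.0 || nb == 0.0) return 0.0;  // assumption for zero vectors
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<String> wordsNearest(Map<String, double[]> table, String word, int n) {
        double[] query = table.get(word);
        if (query == null || n <= 0) return new ArrayList<>();

        // Min-heap on similarity: the weakest of the current best n sits on top.
        Comparator<Map.Entry<String, Double>> bySimilarity =
                (x, y) -> Double.compare(x.getValue(), y.getValue());
        PriorityQueue<Map.Entry<String, Double>> heap = new PriorityQueue<>(bySimilarity);

        for (Map.Entry<String, double[]> e : table.entrySet()) {
            if (e.getKey().equals(word)) continue;  // skip the query word itself
            heap.add(new AbstractMap.SimpleEntry<String, Double>(
                    e.getKey(), cosine(query, e.getValue())));
            if (heap.size() > n) heap.poll();       // drop the weakest candidate
        }

        // Drain weakest-first, prepending so the result is most similar first.
        LinkedList<String> out = new LinkedList<>();
        while (!heap.isEmpty()) {
            out.addFirst(heap.poll().getKey());
        }
        return out;
    }
}
```

        The heap keeps memory proportional to n rather than to the vocabulary size, which matters only for large vocabularies; sorting all scores would give the same result.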
      • similarity

        double similarity(String word,
                          String word2)
        Returns the similarity of 2 words
        Parameters:
        word - the first word
        word2 - the second word
        Returns:
        a normalized similarity (cosine similarity)
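        Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal sketch on plain arrays follows; CosineSketch is a hypothetical name, and returning 0 when either vector has zero length is an assumption of the sketch.

```java
// Hypothetical illustration on plain arrays; not the library implementation.
class CosineSketch {
    // cos(a, b) = (a . b) / (||a|| * ||b||), a value in [-1, 1].
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;  // assumption: similarity with a zero vector is 0
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosine(new double[] {1.0, 2.0}, new double[] {2.0, 4.0}));
    }
}
```

        Because the result depends only on direction, vectors that differ by a positive scale factor score as identical.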
      • vocab

        VocabCache vocab()
        Vocab for the vectors
        Returns:
        the vocab cache for these vectors
      • setModelUtils

        void setModelUtils(ModelUtils utils)
        Specifies the ModelUtils to be used to access the model
        Parameters:
        utils - the ModelUtils implementation to use
      • outOfVocabularySupported

        boolean outOfVocabularySupported()
        Whether the implementation can vectorize words absent from the vocabulary
        Returns:
        true if out-of-vocabulary words can be vectorized