Class BasicModelUtils<T extends SequenceElement>

    • Constructor Detail

      • BasicModelUtils

        public BasicModelUtils()
    • Method Detail

      • init

        public void init(@NonNull WeightLookupTable<T> lookupTable)
        Description copied from interface: ModelUtils
        Implementations of this method should accept the given lookup table and use it in subsequent calls to the interface methods.
        Specified by:
        init in interface ModelUtils<T extends SequenceElement>
      • similarity

        public double similarity(@NonNull String label1,
                                 @NonNull String label2)
        Returns the similarity of two words. The result lies in the range [-1, 1], where -1.0 means the two word vectors are exact opposites (no similarity) and 1.0 means they match completely. In practice most values fall in [0, 1], although this depends on the training corpus. Returns NaN if either label does not exist in the vocabulary or is null.
        Specified by:
        similarity in interface ModelUtils<T extends SequenceElement>
        Parameters:
        label1 - the first word
        label2 - the second word
        Returns:
        a normalized similarity (cosine similarity)
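        The cosine similarity described above can be illustrated with a minimal, library-free sketch. The class SimilaritySketch and its Map-based table are hypothetical stand-ins for the real lookup table, not part of this API; only the [-1, 1] range and the NaN behaviour are taken from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a lookup table: label -> word vector.
final class SimilaritySketch {
    private final Map<String, double[]> vectors = new HashMap<>();

    void put(String label, double[] vector) {
        vectors.put(label, vector);
    }

    // Cosine similarity of two labels; NaN if a label is null or unknown.
    double similarity(String label1, String label2) {
        if (label1 == null || label2 == null) {
            return Double.NaN;
        }
        double[] a = vectors.get(label1);
        double[] b = vectors.get(label2);
        if (a == null || b == null) {
            return Double.NaN;
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        SimilaritySketch s = new SimilaritySketch();
        s.put("king", new double[] {1.0, 2.0});
        s.put("queen", new double[] {2.0, 4.0});    // same direction as "king"
        s.put("anti", new double[] {-1.0, -2.0});   // opposite direction
        System.out.println(s.similarity("king", "queen"));   // close to 1.0
        System.out.println(s.similarity("king", "anti"));    // close to -1.0
        System.out.println(s.similarity("king", "missing")); // NaN
    }
}
```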
      • wordsNearest

        public Collection<String> wordsNearest​(String label,
                                               int n)
        Description copied from interface: ModelUtils
        Implementations of this method should return the labels of the N elements nearest to the given element's label.
        Specified by:
        wordsNearest in interface ModelUtils<T extends SequenceElement>
        Parameters:
        label - label to return nearest elements for
        n - number of nearest words to return
        Returns:
        the labels of the n nearest elements
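        A nearest-label lookup of this kind can be sketched as a ranking over all vocabulary vectors. The sketch below assumes cosine similarity as the distance measure and assumes the query label itself is excluded from the result; NearestSketch and its Map-based table are hypothetical and do not reflect the actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class NearestSketch {
    // Plain cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns up to n labels ranked by descending similarity to the query label.
    // Assumption: the query label is not part of its own neighbour list.
    static List<String> wordsNearest(Map<String, double[]> table, String label, int n) {
        double[] query = table.get(label);
        List<String> candidates = new ArrayList<>();
        if (query == null) {
            return candidates; // unknown label: nothing to rank
        }
        for (String other : table.keySet()) {
            if (!other.equals(label)) {
                candidates.add(other);
            }
        }
        // Sort by negated similarity so the most similar label comes first.
        candidates.sort(Comparator.comparingDouble((String w) -> -cosine(query, table.get(w))));
        int limit = Math.max(0, Math.min(n, candidates.size()));
        return new ArrayList<>(candidates.subList(0, limit));
    }
}
```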
      • accuracy

        public Map<String,​Double> accuracy​(List<String> questions)
        Computes accuracy from a list of questions. Each question is a space-separated string in which the first word is the query word, the next two words are negative words, and the last word is the word predicted to be nearest.
        Specified by:
        accuracy in interface ModelUtils<T extends SequenceElement>
        Parameters:
        questions - the questions to ask
        Returns:
        the accuracy based on these questions
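        The question format described above can be made concrete with a small parsing sketch. Everything here is hypothetical: the Question holder, the predictor function, and the single fraction returned by fractionCorrect are illustration only, and the keys of the Map<String,Double> returned by the real method are not specified on this page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class AccuracySketch {
    // One parsed question: query word, two negative words, expected nearest word.
    static final class Question {
        final String query;
        final List<String> negatives;
        final String expected;

        Question(String query, List<String> negatives, String expected) {
            this.query = query;
            this.negatives = negatives;
            this.expected = expected;
        }
    }

    // Splits "query neg1 neg2 expected" on whitespace.
    static Question parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 words: " + line);
        }
        return new Question(parts[0], Arrays.asList(parts[1], parts[2]), parts[3]);
    }

    // Fraction of questions for which the (hypothetical) predictor returns the expected word.
    static double fractionCorrect(List<String> questions, Function<Question, String> predictor) {
        if (questions.isEmpty()) {
            return 0.0;
        }
        int right = 0;
        for (String line : questions) {
            Question q = parse(line);
            if (q.expected.equals(predictor.apply(q))) {
                right++;
            }
        }
        return (double) right / questions.size();
    }
}
```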
      • similarWordsInVocabTo

        public List<String> similarWordsInVocabTo​(String word,
                                                  double accuracy)
        Finds all words in the vocabulary whose characters are similar to those of the given word.
        Specified by:
        similarWordsInVocabTo in interface ModelUtils<T extends SequenceElement>
        Parameters:
        word - the word to compare
        accuracy - the accuracy: 0 to 1
        Returns:
        the list of words that are similar in the vocab
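        This page does not say which character-level measure is used. As an illustration only, the sketch below assumes a normalized Levenshtein similarity (1.0 for identical strings, 0.0 for completely different ones) and treats the accuracy argument as a minimum threshold; CharSimilaritySketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class CharSimilaritySketch {
    // Classic Levenshtein edit distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    // Assumed measure: 1 - distance / length of the longer string.
    static double charSimilarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) editDistance(a, b) / max;
    }

    // Keeps every vocabulary word whose similarity to 'word' is at least 'accuracy'.
    static List<String> similarWords(Collection<String> vocab, String word, double accuracy) {
        List<String> out = new ArrayList<>();
        for (String candidate : vocab) {
            if (charSimilarity(word, candidate) >= accuracy) {
                out.add(candidate);
            }
        }
        return out;
    }
}
```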
      • wordsNearest

        public Collection<String> wordsNearest(@NonNull Collection<String> positive,
                                               @NonNull Collection<String> negative,
                                               int top)
                                               int top)
        Description copied from interface: ModelUtils
        Words nearest based on positive and negative words
        Specified by:
        wordsNearest in interface ModelUtils<T extends SequenceElement>
        Parameters:
        positive - the positive words
        negative - the negative words
        top - the top n words
        Returns:
        the words nearest the mean of the words
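        One plausible reading of "the mean of the words" is a query vector built by adding the positive vectors, subtracting the negative ones, and dividing by the number of vectors used. The sketch below shows only that step and is an assumption, not the confirmed implementation; MeanQuerySketch, its Map-based table, and the dim parameter are hypothetical. The resulting vector would then be compared against every vocabulary vector to pick the top matches.

```java
import java.util.Collection;
import java.util.Map;

final class MeanQuerySketch {
    // Builds the mean of +positive and -negative vectors; unknown labels are skipped.
    static double[] meanVector(Map<String, double[]> table,
                               Collection<String> positive,
                               Collection<String> negative,
                               int dim) {
        double[] sum = new double[dim];
        int count = 0;
        for (String w : positive) {
            double[] v = table.get(w);
            if (v == null) {
                continue;
            }
            for (int i = 0; i < dim; i++) {
                sum[i] += v[i];
            }
            count++;
        }
        for (String w : negative) {
            double[] v = table.get(w);
            if (v == null) {
                continue;
            }
            for (int i = 0; i < dim; i++) {
                sum[i] -= v[i]; // negative words pull the query away from their direction
            }
            count++;
        }
        if (count > 0) {
            for (int i = 0; i < dim; i++) {
                sum[i] /= count;
            }
        }
        return sum;
    }
}
```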
      • adjustRank

        protected org.nd4j.linalg.api.ndarray.INDArray adjustRank​(org.nd4j.linalg.api.ndarray.INDArray words)
      • wordsNearest

        public Collection<String> wordsNearest​(org.nd4j.linalg.api.ndarray.INDArray words,
                                               int top)
        Words nearest based on positive and negative words
        Specified by:
        wordsNearest in interface ModelUtils<T extends SequenceElement>
        Parameters:
        top - the top n words
        Returns:
        the words nearest the mean of the words
      • wordsNearestSum

        public Collection<String> wordsNearestSum​(org.nd4j.linalg.api.ndarray.INDArray words,
                                                  int top)
        Words nearest based on positive and negative words
        Specified by:
        wordsNearestSum in interface ModelUtils<T extends SequenceElement>
        Parameters:
        top - the top n words
        Returns:
        the words nearest the mean of the words