Class AbstractCache<T extends SequenceElement>

    • Constructor Detail

      • AbstractCache

        public AbstractCache()
    • Method Detail

      • vocabExists

        public boolean vocabExists()
        Returns true if the vocabulary contains at least one element, false otherwise.
        Specified by:
        vocabExists in interface VocabCache<T extends SequenceElement>
        Returns:
        true if the vocabulary is not empty, false otherwise
      • incrementWordCount

        public void incrementWordCount(String word)
        Increments the frequency of the specified label by 1.
        Specified by:
        incrementWordCount in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to increment the count for
      • incrementWordCount

        public void incrementWordCount(String word,
                                       int increment)
        Increments the frequency of the specified label by the specified value.
        Specified by:
        incrementWordCount in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to increment the count for
        increment - the amount to increment by
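        The two incrementWordCount overloads can be pictured as updates to a map from label to count. The sketch below is a hypothetical stand-in (FrequencySketch and its frequency method are invented for illustration and are not part of the library). How the real class treats labels it has never seen is not stated here; starting them at zero is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the library's implementation.
final class FrequencySketch {
    private final Map<String, Long> counts = new HashMap<>();

    // Mirrors incrementWordCount(String word): increment by 1.
    void incrementWordCount(String word) {
        incrementWordCount(word, 1);
    }

    // Mirrors incrementWordCount(String word, int increment).
    // Assumption: a label that was never seen starts at zero.
    void incrementWordCount(String word, int increment) {
        counts.merge(word, (long) increment, Long::sum);
    }

    // Reads the accumulated count; 0 for labels never seen (assumption).
    long frequency(String word) {
        return counts.getOrDefault(word, 0L);
    }

    public static void main(String[] args) {
        FrequencySketch sketch = new FrequencySketch();
        sketch.incrementWordCount("cat");      // count becomes 1
        sketch.incrementWordCount("cat", 4);   // count becomes 5
        System.out.println(sketch.frequency("cat"));
    }
}
```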
      • wordFrequency

        public int wordFrequency(@NonNull String word)
        Returns the frequency of the SequenceElement with the specified label over the training corpus.
        Specified by:
        wordFrequency in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to retrieve the occurrence frequency for
        Returns:
        the frequency of the word over the training corpus
      • containsWord

        public boolean containsWord(String word)
        Checks whether the specified label exists in the vocabulary.
        Specified by:
        containsWord in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to check for
        Returns:
        true if the label exists in the vocabulary, false otherwise
      • containsElement

        public boolean containsElement(T element)
        Checks whether the specified element exists in the vocabulary.
        Parameters:
        element - the element to check for
        Returns:
        true if the element exists in the vocabulary, false otherwise
      • wordAtIndex

        public String wordAtIndex(int index)
        Returns the label of the element at the specified Huffman index.
        Specified by:
        wordAtIndex in interface VocabCache<T extends SequenceElement>
        Parameters:
        index - the index of the word to get
        Returns:
        the label of the element at the specified index
      • elementAtIndex

        public T elementAtIndex(int index)
        Returns the SequenceElement at the specified index.
        Specified by:
        elementAtIndex in interface VocabCache<T extends SequenceElement>
        Parameters:
        index - the index of the element to get
        Returns:
        the SequenceElement at the specified index
      • indexOf

        public int indexOf(String label)
        Returns the Huffman index for the specified label.
        Specified by:
        indexOf in interface VocabCache<T extends SequenceElement>
        Parameters:
        label - the label to get index for
        Returns:
        >= 0 if the label exists, -1 if the Huffman tree has not been built yet, -2 if the specified label was not found
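        Because indexOf reports two distinct failure conditions through negative values, callers may want to tell them apart instead of testing only for a negative result. The helper below is a minimal sketch of that check; IndexOfResult and describe are hypothetical names introduced here, and only the three documented outcomes (>= 0, -1, -2) come from the description above.

```java
// Hypothetical helper for illustration; not part of the library.
final class IndexOfResult {
    // Translates a value returned by indexOf(String label) into a readable outcome.
    static String describe(int code) {
        if (code >= 0) {
            return "found at Huffman index " + code;
        }
        if (code == -1) {
            return "Huffman tree not built yet";
        }
        if (code == -2) {
            return "label not found";
        }
        // Any other negative value is not documented; flag it instead of guessing.
        return "unexpected code " + code;
    }

    public static void main(String[] args) {
        // With a real cache this would be: describe(cache.indexOf("someLabel"))
        System.out.println(describe(3));
        System.out.println(describe(-1));
        System.out.println(describe(-2));
    }
}
```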
      • setTotalWordOccurences

        public void setTotalWordOccurences(long value)
      • wordFor

        public T wordFor(@NonNull String label)
        Returns the SequenceElement for the specified label.
        Specified by:
        wordFor in interface VocabCache<T extends SequenceElement>
        Parameters:
        label - the label to fetch the element for
        Returns:
        the SequenceElement for the specified label
      • addWordToIndex

        public void addWordToIndex(int index,
                                   String label)
        Inserts the specified label at the specified Huffman tree position. CAUTION: never use this method unless you are 100% sure of what you are doing.
        Specified by:
        addWordToIndex in interface VocabCache<T extends SequenceElement>
        Parameters:
        index - the Huffman tree position to insert the label at
        label - the label to insert
      • putVocabWord

        @Deprecated
        public void putVocabWord(String word)
        Deprecated.
        Description copied from interface: VocabCache
        Inserts the word as a vocab word (it gets the vocab word from the internal token store). Note that the index must be set on the token.
        Specified by:
        putVocabWord in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to add to the vocab
      • docAppearedIn

        public int docAppearedIn(String word)
        Returns the number of documents (if applicable) the label was observed in.
        Specified by:
        docAppearedIn in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to look up the document count for
        Returns:
        the number of documents the word appeared in
      • incrementDocCount

        public void incrementDocCount(String word,
                                      long howMuch)
        Increments the number of documents the label was observed in. Please note: this method is NOT thread-safe.
        Specified by:
        incrementDocCount in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word whose document count is incremented
        howMuch - the amount to increment by
      • setCountForDoc

        public void setCountForDoc(String word,
                                   long count)
        Sets the exact number of observed documents that contain the specified word. Please note: this method is NOT thread-safe.
        Specified by:
        setCountForDoc in interface VocabCache<T extends SequenceElement>
        Parameters:
        word - the word to set the count for
        count - the count of the word
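        Since incrementDocCount and setCountForDoc are documented as not thread-safe, callers that update document counts from several threads need their own synchronization. The sketch below shows one common approach, routing every call through a single lock. DocCountSketch is a hypothetical stand-in backed by a plain HashMap, not the library class; with a real cache the same lock would wrap the calls to the cache instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; the map plays the role of the cache.
final class DocCountSketch {
    // A plain HashMap is not thread-safe, so all access goes through one lock.
    private final Map<String, Long> docCounts = new HashMap<>();
    private final Object lock = new Object();

    void incrementDocCount(String word, long howMuch) {
        synchronized (lock) {
            docCounts.merge(word, howMuch, Long::sum);
        }
    }

    void setCountForDoc(String word, long count) {
        synchronized (lock) {
            docCounts.put(word, count);
        }
    }

    long docAppearedIn(String word) {
        synchronized (lock) {
            return docCounts.getOrDefault(word, 0L);
        }
    }

    // Two threads each add 1000 documents for the same word.
    static long runDemo() {
        DocCountSketch sketch = new DocCountSketch();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                sketch.incrementDocCount("cat", 1L);
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return sketch.docAppearedIn("cat");
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

        Without the lock, the two threads could interleave their read-modify-write steps and lose updates.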
      • incrementTotalDocCount

        public void incrementTotalDocCount(long by)
        Increments the total number of documents observed by the specified value.
        Specified by:
        incrementTotalDocCount in interface VocabCache<T extends SequenceElement>
        Parameters:
        by - the number to increment by
      • setTotalDocCount

        public void setTotalDocCount(long by)
        Sets the total number of documents.
        Parameters:
        by - the total number of documents
      • tokens

        public Collection<T> tokens()
        Returns the collection of SequenceElements in this vocabulary. Same as the vocabWords() method.
        Specified by:
        tokens in interface VocabCache<T extends SequenceElement>
        Returns:
        collection of SequenceElements
      • addToken

        public boolean addToken(T element)
        Adds the specified SequenceElement to the vocabulary.
        Specified by:
        addToken in interface VocabCache<T extends SequenceElement>
        Parameters:
        element - the word to add
        Returns:
        true if the token was added, false if it was updated
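        The boolean result of addToken distinguishes a first insertion from an update of an existing entry. The sketch below illustrates that contract with a hypothetical map-backed class (AddTokenSketch and frequencyOf are invented names). What "updated" does to the stored element is not described here; adding the incoming frequency to the stored one is an assumption made only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the library's implementation.
final class AddTokenSketch {
    private final Map<String, Long> frequencies = new HashMap<>();

    // Returns true if the label was new, false if an existing entry was updated.
    // Assumption: an update adds the incoming frequency to the stored one.
    boolean addToken(String label, long frequency) {
        Long previous = frequencies.get(label);
        if (previous == null) {
            frequencies.put(label, frequency);
            return true;
        }
        frequencies.put(label, previous + frequency);
        return false;
    }

    long frequencyOf(String label) {
        return frequencies.getOrDefault(label, 0L);
    }

    public static void main(String[] args) {
        AddTokenSketch vocab = new AddTokenSketch();
        System.out.println(vocab.addToken("cat", 1L)); // first insertion
        System.out.println(vocab.addToken("cat", 2L)); // update of existing entry
        System.out.println(vocab.frequencyOf("cat"));
    }
}
```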
      • addToken

        public void addToken(T element,
                             boolean lockf)
      • tokenFor

        public T tokenFor(String label)
        Returns the SequenceElement for the specified label. Same as the wordFor() method.
        Specified by:
        tokenFor in interface VocabCache<T extends SequenceElement>
        Parameters:
        label - the label to get the token for
        Returns:
        the SequenceElement for the specified label
      • hasToken

        public boolean hasToken(String label)
        Checks whether the specified label already exists in the vocabulary. Same as the containsWord() method.
        Specified by:
        hasToken in interface VocabCache<T extends SequenceElement>
        Parameters:
        label - the token to test
        Returns:
        true if the label exists in the vocabulary, false otherwise
      • importVocabulary

        public void importVocabulary(@NonNull VocabCache<T> vocabCache)
        Imports all elements from the VocabCache passed as an argument. If element already exists,
        Specified by:
        importVocabulary in interface VocabCache<T extends SequenceElement>
        Parameters:
        vocabCache - the VocabCache to import elements from
      • removeElement

        public void removeElement(String label)
        Description copied from interface: VocabCache
        Removes the element with the specified label from the vocabulary. Please note: the Huffman index should be updated after element removal.
        Specified by:
        removeElement in interface VocabCache<T extends SequenceElement>
        Parameters:
        label - label of the element to be removed
      • removeElement

        public void removeElement(T element)
        Description copied from interface: VocabCache
        Removes the specified element from the vocabulary. Please note: the Huffman index should be updated after element removal.
        Specified by:
        removeElement in interface VocabCache<T extends SequenceElement>
        Parameters:
        element - SequenceElement to be removed
      • toJson

        public String toJson()
                      throws org.nd4j.shade.jackson.core.JsonProcessingException
        Throws:
        org.nd4j.shade.jackson.core.JsonProcessingException