Interface InvertedIndex<T extends SequenceElement>

    • Method Detail

      • batchIter

        Iterator<List<List<T>>> batchIter(int batchSize)
        Iterates over the documents in batches
        Parameters:
        batchSize - the number of documents per batch
        Returns:
        an iterator over the batches
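
        The batching described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class BatchIterSketch and its static batchIter helper are hypothetical, and the sketch assumes that each batch is a list of up to batchSize documents, with a shorter final batch when the documents do not divide evenly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of the library: groups documents into batches.
class BatchIterSketch {

    // Returns an iterator whose elements are lists of at most batchSize documents.
    static <T> Iterator<List<List<T>>> batchIter(List<List<T>> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<List<T>>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches.iterator();
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Arrays.asList("d", "e"));
        Iterator<List<List<String>>> it = batchIter(docs, 2);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

        The sketch builds all batches eagerly for brevity; an implementation backed by a large index would more likely produce each batch lazily.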
      • unlock

        void unlock()
        Unlock the index
      • cleanup

        void cleanup()
        Clean up any resources used
      • sample

        double sample()
        Sampling value used when creating mini batches
        Returns:
        the sampling value for mini batches
      • miniBatches

        Iterator<List<T>> miniBatches()
        Iterates over mini batches
        Returns:
        the mini batches created by this index
      • document

        List<T> document(int index)
        Returns a list of words for a document
        Parameters:
        index - the index of the document
        Returns:
        the list of words in the document
      • documentWithLabel

        org.nd4j.common.primitives.Pair<List<T>,String> documentWithLabel(int index)
        Returns a list of words for a document and the associated label
        Parameters:
        index - the index of the document
        Returns:
        a pair of the document's words and its label
      • documentWithLabels

        org.nd4j.common.primitives.Pair<List<T>,Collection<String>> documentWithLabels(int index)
        Returns a list of words for a document and the associated labels
        Parameters:
        index - the index of the document
        Returns:
        a pair of the document's words and its labels
      • documents

        int[] documents(T vocabWord)
        Returns the list of documents a vocab word is in
        Parameters:
        vocabWord - the vocab word to get documents for
        Returns:
        the documents for a vocab word
      • numDocuments

        int numDocuments()
        Returns the number of documents
        Returns:
        the number of documents in the index
      • allDocs

        int[] allDocs()
        Returns a list of all documents
        Returns:
        the list of all documents
      • addWordToDoc

        void addWordToDoc(int doc,
                          T word)
        Add word to a document
        Parameters:
        doc - the document to add to
        word - the word to add
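
        The relationship between addWordToDoc, document, documents and numDocuments can be illustrated with a small in-memory sketch. The class InvertedIndexSketch is hypothetical and uses String in place of T; it assumes that documents returns document indices in ascending order and that an unknown word yields an empty array, neither of which is stated by the interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical in-memory sketch; method names mirror the interface,
// but this is not the library's implementation.
class InvertedIndexSketch {
    // Forward mapping: document index -> words in insertion order.
    private final Map<Integer, List<String>> docToWords = new HashMap<>();
    // Inverted mapping: word -> sorted indices of documents containing it.
    private final Map<String, TreeSet<Integer>> wordToDocs = new HashMap<>();

    void addWordToDoc(int doc, String word) {
        docToWords.computeIfAbsent(doc, k -> new ArrayList<>()).add(word);
        wordToDocs.computeIfAbsent(word, k -> new TreeSet<>()).add(doc);
    }

    List<String> document(int index) {
        // Assumption: an unknown document yields an empty list.
        return docToWords.getOrDefault(index, new ArrayList<>());
    }

    int[] documents(String word) {
        TreeSet<Integer> ids = wordToDocs.get(word);
        if (ids == null) {
            // Assumption: an unknown word yields an empty array.
            return new int[0];
        }
        return ids.stream().mapToInt(Integer::intValue).toArray();
    }

    int numDocuments() {
        return docToWords.size();
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.addWordToDoc(0, "cat");
        idx.addWordToDoc(1, "dog");
        idx.addWordToDoc(2, "cat");
        System.out.println(Arrays.toString(idx.documents("cat")));
        System.out.println(idx.document(1));
    }
}
```

        Keeping both the forward and the inverted mapping lets the sketch answer "which words are in this document" and "which documents contain this word" without scanning.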
      • addWordsToDoc

        void addWordsToDoc(int doc,
                           List<T> words)
        Adds words to the given document
        Parameters:
        doc - the document to add to
        words - the words to add
      • addLabelForDoc

        void addLabelForDoc(int doc,
                            T word)
        Adds a label to a document
        Parameters:
        doc - the document to add to
        word - the label to add
      • addLabelForDoc

        void addLabelForDoc(int doc,
                            String label)
        Adds a label to the given document
        Parameters:
        doc - the document to add to
        label - the label to add
      • addWordsToDoc

        void addWordsToDoc(int doc,
                           List<T> words,
                           String label)
        Adds words to the given document
        Parameters:
        doc - the document to add to
        words - the words to add
        label - the label for the document
      • addWordsToDoc

        void addWordsToDoc(int doc,
                           List<T> words,
                           T label)
        Adds words to the given document
        Parameters:
        doc - the document to add to
        words - the words to add
        label - the label for the document
      • addLabelsForDoc

        void addLabelsForDoc(int doc,
                             List<T> word)
        Adds labels to a document
        Parameters:
        doc - the document to add to
        word - the labels to add
      • addLabelsForDoc

        void addLabelsForDoc(int doc,
                             Collection<String> label)
        Adds labels to the given document
        Parameters:
        doc - the document to add to
        label - the labels to add
      • addWordsToDoc

        void addWordsToDoc(int doc,
                           List<T> words,
                           Collection<String> label)
        Adds words and labels to the given document
        Parameters:
        doc - the document to add to
        words - the words to add
        label - the labels for the document
      • addWordsToDocVocabWord

        void addWordsToDocVocabWord(int doc,
                                    List<T> words,
                                    Collection<T> label)
        Adds words and labels to the given document
        Parameters:
        doc - the document to add to
        words - the words to add
        label - the labels for the document
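
        How the labelled variants fit together with documentWithLabels can be sketched as below. The class LabeledIndexSketch is hypothetical, uses String in place of T, and substitutes java.util.Map.Entry for org.nd4j.common.primitives.Pair so that the example needs no library dependency. It assumes that labels accumulate across calls and that duplicates are ignored; the interface does not specify either behaviour.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of label handling; not the library's implementation.
class LabeledIndexSketch {
    // Document index -> words in insertion order.
    private final Map<Integer, List<String>> words = new HashMap<>();
    // Document index -> labels in insertion order, without duplicates (assumption).
    private final Map<Integer, LinkedHashSet<String>> labels = new HashMap<>();

    void addWordsToDoc(int doc, List<String> newWords, Collection<String> newLabels) {
        words.computeIfAbsent(doc, k -> new ArrayList<>()).addAll(newWords);
        addLabelsForDoc(doc, newLabels);
    }

    void addLabelsForDoc(int doc, Collection<String> newLabels) {
        labels.computeIfAbsent(doc, k -> new LinkedHashSet<>()).addAll(newLabels);
    }

    // Map.Entry stands in for Pair: the key holds the words, the value the labels.
    Map.Entry<List<String>, Collection<String>> documentWithLabels(int index) {
        List<String> w = words.getOrDefault(index, new ArrayList<>());
        Collection<String> l = labels.getOrDefault(index, new LinkedHashSet<>());
        return new SimpleEntry<>(w, l);
    }

    public static void main(String[] args) {
        LabeledIndexSketch idx = new LabeledIndexSketch();
        idx.addWordsToDoc(0, Arrays.asList("good", "film"), Arrays.asList("positive"));
        idx.addLabelsForDoc(0, Arrays.asList("review"));
        Map.Entry<List<String>, Collection<String>> doc = idx.documentWithLabels(0);
        System.out.println(doc.getKey() + " " + doc.getValue());
    }
}
```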
      • finish

        void finish()
        Finishes saving data
      • totalWords

        long totalWords()
        Total number of words in the index
        Returns:
        the total number of words in the index
      • batchSize

        int batchSize()
        For word vectors, this is the batch size to train on
        Returns:
        the batch size for which to train on
      • eachDocWithLabels

        void eachDocWithLabels(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,Collection<String>>,Void> func,
                               Executor exec)
        Iterate over each document with its labels
        Parameters:
        func - the function to apply
        exec - executor service for execution
      • eachDocWithLabel

        void eachDocWithLabel(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,String>,Void> func,
                              Executor exec)
        Iterate over each document with a label
        Parameters:
        func - the function to apply
        exec - executor service for execution
      • eachDoc

        void eachDoc(org.nd4j.shade.guava.base.Function<List<T>,Void> func,
                     Executor exec)
        Iterate over each document
        Parameters:
        func - the function to apply
        exec - executor service for execution
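
        The eachDoc family applies a function to every document through an Executor. A minimal sketch of that pattern follows. The class EachDocSketch is hypothetical, java.util.function.Consumer replaces the shaded Guava Function to keep the example dependency-free, and the sketch assumes one task per document and a call that blocks until all tasks finish. The interface does not say whether the real method blocks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch of per-document execution; not the library's implementation.
class EachDocSketch {

    // Submits one task per document and waits until every task has finished.
    static <T> void eachDoc(List<List<T>> docs, Consumer<List<T>> func, Executor exec) {
        CountDownLatch done = new CountDownLatch(docs.size());
        for (List<T> doc : docs) {
            exec.execute(() -> {
                try {
                    func.accept(doc);
                } finally {
                    // Count down even if func throws, so the caller is never stuck.
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the failure to the caller.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for documents", e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger wordCount = new AtomicInteger();
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        // The function runs on pool threads, so shared state must be thread-safe.
        eachDoc(docs, doc -> wordCount.addAndGet(doc.size()), pool);
        pool.shutdown();
        System.out.println(wordCount.get());
    }
}
```

        Because the function may run on several threads at once, any state it touches should be thread-safe, which is why the usage example counts words with an AtomicInteger.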