Interface InvertedIndex<T extends SequenceElement>
All Superinterfaces:
Serializable
public interface InvertedIndex<T extends SequenceElement> extends Serializable
-
Method Summary
Modifier and Type | Method | Description
void | addLabelForDoc(int doc, String label) | Adds a label to the given document
void | addLabelForDoc(int doc, T word) | Adds a label, given as a sequence element, to a document
void | addLabelsForDoc(int doc, Collection<String> label) | Adds labels to the given document
void | addLabelsForDoc(int doc, List<T> word) | Adds labels, given as sequence elements, to a document
void | addWordsToDoc(int doc, List<T> words) | Adds words to the given document
void | addWordsToDoc(int doc, List<T> words, String label) | Adds words to the given document, with a label
void | addWordsToDoc(int doc, List<T> words, Collection<String> label) | Adds words to the given document, with labels
void | addWordsToDoc(int doc, List<T> words, T label) | Adds words to the given document, with a label
void | addWordsToDocVocabWord(int doc, List<T> words, Collection<T> label) | Adds words to the given document, with labels
void | addWordToDoc(int doc, T word) | Adds a word to a document
int[] | allDocs() | Returns a list of all documents
Iterator<List<List<T>>> | batchIter(int batchSize) | Iterates over batches
int | batchSize() | For word vectors, this is the batch size to train on
void | cleanup() | Cleans up any resources used
Iterator<List<T>> | docs() | Iterates over documents
List<T> | document(int index) | Returns a list of words for a document
int[] | documents(T vocabWord) | Returns the list of documents a vocab word is in
org.nd4j.common.primitives.Pair<List<T>,String> | documentWithLabel(int index) | Returns a list of words for a document and the associated label
org.nd4j.common.primitives.Pair<List<T>,Collection<String>> | documentWithLabels(int index) | Returns a list of words for a document and the associated labels
void | eachDoc(org.nd4j.shade.guava.base.Function<List<T>,Void> func, Executor exec) | Iterates over each document
void | eachDocWithLabel(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,String>,Void> func, Executor exec) | Iterates over each document with its label
void | eachDocWithLabels(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,Collection<String>>,Void> func, Executor exec) | Iterates over each document with its labels
void | finish() | Finishes saving data
Iterator<List<T>> | miniBatches() | Iterates over mini batches
int | numDocuments() | Returns the number of documents
double | sample() | Sampling for creating mini batches
long | totalWords() | Total number of words in the index
void | unlock() | Unlocks the index
-
Method Detail
-
batchIter
Iterator<List<List<T>>> batchIter(int batchSize)
Iterates over batches
Parameters:
batchSize - the size of each batch
Returns:
an iterator over the batches
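As a rough illustration of what batched iteration can look like, the sketch below splits an in-memory list of documents into consecutive batches. It is not the library's implementation: the class name BatchIterSketch, the extra docs parameter, and the choice to make the last batch shorter when the documents do not divide evenly are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of DL4J: batches an in-memory list of documents.
class BatchIterSketch {

    // Returns an iterator whose elements are lists of at most batchSize documents.
    static <T> Iterator<List<List<T>>> batchIter(List<List<T>> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return new Iterator<List<List<T>>>() {
            private int cursor = 0; // index of the first document of the next batch

            @Override
            public boolean hasNext() {
                return cursor < docs.size();
            }

            @Override
            public List<List<T>> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int end = Math.min(cursor + batchSize, docs.size());
                // Copy the slice so the batch does not alias the backing list.
                List<List<T>> batch = new ArrayList<>(docs.subList(cursor, end));
                cursor = end;
                return batch;
            }
        };
    }
}
```

With three documents and a batch size of 2, this sketch yields one batch of two documents followed by one batch of a single document.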
-
unlock
void unlock()
Unlock the index
-
cleanup
void cleanup()
Cleans up any resources used
-
sample
double sample()
Sampling for creating mini batches
Returns:
the sampling for mini batches
-
miniBatches
Iterator<List<T>> miniBatches()
Iterates over mini batches
Returns:
the mini batches created by this vectorizer
-
document
List<T> document(int index)
Returns a list of words for a document
Parameters:
index - the index of the document
Returns:
the list of words in the document
-
documentWithLabel
org.nd4j.common.primitives.Pair<List<T>,String> documentWithLabel(int index)
Returns a list of words for a document and the associated label
Parameters:
index - the index of the document
Returns:
a pair of the document's words and its label
-
documentWithLabels
org.nd4j.common.primitives.Pair<List<T>,Collection<String>> documentWithLabels(int index)
Returns a list of words for a document and the associated labels
Parameters:
index - the index of the document
Returns:
a pair of the document's words and its labels
-
documents
int[] documents(T vocabWord)
Returns the list of documents a vocab word is in
Parameters:
vocabWord - the vocab word to get documents for
Returns:
the documents for a vocab word
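To illustrate how the lookups above relate to each other, here is a minimal in-memory sketch. It is a stand-in built on assumptions, not the DL4J implementation: the class name ToyInvertedIndex is hypothetical, String takes the place of the type parameter T, and the two backing maps are just one possible design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical in-memory stand-in; NOT the DL4J implementation.
class ToyInvertedIndex {
    // Forward index: document id -> words in insertion order.
    private final Map<Integer, List<String>> wordsByDoc = new HashMap<>();
    // Inverted index: word -> sorted ids of the documents containing it.
    private final Map<String, TreeSet<Integer>> docsByWord = new HashMap<>();
    private long totalWords = 0;

    void addWordToDoc(int doc, String word) {
        wordsByDoc.computeIfAbsent(doc, k -> new ArrayList<>()).add(word);
        docsByWord.computeIfAbsent(word, k -> new TreeSet<>()).add(doc);
        totalWords++;
    }

    void addWordsToDoc(int doc, List<String> words) {
        for (String word : words) {
            addWordToDoc(doc, word);
        }
    }

    // Words of one document; empty if the document is unknown (a choice of this sketch).
    List<String> document(int index) {
        return Collections.unmodifiableList(
                wordsByDoc.getOrDefault(index, Collections.emptyList()));
    }

    // Ids of every document containing the word, in ascending order.
    int[] documents(String vocabWord) {
        TreeSet<Integer> ids = docsByWord.getOrDefault(vocabWord, new TreeSet<>());
        return ids.stream().mapToInt(Integer::intValue).toArray();
    }

    int numDocuments() {
        return wordsByDoc.size();
    }

    // Counts every added occurrence, including repeats.
    long totalWords() {
        return totalWords;
    }
}
```

Keeping both a forward map and an inverted map means document(int) and documents(word) are each a single lookup, at the cost of storing every word occurrence twice.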
-
numDocuments
int numDocuments()
Returns the number of documents
Returns:
the number of documents
-
allDocs
int[] allDocs()
Returns a list of all documents
Returns:
the list of all documents
-
addWordToDoc
void addWordToDoc(int doc, T word)
Adds a word to a document
Parameters:
doc - the document to add to
word - the word to add
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words)
Adds words to the given document
Parameters:
doc - the document to add to
words - the words to add
-
addLabelForDoc
void addLabelForDoc(int doc, T word)
Adds a label to a document
Parameters:
doc - the document to add to
word - the label to add, given as a sequence element
-
addLabelForDoc
void addLabelForDoc(int doc, String label)
Adds a label to the given document
Parameters:
doc - the document to add to
label - the label to add
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words, String label)
Adds words to the given document
Parameters:
doc - the document to add to
words - the words to add
label - the label for the document
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words, T label)
Adds words to the given document
Parameters:
doc - the document to add to
words - the words to add
label - the label for the document
-
addLabelsForDoc
void addLabelsForDoc(int doc, List<T> word)
Adds labels to a document
Parameters:
doc - the document to add to
word - the labels to add, given as sequence elements
-
addLabelsForDoc
void addLabelsForDoc(int doc, Collection<String> label)
Adds labels to the given document
Parameters:
doc - the document to add to
label - the labels to add
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words, Collection<String> label)
Adds words to the given document
Parameters:
doc - the document to add to
words - the words to add
label - the labels for the document
-
addWordsToDocVocabWord
void addWordsToDocVocabWord(int doc, List<T> words, Collection<T> label)
Adds words to the given document
Parameters:
doc - the document to add to
words - the words to add
label - the labels for the document
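The labelled add methods and documentWithLabels(int) can be pictured with the small sketch below. Everything in it is an assumption made for illustration: ToyLabelledIndex is a hypothetical class, String replaces T, java.util.Map.Entry stands in for org.nd4j.common.primitives.Pair, and dropping duplicate labels is a choice of this sketch rather than documented behaviour.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in; NOT the DL4J implementation.
class ToyLabelledIndex {
    // document id -> words in insertion order
    private final Map<Integer, List<String>> wordsByDoc = new HashMap<>();
    // document id -> labels in insertion order, without duplicates
    private final Map<Integer, Set<String>> labelsByDoc = new HashMap<>();

    void addWordsToDoc(int doc, List<String> words, Collection<String> labels) {
        wordsByDoc.computeIfAbsent(doc, k -> new ArrayList<>()).addAll(words);
        addLabelsForDoc(doc, labels);
    }

    void addLabelForDoc(int doc, String label) {
        labelsByDoc.computeIfAbsent(doc, k -> new LinkedHashSet<>()).add(label);
    }

    void addLabelsForDoc(int doc, Collection<String> labels) {
        for (String label : labels) {
            addLabelForDoc(doc, label);
        }
    }

    // Map.Entry is used here in place of the ND4J Pair type: key = words, value = labels.
    Map.Entry<List<String>, Collection<String>> documentWithLabels(int index) {
        List<String> words = wordsByDoc.getOrDefault(index, new ArrayList<>());
        Collection<String> labels = labelsByDoc.getOrDefault(index, new LinkedHashSet<>());
        return new SimpleImmutableEntry<>(words, labels);
    }
}
```

Storing labels separately from words lets addLabelForDoc attach a label to a document before or after its words are added.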
-
finish
void finish()
Finishes saving data
-
totalWords
long totalWords()
Total number of words in the index
Returns:
the total number of words in the index
-
batchSize
int batchSize()
For word vectors, this is the batch size to train on
Returns:
the batch size to train on
-
eachDocWithLabels
void eachDocWithLabels(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,Collection<String>>,Void> func, Executor exec)
Iterates over each document with its labels
Parameters:
func - the function to apply
exec - the executor service for execution
-
eachDocWithLabel
void eachDocWithLabel(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,String>,Void> func, Executor exec)
Iterates over each document with its label
Parameters:
func - the function to apply
exec - the executor service for execution
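One way to picture the eachDoc family is a loop that hands each document to the executor as its own task. The sketch below is only that, a sketch: EachDocSketch and its docs parameter are hypothetical, java.util.function.Function is used instead of the shaded Guava Function, and blocking until every task has finished is a choice of this example, since the page does not say whether the real methods wait.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical helper, not part of DL4J.
class EachDocSketch {

    // Submits one task per document to exec, then waits for all tasks to finish.
    static <T> void eachDoc(List<List<T>> docs, Function<List<T>, Void> func, Executor exec) {
        CountDownLatch done = new CountDownLatch(docs.size());
        for (List<T> doc : docs) {
            exec.execute(() -> {
                try {
                    func.apply(doc);
                } finally {
                    // Count down even if func throws, so the caller is never stuck.
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return early.
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the function may run on several threads at once, anything it touches should be thread-safe, for example an AtomicInteger rather than a plain int counter.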