public interface InvertedIndex extends Serializable
Modifier and Type | Method and Description |
---|---|
`void` | `addWordsToDoc(int doc, List<VocabWord> words)` Adds words to the given document |
`void` | `addWordToDoc(int doc, VocabWord word)` Adds a word to a document |
`int[]` | `allDocs()` Returns a list of all documents |
`int` | `batchSize()` For word vectors, the batch size to train on |
`List<VocabWord>` | `document(int index)` Returns a list of words for a document |
`int[]` | `documents(VocabWord vocabWord)` Returns the list of documents a vocab word is in |
`void` | `eachDoc(com.google.common.base.Function<List<VocabWord>,Void> func, ExecutorService exec)` Iterates over each document |
`void` | `finish()` Finishes saving data |
`Iterator<List<VocabWord>>` | `miniBatches()` Iterates over mini batches |
`int` | `numDocuments()` Returns the number of documents |
`double` | `sample()` Sampling for creating mini batches |
`int` | `totalWords()` Total number of words in the index |
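
To make the add and lookup methods in the summary concrete, here is a minimal in-memory sketch. It is not the library's implementation: `SimpleInvertedIndex` is a hypothetical class, plain `String` stands in for `VocabWord` so the code compiles without the library, and counting every word occurrence in `totalWords()` is an assumption about what "total number of words" means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; String replaces VocabWord.
class SimpleInvertedIndex {
    // Forward map: document id -> words in the order they were added.
    private final Map<Integer, List<String>> docs = new TreeMap<>();
    // Inverted map: word -> sorted ids of the documents that contain it.
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();
    // Assumption: every added occurrence counts, including repeats.
    private int totalWords = 0;

    public void addWordToDoc(int doc, String word) {
        docs.computeIfAbsent(doc, k -> new ArrayList<>()).add(word);
        postings.computeIfAbsent(word, k -> new TreeSet<>()).add(doc);
        totalWords++;
    }

    public void addWordsToDoc(int doc, List<String> words) {
        for (String word : words) {
            addWordToDoc(doc, word);
        }
    }

    // Words of one document, or an empty list for an unknown id.
    public List<String> document(int index) {
        return docs.getOrDefault(index, Collections.emptyList());
    }

    // Ids of the documents containing the word, in ascending order.
    public int[] documents(String word) {
        TreeSet<Integer> ids = postings.get(word);
        if (ids == null) {
            return new int[0];
        }
        return ids.stream().mapToInt(Integer::intValue).toArray();
    }

    public int[] allDocs() {
        return docs.keySet().stream().mapToInt(Integer::intValue).toArray();
    }

    public int numDocuments() {
        return docs.size();
    }

    public int totalWords() {
        return totalWords;
    }

    public static void main(String[] args) {
        SimpleInvertedIndex index = new SimpleInvertedIndex();
        index.addWordsToDoc(0, Arrays.asList("the", "quick", "fox"));
        index.addWordToDoc(1, "fox");
        System.out.println(Arrays.toString(index.documents("fox"))); // [0, 1]
        System.out.println(index.document(0)); // [the, quick, fox]
        System.out.println(index.numDocuments() + " docs, " + index.totalWords() + " words"); // 2 docs, 4 words
    }
}
```

Keeping both a forward map and an inverted map means `document(int)` and `documents(...)` are each a single lookup, at the cost of storing every word twice. A persistent implementation would presumably also need `finish()` to flush its data, which this sketch leaves out.
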
double sample()
Iterator<List<VocabWord>> miniBatches()
List<VocabWord> document(int index)
index - the index of the document to retrieve
int[] documents(VocabWord vocabWord)
vocabWord - the vocab word to get documents for
int numDocuments()
int[] allDocs()
void addWordToDoc(int doc, VocabWord word)
doc - the document to add to
word - the word to add
void addWordsToDoc(int doc, List<VocabWord> words)
doc - the document to add to
words - the words to add
void finish()
int totalWords()
int batchSize()
void eachDoc(com.google.common.base.Function<List<VocabWord>,Void> func, ExecutorService exec)
func - the function to apply
exec - executor service for execution

Copyright © 2014. All rights reserved.