public interface TextVectorizer extends Vectorizer
| Modifier and Type | Method and Description |
|---|---|
| void | fit(): Train the model |
| InvertedIndex | index(): Inverted index |
| int | numWordsEncountered(): Returns the number of words encountered so far |
| org.nd4j.linalg.api.ndarray.INDArray | transform(String text): Transforms the given text into an INDArray |
| org.nd4j.linalg.dataset.DataSet | vectorize(File input, String label) |
| org.nd4j.linalg.dataset.DataSet | vectorize(InputStream is, String label): Text coming from an input stream, considered as one document |
| org.nd4j.linalg.dataset.DataSet | vectorize(String text, String label): Vectorizes the passed-in text, treating it as one document |
| VocabCache | vocab(): The vocab sorted in descending order |

VocabCache vocab()

org.nd4j.linalg.dataset.DataSet vectorize(InputStream is, String label)
is - the input stream to read from
label - the label to assign

org.nd4j.linalg.dataset.DataSet vectorize(String text, String label)
text - the text to vectorize
label - the label of the text

void fit()

org.nd4j.linalg.dataset.DataSet vectorize(File input, String label)
input - the text to vectorize
label - the label of the text

org.nd4j.linalg.api.ndarray.INDArray transform(String text)
text -

int numWordsEncountered()

InvertedIndex index()
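
As a rough illustration of how fit(), transform(String text) and numWordsEncountered() could fit together, here is a minimal, self-contained bag-of-words sketch. It is not the library's implementation. ToyTextVectorizer, addDocument and vocabSize are hypothetical names, lower-cased whitespace tokenization is an assumption, and a plain double[] stands in for INDArray.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical toy vectorizer; a stand-in for illustration, not TextVectorizer itself. */
class ToyTextVectorizer {
    private final List<String> corpus = new ArrayList<>();
    // Maps each distinct token to its column, in order of first appearance.
    private final Map<String, Integer> vocabIndex = new LinkedHashMap<>();
    private int numWordsEncountered = 0;

    /** Hypothetical helper: queues one document for the next call to fit(). */
    void addDocument(String text) {
        corpus.add(text);
    }

    // Assumption: tokens are lower-cased, whitespace-separated words.
    private static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Builds the vocabulary from all queued documents, starting from scratch. */
    void fit() {
        vocabIndex.clear();
        numWordsEncountered = 0;
        for (String document : corpus) {
            for (String token : tokenize(document)) {
                numWordsEncountered++;
                vocabIndex.putIfAbsent(token, vocabIndex.size());
            }
        }
    }

    /** Returns per-vocabulary-word counts for the text; unknown words are ignored. */
    double[] transform(String text) {
        double[] counts = new double[vocabIndex.size()];
        for (String token : tokenize(text)) {
            Integer column = vocabIndex.get(token);
            if (column != null) {
                counts[column]++;
            }
        }
        return counts;
    }

    /** Total number of tokens seen during the last fit(), duplicates included. */
    int numWordsEncountered() {
        return numWordsEncountered;
    }

    /** Number of distinct tokens in the fitted vocabulary. */
    int vocabSize() {
        return vocabIndex.size();
    }

    public static void main(String[] args) {
        ToyTextVectorizer vectorizer = new ToyTextVectorizer();
        vectorizer.addDocument("the cat sat");
        vectorizer.addDocument("the dog sat down");
        vectorizer.fit();
        System.out.println(vectorizer.numWordsEncountered());
        System.out.println(Arrays.toString(vectorizer.transform("the cat and the dog")));
    }
}
```

In this sketch the columns of the returned array follow the order in which words were first seen during fit(), so a word that never appeared in the fitted documents (such as "and" in the example) contributes nothing to the output.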
Copyright © 2014. All rights reserved.