Class BaseTextVectorizer
- java.lang.Object
-
- org.deeplearning4j.bagofwords.vectorizer.BaseTextVectorizer
-
- All Implemented Interfaces:
Serializable, TextVectorizer, Vectorizer
- Direct Known Subclasses:
BagOfWordsVectorizer, TfidfVectorizer
public abstract class BaseTextVectorizer extends Object implements TextVectorizer
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type                     Field               Description
protected InvertedIndex<VocabWord>    index
protected boolean                     isParallel
protected LabelAwareIterator          iterator
protected LabelsSource                labelsSource
protected int                         minWordFrequency
protected Collection<String>          stopWords
protected TokenizerFactory            tokenizerFactory
protected VocabCache<VocabWord>       vocabCache
-
Constructor Summary
Constructors
Constructor                           Description
BaseTextVectorizer()
-
Method Summary
All Methods    Instance Methods    Concrete Methods
Modifier and Type    Method                   Description
void                 buildVocab()
void                 fit()                    Train the model
LabelsSource         getLabelsSource()
long                 numWordsEncountered()    Returns the number of words encountered so far
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.deeplearning4j.bagofwords.vectorizer.TextVectorizer
getIndex, getVocabCache, transform, transform, vectorize, vectorize, vectorize
-
Methods inherited from interface org.deeplearning4j.core.datasets.vectorizer.Vectorizer
vectorize
-
-
-
-
Field Detail
-
tokenizerFactory
protected transient TokenizerFactory tokenizerFactory
-
iterator
protected transient LabelAwareIterator iterator
-
minWordFrequency
protected int minWordFrequency
-
vocabCache
protected VocabCache<VocabWord> vocabCache
-
labelsSource
protected LabelsSource labelsSource
-
stopWords
protected Collection<String> stopWords
-
index
protected transient InvertedIndex<VocabWord> index
-
isParallel
protected boolean isParallel
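
Three of the fields above (tokenizerFactory, iterator, index) are declared transient, and the class implements Serializable. Under standard Java serialization, transient fields are not written out and come back as null, so a deserialized vectorizer would presumably need those collaborators assigned again before use. The sketch below shows only that general Java behavior. SketchVectorizer and roundTrip are hypothetical names introduced for illustration; they are not part of DL4J.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {

    // Hypothetical stand-in for a vectorizer; NOT the DL4J class.
    static class SketchVectorizer implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: skipped by Java serialization
        protected transient Object tokenizerFactory = new Object();
        // non-transient: written out and restored
        protected int minWordFrequency = 5;
    }

    // Serializes the object to a byte array and reads it back.
    static SketchVectorizer roundTrip(SketchVectorizer original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchVectorizer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SketchVectorizer restored = roundTrip(new SketchVectorizer());
        System.out.println(restored.tokenizerFactory == null); // prints "true"
        System.out.println(restored.minWordFrequency);         // prints "5"
    }
}
```

Field initializers are not re-run during deserialization, which is why the transient field is null rather than a fresh Object.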
-
-
Method Detail
-
getLabelsSource
public LabelsSource getLabelsSource()
-
buildVocab
public void buildVocab()
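
This page does not describe what buildVocab() does with the stopWords and minWordFrequency fields. Assuming they carry their conventional meaning (stop words are skipped, and words seen fewer than minWordFrequency times are dropped), vocabulary construction can be sketched as below. This is an illustration of the general idea only, not the DL4J implementation; VocabSketch and its static buildVocab helper are hypothetical, and whitespace tokenization stands in for a real TokenizerFactory.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class VocabSketch {

    // Hypothetical helper, NOT part of DL4J: counts lowercased, whitespace-separated
    // tokens, skips stop words, then keeps words seen at least minWordFrequency times.
    static Map<String, Integer> buildVocab(List<String> documents,
                                           Collection<String> stopWords,
                                           int minWordFrequency) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : documents) {
            for (String token : doc.toLowerCase().split("\\s+")) {
                if (token.isEmpty() || stopWords.contains(token)) {
                    continue;
                }
                counts.merge(token, 1, Integer::sum);
            }
        }
        // TreeMap gives a stable, alphabetical ordering for display.
        Map<String, Integer> vocab = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minWordFrequency) {
                vocab.put(entry.getKey(), entry.getValue());
            }
        }
        return vocab;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the cat sat", "the cat ran", "a dog ran");
        List<String> stop = Arrays.asList("the", "a");
        System.out.println(buildVocab(docs, stop, 2)); // prints "{cat=2, ran=2}"
    }
}
```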
-
fit
public void fit()
Description copied from interface: TextVectorizer
Train the model
- Specified by:
fit in interface TextVectorizer
-
numWordsEncountered
public long numWordsEncountered()
Returns the number of words encountered so far
- Specified by:
numWordsEncountered in interface TextVectorizer
- Returns:
- the number of words encountered so far
-
-