Class VocabConstructor<T extends SequenceElement>
- java.lang.Object
-
- org.deeplearning4j.models.word2vec.wordstore.VocabConstructor<T>
-
public class VocabConstructor<T extends SequenceElement> extends Object
-
-
Nested Class Summary
static class VocabConstructor.Builder<T extends SequenceElement>
protected class VocabConstructor.VocabRunnable
-
Field Summary
protected static org.slf4j.Logger log
-
Method Summary
protected WeightLookupTable<T> buildExtendedLookupTable()
Placeholder for future implementation.
-
protected VocabCache<T> buildExtendedVocabulary()
Placeholder for future implementation.
-
VocabCache<T> buildJointVocabulary(boolean resetCounters, boolean buildHuffmanTree)
This method scans all sources passed through the builder and returns all words as a vocabulary.
-
VocabCache<T> buildMergedVocabulary(@NonNull WordVectors wordVectors, boolean fetchLabels)
This method transfers an existing WordVectors model into the current one.
-
VocabCache<T> buildMergedVocabulary(@NonNull VocabCache<T> vocabCache, boolean fetchLabels)
This method transfers an existing vocabulary into the current one. Please note: this method expects the source vocabulary to have Huffman tree indexes applied.
-
protected void filterVocab(AbstractCache<T> cache, int minWordFrequency)
-
long getNumberOfSequences()
This method returns the total number of sequences passed through VocabConstructor.
-
void processDocument(AbstractCache<T> targetVocab, Sequence<T> document, AtomicLong finalCounter, AtomicLong loopCounter)
-
VocabCache<T> transferVocabulary(@NonNull VocabCache<T> vocabCache, boolean buildHuffman)
-
-
-
Method Detail
-
buildExtendedLookupTable
protected WeightLookupTable<T> buildExtendedLookupTable()
Placeholder for future implementation.
-
buildExtendedVocabulary
protected VocabCache<T> buildExtendedVocabulary()
Placeholder for future implementation.
-
buildMergedVocabulary
public VocabCache<T> buildMergedVocabulary(@NonNull WordVectors wordVectors, boolean fetchLabels)
This method transfers an existing WordVectors model into the current one.
Parameters:
wordVectors
-
getNumberOfSequences
public long getNumberOfSequences()
This method returns the total number of sequences passed through VocabConstructor.
-
buildMergedVocabulary
public VocabCache<T> buildMergedVocabulary(@NonNull VocabCache<T> vocabCache, boolean fetchLabels)
This method transfers an existing vocabulary into the current one. Please note: this method expects the source vocabulary to have Huffman tree indexes applied.
Parameters:
vocabCache
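The note above refers to Huffman tree indexes. As background only, the following is a minimal sketch of how a Huffman tree gives more frequent elements shorter codes. It uses only java.util and is not the library's own Huffman implementation; the class HuffmanSketch and its methods are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of the documented API.
class HuffmanSketch {
    /** Tree node: a leaf carries a word, an internal node carries two children. */
    static final class Node {
        final String word;
        final long freq;
        final Node left;
        final Node right;

        Node(String word, long freq, Node left, Node right) {
            this.word = word;
            this.freq = freq;
            this.left = left;
            this.right = right;
        }
    }

    /** Builds a Huffman tree over the frequencies and returns each word's code length. */
    static Map<String, Integer> codeLengths(Map<String, Long> frequencies) {
        PriorityQueue<Node> queue = new PriorityQueue<>((a, b) -> Long.compare(a.freq, b.freq));
        for (Map.Entry<String, Long> e : frequencies.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        // Repeatedly merge the two least frequent nodes until one root remains.
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(null, a.freq + b.freq, a, b));
        }
        Map<String, Integer> lengths = new HashMap<>();
        if (!queue.isEmpty()) {
            assign(queue.poll(), 0, lengths);
        }
        return lengths;
    }

    /** Records the depth of every leaf; the depth equals the code length. */
    static void assign(Node node, int depth, Map<String, Integer> out) {
        if (node.left == null) {
            out.put(node.word, depth);
            return;
        }
        assign(node.left, depth + 1, out);
        assign(node.right, depth + 1, out);
    }

    public static void main(String[] args) {
        Map<String, Long> freq = new HashMap<>();
        freq.put("a", 5L);
        freq.put("b", 2L);
        freq.put("c", 1L);
        freq.put("d", 1L);
        System.out.println(codeLengths(freq));
    }
}
```

In this sketch the most frequent word ends up closest to the root, which is why a vocabulary usually needs its frequencies finalized before the tree is built.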
-
transferVocabulary
public VocabCache<T> transferVocabulary(@NonNull VocabCache<T> vocabCache, boolean buildHuffman)
-
processDocument
public void processDocument(AbstractCache<T> targetVocab, Sequence<T> document, AtomicLong finalCounter, AtomicLong loopCounter)
-
buildJointVocabulary
public VocabCache<T> buildJointVocabulary(boolean resetCounters, boolean buildHuffmanTree)
This method scans all sources passed through the builder and returns all words as a vocabulary. If a target VocabCache was set during instance creation, it will be filled too.
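The scanning behaviour described above can be pictured with a small stand-alone sketch. It assumes each source is a list of token sequences and models the vocabulary as a plain map from token to count; the class JointVocabSketch, its method, and the optional target map are hypothetical stand-ins, not the library's VocabCache or its actual implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the documented API.
class JointVocabSketch {
    /**
     * Counts every token across all sources and returns the joint counts.
     * If target is non-null, the same counts are added into it as well.
     */
    static Map<String, Long> buildJointVocabulary(List<List<List<String>>> sources,
                                                  Map<String, Long> target) {
        Map<String, Long> joint = new LinkedHashMap<>();
        for (List<List<String>> source : sources) {
            for (List<String> sequence : source) {
                for (String token : sequence) {
                    joint.merge(token, 1L, Long::sum);
                }
            }
        }
        if (target != null) {
            for (Map.Entry<String, Long> e : joint.entrySet()) {
                target.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return joint;
    }

    public static void main(String[] args) {
        List<List<List<String>>> sources = Arrays.asList(
                Arrays.asList(Arrays.asList("a", "b", "a")),
                Arrays.asList(Arrays.asList("b", "c")));
        Map<String, Long> target = new LinkedHashMap<>();
        System.out.println(buildJointVocabulary(sources, target));
        System.out.println(target);
    }
}
```

The sketch leaves out the resetCounters and buildHuffmanTree flags, since the page does not describe their exact effect.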
-
filterVocab
protected void filterVocab(AbstractCache<T> cache, int minWordFrequency)
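The page gives no description for this method. Going only by the parameter names, one plausible reading is that elements seen fewer than minWordFrequency times are removed from the cache. The sketch below illustrates that reading with a plain map; the class FilterVocabSketch is hypothetical, and both the strict "less than" threshold and the absence of any exemptions (for example for labels or special tokens) are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the documented API.
class FilterVocabSketch {
    /** Removes, in place, every entry whose count is below minWordFrequency (assumed semantics). */
    static void filterVocab(Map<String, Long> cache, int minWordFrequency) {
        cache.entrySet().removeIf(e -> e.getValue() < minWordFrequency);
    }

    public static void main(String[] args) {
        Map<String, Long> cache = new HashMap<>();
        cache.put("the", 10L);
        cache.put("mid", 3L);
        cache.put("rare", 1L);
        filterVocab(cache, 3);
        System.out.println(cache.keySet());
    }
}
```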
-
-