Word2Vec.Builder |
Word2Vec.Builder.allowParallelTokenization(boolean allow) |
This method enables/disables parallel tokenization.
|
Word2Vec.Builder |
Word2Vec.Builder.batchSize(int batchSize) |
This method defines the mini-batch size
|
Word2Vec.Builder |
Word2Vec.Builder.elementsLearningAlgorithm(@NonNull String algorithm) |
This method defines the elements learning algorithm, specified by class name (e.g. SkipGram or CBOW)
|
Word2Vec.Builder |
Word2Vec.Builder.elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<VocabWord> algorithm) |
This method defines the elements learning algorithm to be used (e.g. SkipGram or CBOW)
|
Word2Vec.Builder |
Word2Vec.Builder.enableScavenger(boolean reallyEnable) |
This method enables/disables periodic vocabulary truncation during construction
Default value: disabled
|
Word2Vec.Builder |
Word2Vec.Builder.epochs(int numEpochs) |
This method defines the number of epochs (iterations over the whole training corpus) for training
|
Word2Vec.Builder |
Word2Vec.Builder.intersectModel(@NonNull SequenceVectors vectors,
boolean isLocked) |
|
Word2Vec.Builder |
Word2Vec.Builder.iterate(@NonNull SequenceIterator<VocabWord> iterator) |
This method is used to feed a SequenceIterator, containing the training corpus, into the model
|
Word2Vec.Builder |
Word2Vec.Builder.iterate(@NonNull DocumentIterator iterator) |
This method is used to feed a DocumentIterator, containing the training corpus, into the model
|
Word2Vec.Builder |
Word2Vec.Builder.iterate(@NonNull LabelAwareIterator iterator) |
This method is used to feed a LabelAwareIterator, which is usually used with labeled corpora, into the model
|
Word2Vec.Builder |
Word2Vec.Builder.iterate(@NonNull SentenceIterator iterator) |
This method is used to feed a SentenceIterator, containing the training corpus, into the model
|
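
A minimal sketch of wiring a corpus into the builder through iterate(), assuming a one-sentence-per-line text file (the path is a placeholder):

    import org.deeplearning4j.models.word2vec.Word2Vec;
    import org.deeplearning4j.text.sentenceiterator.BasicLineIterator;
    import org.deeplearning4j.text.sentenceiterator.SentenceIterator;

    public class IterateSketch {
        public static void main(String[] args) throws Exception {
            // One sentence per line; "raw_sentences.txt" is a placeholder path
            SentenceIterator iter = new BasicLineIterator("raw_sentences.txt");

            Word2Vec vec = new Word2Vec.Builder()
                    .iterate(iter)  // feed the training corpus into the model
                    .build();
            vec.fit();              // start training
        }
    }
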
Word2Vec.Builder |
Word2Vec.Builder.iterations(int iterations) |
This method defines the number of iterations done for each mini-batch during training
|
Word2Vec.Builder |
Word2Vec.Builder.layerSize(int layerSize) |
This method defines the number of dimensions for output vectors
|
Word2Vec.Builder |
Word2Vec.Builder.learningRate(double learningRate) |
This method defines the initial learning rate for model training
|
Word2Vec.Builder |
Word2Vec.Builder.limitVocabularySize(int limit) |
This method sets the vocabulary size limit applied during construction.
|
Word2Vec.Builder |
Word2Vec.Builder.lookupTable(@NonNull WeightLookupTable<VocabWord> lookupTable) |
This method allows you to define an external WeightLookupTable to be used
|
Word2Vec.Builder |
Word2Vec.Builder.minLearningRate(double minLearningRate) |
This method defines the minimum value the learning rate may decay to during training
|
Word2Vec.Builder |
Word2Vec.Builder.minWordFrequency(int minWordFrequency) |
This method defines the minimum word frequency in the training corpus; words occurring less often are excluded from the vocabulary.
|
Word2Vec.Builder |
Word2Vec.Builder.modelUtils(@NonNull ModelUtils<VocabWord> modelUtils) |
Sets the ModelUtils that will be used as the provider for utility methods: similarity(), wordsNearest(), accuracy(), etc.
|
Word2Vec.Builder |
Word2Vec.Builder.negativeSample(double negative) |
This method defines the number of negative samples to use; 0 disables negative sampling
PLEASE NOTE: If you're going to use negative sampling, you might want to disable hierarchic softmax, which is enabled by default
Default value: 0
|
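
As the note above suggests, negative sampling usually replaces hierarchic softmax rather than complementing it. A sketch (the value 10 is an arbitrary illustration, not a recommendation):

    Word2Vec vec = new Word2Vec.Builder()
            .useHierarchicSoftmax(false)  // turn off the default hierarchic softmax
            .negativeSample(10)           // use negative sampling instead
            .build();
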
Word2Vec.Builder |
Word2Vec.Builder.resetModel(boolean reallyReset) |
This method defines whether the model should be completely wiped before building
|
Word2Vec.Builder |
Word2Vec.Builder.sampling(double sampling) |
This method defines the subsampling threshold for frequent words; 0 disables subsampling
|
Word2Vec.Builder |
Word2Vec.Builder.seed(long randomSeed) |
This method defines the seed for the random number generator
|
Word2Vec.Builder |
Word2Vec.Builder.setVectorsListeners(@NonNull Collection<VectorsListener<VocabWord>> vectorsListeners) |
This method sets VectorsListeners for this SequenceVectors model
|
Word2Vec.Builder |
Word2Vec.Builder.stopWords(@NonNull Collection<VocabWord> stopList) |
This method defines stop words that should be ignored during training
|
Word2Vec.Builder |
Word2Vec.Builder.stopWords(@NonNull List<String> stopList) |
This method defines stop words that should be ignored during training
|
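
Of the two stopWords() overloads, the List<String> form is the simpler; a sketch with an illustrative word list:

    import java.util.Arrays;

    Word2Vec vec = new Word2Vec.Builder()
            .stopWords(Arrays.asList("the", "a", "an", "of"))  // ignored during training
            .build();
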
Word2Vec.Builder |
Word2Vec.Builder.tokenizerFactory(@NonNull TokenizerFactory tokenizerFactory) |
This method defines the TokenizerFactory to be used for string tokenization during training
PLEASE NOTE: If an external VocabCache is used, the same TokenizerFactory should be used to keep derived tokens consistent.
|
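
A common TokenizerFactory setup pairs DefaultTokenizerFactory with CommonPreprocessor (which lowercases tokens and strips punctuation); a minimal sketch:

    import org.deeplearning4j.text.tokenization.tokenizer.preprocessor.CommonPreprocessor;
    import org.deeplearning4j.text.tokenization.tokenizerfactory.DefaultTokenizerFactory;
    import org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory;

    TokenizerFactory t = new DefaultTokenizerFactory();
    t.setTokenPreProcessor(new CommonPreprocessor());  // lowercase, strip punctuation

    Word2Vec vec = new Word2Vec.Builder()
            .tokenizerFactory(t)
            .build();
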
Word2Vec.Builder |
Word2Vec.Builder.trainElementsRepresentation(boolean trainElements) |
This setting is hardcoded to TRUE, since training element (word) representations is the whole point of Word2Vec
|
Word2Vec.Builder |
Word2Vec.Builder.trainSequencesRepresentation(boolean trainSequences) |
This setting is hardcoded to FALSE, since Word2Vec trains element (word) representations only, not sequence representations
|
Word2Vec.Builder |
Word2Vec.Builder.unknownElement(VocabWord element) |
This method allows you to specify the SequenceElement that will be used as the UNK element, if UNK is used
|
Word2Vec.Builder |
Word2Vec.Builder.useAdaGrad(boolean reallyUse) |
This method defines whether adaptive gradients (AdaGrad) should be used during training
|
protected Word2Vec.Builder |
Word2Vec.Builder.useExistingWordVectors(@NonNull WordVectors vec) |
This method has no effect for Word2Vec
|
Word2Vec.Builder |
Word2Vec.Builder.useHierarchicSoftmax(boolean reallyUse) |
This method enables/disables Hierarchic softmax
Default value: enabled
|
Word2Vec.Builder |
Word2Vec.Builder.usePreciseMode(boolean reallyUse) |
|
Word2Vec.Builder |
Word2Vec.Builder.usePreciseWeightInit(boolean reallyUse) |
|
Word2Vec.Builder |
Word2Vec.Builder.useUnknown(boolean reallyUse) |
This method allows you to specify whether the UNK word should be used internally
|
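
useUnknown(true) and unknownElement() are typically combined, so that out-of-vocabulary words are mapped to a designated UNK token. A sketch, assuming VocabWord's (frequency, word) constructor:

    import org.deeplearning4j.models.word2vec.VocabWord;

    Word2Vec vec = new Word2Vec.Builder()
            .useUnknown(true)                           // map out-of-vocabulary words to UNK
            .unknownElement(new VocabWord(1.0, "UNK"))  // the element standing in for UNK
            .build();
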
Word2Vec.Builder |
Word2Vec.Builder.useVariableWindow(int... windows) |
This method allows variable window sizes to be used during training.
|
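
Since useVariableWindow(int... windows) is a vararg, several candidate window sizes can be supplied at once; a sketch with arbitrary sizes:

    Word2Vec vec = new Word2Vec.Builder()
            .useVariableWindow(3, 5, 7)  // window size varies over these values during training
            .build();
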
Word2Vec.Builder |
Word2Vec.Builder.vocabCache(@NonNull VocabCache<VocabWord> vocabCache) |
This method allows you to define an external VocabCache to be used
|
Word2Vec.Builder |
Word2Vec.Builder.windowSize(int windowSize) |
This method defines the context window size
|
Word2Vec.Builder |
Word2Vec.Builder.workers(int numWorkers) |
This method defines the maximum number of concurrent threads available for training
|
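
Putting several of the options above together, a minimal end-to-end sketch (the corpus path and hyperparameter values are illustrative, not recommendations):

    import java.util.Collection;

    import org.deeplearning4j.models.word2vec.Word2Vec;
    import org.deeplearning4j.text.sentenceiterator.BasicLineIterator;
    import org.deeplearning4j.text.sentenceiterator.SentenceIterator;
    import org.deeplearning4j.text.tokenization.tokenizer.preprocessor.CommonPreprocessor;
    import org.deeplearning4j.text.tokenization.tokenizerfactory.DefaultTokenizerFactory;
    import org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory;

    public class Word2VecSketch {
        public static void main(String[] args) throws Exception {
            SentenceIterator iter = new BasicLineIterator("raw_sentences.txt");  // placeholder path
            TokenizerFactory t = new DefaultTokenizerFactory();
            t.setTokenPreProcessor(new CommonPreprocessor());

            Word2Vec vec = new Word2Vec.Builder()
                    .minWordFrequency(5)   // drop words seen fewer than 5 times
                    .iterations(1)         // iterations per mini-batch
                    .epochs(1)             // passes over the whole corpus
                    .layerSize(100)        // vector dimensionality
                    .seed(42)              // reproducible RNG
                    .windowSize(5)         // context window size
                    .iterate(iter)
                    .tokenizerFactory(t)
                    .build();

            vec.fit();  // train

            // Utility methods backed by ModelUtils
            System.out.println("similarity(day, night) = " + vec.similarity("day", "night"));
            Collection<String> nearest = vec.wordsNearest("day", 10);
            System.out.println("nearest to 'day': " + nearest);
        }
    }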