public static class SequenceVectors.Builder<T extends SequenceElement> extends Object
Constructor and Description |
---|
Builder() |
Builder(VectorsConfiguration configuration) |
Modifier and Type | Method and Description |
---|---|
SequenceVectors.Builder<T> | batchSize(int batchSize): Defines the batch size; applies only if iterations > 1. |
SequenceVectors<T> | build(): Builds a SequenceVectors instance with the defined settings and options. |
SequenceVectors.Builder<T> | elementsLearningAlgorithm(ElementsLearningAlgorithm<T> algorithm): Sets the given learning algorithm as the elements learning algorithm. |
SequenceVectors.Builder<T> | elementsLearningAlgorithm(String algoName): Sets the learning algorithm identified by algoName as the elements learning algorithm. |
SequenceVectors.Builder<T> | enableScavenger(boolean reallyEnable): Enables or disables periodic vocabulary truncation during construction. Default value: disabled. |
SequenceVectors.Builder<T> | epochs(int numEpochs): Defines how many passes are made over the whole training corpus during modelling. |
SequenceVectors.Builder<T> | iterate(SequenceIterator<T> iterator): Defines the SequenceIterator used for model building. |
SequenceVectors.Builder<T> | iterations(int iterations): Defines how many iterations are made over batched sequences. |
SequenceVectors.Builder<T> | layerSize(int layerSize): Defines the number of dimensions of the output vectors. |
SequenceVectors.Builder<T> | learningRate(double learningRate): Defines the initial learning rate. |
SequenceVectors.Builder | limitVocabularySize(int limit): Sets the vocabulary limit applied during construction. |
SequenceVectors.Builder<T> | lookupTable(WeightLookupTable<T> lookupTable): Accepts an externally built WeightLookupTable containing model weights and vocabulary. |
SequenceVectors.Builder<T> | minLearningRate(double minLearningRate): Defines the minimum learning rate after decay is applied. |
SequenceVectors.Builder<T> | minWordFrequency(int minWordFrequency): Defines the minimum frequency for elements found in the training corpus. |
SequenceVectors.Builder<T> | modelUtils(ModelUtils<T> modelUtils): Sets the ModelUtils implementation used to access the model. |
SequenceVectors.Builder<T> | negativeSample(double negative): Defines the negative sampling value for the skip-gram algorithm. |
protected void | presetTables(): Creates a new WeightLookupTable. |
SequenceVectors.Builder<T> | resetModel(boolean reallyReset): Defines whether the whole model should be reset before training. |
SequenceVectors.Builder<T> | sampling(double sampling): Defines the sub-sampling threshold. |
SequenceVectors.Builder<T> | seed(long randomSeed): Sets the seed for the random number generator. |
SequenceVectors.Builder<T> | sequenceLearningAlgorithm(SequenceLearningAlgorithm<T> algorithm): Sets the given learning algorithm as the sequence learning algorithm. |
SequenceVectors.Builder<T> | sequenceLearningAlgorithm(String algoName): Sets the learning algorithm identified by algoName as the sequence learning algorithm. |
SequenceVectors.Builder<T> | setVectorsListeners(Collection<VectorsListener<T>> listeners): Sets the VectorsListeners for this SequenceVectors model. |
SequenceVectors.Builder<T> | stopWords(Collection<T> stopList): Provides a collection of objects to be ignored and excluded from the model. Note: object labels and hashCode are used for filtering. |
SequenceVectors.Builder<T> | stopWords(List<String> stopList): Provides a collection of objects to be ignored and excluded from the model. Note: object labels and hashCode are used for filtering. |
SequenceVectors.Builder<T> | trainElementsRepresentation(boolean trainElements) |
SequenceVectors.Builder<T> | trainSequencesRepresentation(boolean trainSequences) |
SequenceVectors.Builder<T> | unknownElement(T element): Specifies the SequenceElement used as the UNK element, if UNK is used. |
SequenceVectors.Builder<T> | useAdaGrad(boolean reallyUse): Deprecated. |
protected SequenceVectors.Builder<T> | useExistingWordVectors(WordVectors vec): Uses a pre-built WordVectors model (SkipGram or GloVe) for DBOW sequence learning. |
SequenceVectors.Builder<T> | useHierarchicSoftmax(boolean reallyUse): Enables or disables hierarchic softmax. |
SequenceVectors.Builder<T> | usePreciseWeightInit(boolean reallyUse): If set to true, initial weights for elements and sequences are derived from the elements themselves. |
SequenceVectors.Builder<T> | useUnknown(boolean reallyUse): Specifies whether the UNK word should be used internally. |
SequenceVectors.Builder<T> | useVariableWindow(int... windows): Enables the use of variable window sizes. |
SequenceVectors.Builder<T> | vocabCache(VocabCache<T> vocabCache): Accepts an externally built VocabCache object containing the vocabulary. |
SequenceVectors.Builder<T> | windowSize(int windowSize): Sets the window size for skip-gram training. |
SequenceVectors.Builder<T> | workers(int numWorkers): Sets the number of worker threads used in calculations. |
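To make the relationship between `learningRate` and `minLearningRate` concrete, the sketch below applies the linear decay used by the classic word2vec trainer: the rate falls in proportion to the share of the corpus already processed and is clamped at the minimum. This is an assumption made for illustration; the exact schedule applied by SequenceVectors is not stated on this page, and `LearningRateDecaySketch`, `decayedRate`, and its parameter names are hypothetical.

```java
/** Illustrative only: linear learning-rate decay with a floor, as in classic word2vec. */
final class LearningRateDecaySketch {

    /**
     * @param initialRate    starting rate (the value passed to learningRate)
     * @param minRate        floor (the value passed to minLearningRate)
     * @param wordsProcessed number of elements seen so far
     * @param totalWords     number of elements expected in total; must be positive
     * @return the decayed rate, never lower than minRate
     */
    static double decayedRate(double initialRate, double minRate,
                              long wordsProcessed, long totalWords) {
        if (totalWords <= 0) {
            throw new IllegalArgumentException("totalWords must be positive");
        }
        // Fraction of training completed, capped at 1.0.
        double progress = Math.min(1.0, (double) wordsProcessed / totalWords);
        // Linear decay towards zero, then clamp at the configured minimum.
        double rate = initialRate * (1.0 - progress);
        return Math.max(minRate, rate);
    }

    public static void main(String[] args) {
        double start = 0.025;   // assumed example value
        double floor = 0.0001;  // assumed example value
        long total = 1_000_000L;
        for (long seen = 0; seen <= total; seen += 250_000L) {
            System.out.printf("seen=%d rate=%.6f%n", seen, decayedRate(start, floor, seen, total));
        }
    }
}
```

Under this reading, `minLearningRate` only matters late in training, once the decayed value would otherwise drop below it.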
protected VocabCache<T extends SequenceElement> vocabCache
protected WeightLookupTable<T extends SequenceElement> lookupTable
protected SequenceIterator<T extends SequenceElement> iterator
protected ModelUtils<T extends SequenceElement> modelUtils
protected WordVectors existingVectors
protected double sampling
protected double negative
protected double learningRate
protected double minLearningRate
protected int minWordFrequency
protected int iterations
protected int numEpochs
protected int layerSize
protected int window
protected boolean hugeModelExpected
protected int batchSize
protected int learningRateDecayWords
protected long seed
protected boolean useAdaGrad
protected boolean resetModel
protected int workers
protected boolean useUnknown
protected boolean useHierarchicSoftmax
protected int[] variableWindows
protected boolean trainSequenceVectors
protected boolean trainElementsVectors
protected boolean preciseWeightInit
protected Collection<String> stopWords
protected VectorsConfiguration configuration
protected transient T extends SequenceElement unknownElement
protected String UNK
protected String STOP
protected boolean enableScavenger
protected int vocabLimit
protected ElementsLearningAlgorithm<T extends SequenceElement> elementsLearningAlgorithm
protected SequenceLearningAlgorithm<T extends SequenceElement> sequenceLearningAlgorithm
protected Set<VectorsListener<T extends SequenceElement>> vectorsListeners
public Builder()
public Builder(@NonNull VectorsConfiguration configuration)
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec)
Parameters: vec - existing WordVectors model
public SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator)
Parameters: iterator
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull String algoName)
Parameters: algoName - fully qualified class name
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm)
Parameters: algorithm - SequenceLearningAlgorithm implementation
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull String algoName)
Parameters: algoName - fully qualified class name
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm)
Parameters: algorithm - ElementsLearningAlgorithm implementation
public SequenceVectors.Builder<T> batchSize(int batchSize)
Parameters: batchSize
public SequenceVectors.Builder<T> iterations(int iterations)
Parameters: iterations
public SequenceVectors.Builder<T> epochs(int numEpochs)
Parameters: numEpochs
public SequenceVectors.Builder<T> workers(int numWorkers)
Parameters: numWorkers
public SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse)
Parameters: reallyUse
@Deprecated public SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse)
Parameters: reallyUse
public SequenceVectors.Builder<T> layerSize(int layerSize)
Parameters: layerSize
public SequenceVectors.Builder<T> learningRate(double learningRate)
Parameters: learningRate
public SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency)
Parameters: minWordFrequency
public SequenceVectors.Builder limitVocabularySize(int limit)
Parameters: limit
public SequenceVectors.Builder<T> minLearningRate(double minLearningRate)
Parameters: minLearningRate
public SequenceVectors.Builder<T> resetModel(boolean reallyReset)
Parameters: reallyReset
public SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache)
Parameters: vocabCache
public SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable)
Parameters: lookupTable
public SequenceVectors.Builder<T> sampling(double sampling)
Parameters: sampling
public SequenceVectors.Builder<T> negativeSample(double negative)
Parameters: negative
public SequenceVectors.Builder<T> stopWords(@NonNull List<String> stopList)
Parameters: stopList
public SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
Parameters: trainElements
public SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
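The `sampling` threshold listed above can be read through the sub-sampling rule of the original word2vec implementation, where frequent elements are randomly dropped with a probability that grows with their relative frequency. The sketch below computes the keep probability under that rule. Whether SequenceVectors applies exactly this formula is an assumption, and `SubSamplingSketch` and its method names are hypothetical.

```java
import java.util.Random;

/** Illustrative only: keep probability under the classic word2vec sub-sampling rule. */
final class SubSamplingSketch {

    /**
     * @param count      occurrences of the element in the corpus; must be positive
     * @param totalCount total number of element occurrences in the corpus
     * @param threshold  sub-sampling threshold (the value passed to sampling)
     * @return probability in (0, 1] that one occurrence of the element is kept
     */
    static double keepProbability(long count, long totalCount, double threshold) {
        if (count <= 0 || totalCount <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        if (threshold <= 0.0) {
            return 1.0; // treat a non-positive threshold as "sub-sampling disabled"
        }
        double frequency = (double) count / totalCount;
        // word2vec: (sqrt(f / t) + 1) * t / f, capped at 1.
        double p = (Math.sqrt(frequency / threshold) + 1.0) * (threshold / frequency);
        return Math.min(1.0, p);
    }

    /** Randomly decides whether to keep one occurrence of the element. */
    static boolean keep(long count, long totalCount, double threshold, Random rng) {
        return rng.nextDouble() < keepProbability(count, totalCount, threshold);
    }

    public static void main(String[] args) {
        long total = 1_000_000L;
        System.out.println("rare:     " + keepProbability(10, total, 1e-3));
        System.out.println("frequent: " + keepProbability(100_000, total, 1e-3));
    }
}
```

Rare elements are always kept under this rule, while an element that makes up a tenth of the corpus is kept only about one time in nine at a threshold of 1e-3.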
public SequenceVectors.Builder<T> stopWords(@NonNull Collection<T> stopList)
Parameters: stopList
public SequenceVectors.Builder<T> windowSize(int windowSize)
Parameters: windowSize
public SequenceVectors.Builder<T> seed(long randomSeed)
Parameters: randomSeed
public SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils)
Parameters: modelUtils - model utils to be used
public SequenceVectors.Builder<T> useUnknown(boolean reallyUse)
Parameters: reallyUse
public SequenceVectors.Builder<T> unknownElement(@NonNull T element)
Parameters: element
public SequenceVectors.Builder<T> useVariableWindow(int... windows)
Parameters: windows
public SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse)
Parameters: reallyUse
protected void presetTables()
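One plausible reading of `useVariableWindow(int... windows)` is that a window size is drawn at random from the supplied values during training, with the fixed `windowSize` used when no values are given. The sketch below shows that selection step in isolation. The behaviour is an assumption, not something this page confirms, and `VariableWindowSketch` and `pickWindow` are hypothetical names.

```java
import java.util.Random;

/** Illustrative only: choosing a context window from a set of candidate sizes. */
final class VariableWindowSketch {

    /**
     * @param windows       candidate window sizes (as passed to useVariableWindow); may be null or empty
     * @param defaultWindow fixed window size (as passed to windowSize) used as a fallback
     * @param rng           source of randomness, seeded by the caller for reproducibility
     * @return one of the candidate sizes, or defaultWindow if there are no candidates
     */
    static int pickWindow(int[] windows, int defaultWindow, Random rng) {
        if (windows == null || windows.length == 0) {
            return defaultWindow;
        }
        return windows[rng.nextInt(windows.length)];
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed so repeated runs pick the same sizes
        int[] candidates = {3, 5, 8};
        for (int i = 0; i < 5; i++) {
            System.out.println("window = " + pickWindow(candidates, 5, rng));
        }
    }
}
```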
public SequenceVectors.Builder<T> setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners)
Parameters: listeners
public SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable)
Parameters: reallyEnable
public SequenceVectors<T> build()
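Almost every configuration method above returns the builder itself, so options chain into a single expression that ends with `build()`. The sketch below reproduces that pattern with a small stand-in class so that it runs without the library. `FluentBuilderSketch`, its nested `Builder` and `Config` types, the default values, and the validation rules are all hypothetical and are not part of SequenceVectors.

```java
/** Illustrative stand-in for the fluent builder pattern; not the SequenceVectors API. */
final class FluentBuilderSketch {

    /** Immutable settings object produced by Builder.build(). */
    static final class Config {
        final int layerSize;
        final int windowSize;
        final double learningRate;
        final double minLearningRate;

        Config(int layerSize, int windowSize, double learningRate, double minLearningRate) {
            this.layerSize = layerSize;
            this.windowSize = windowSize;
            this.learningRate = learningRate;
            this.minLearningRate = minLearningRate;
        }
    }

    /** Each setter stores one option and returns this, which is what allows chaining. */
    static final class Builder {
        // Hypothetical defaults, chosen only for this sketch.
        private int layerSize = 100;
        private int windowSize = 5;
        private double learningRate = 0.025;
        private double minLearningRate = 0.0001;

        Builder layerSize(int layerSize) { this.layerSize = layerSize; return this; }
        Builder windowSize(int windowSize) { this.windowSize = windowSize; return this; }
        Builder learningRate(double learningRate) { this.learningRate = learningRate; return this; }
        Builder minLearningRate(double minLearningRate) { this.minLearningRate = minLearningRate; return this; }

        /** Validates the collected options once, then creates the immutable result. */
        Config build() {
            if (layerSize <= 0 || windowSize <= 0) {
                throw new IllegalStateException("layerSize and windowSize must be positive");
            }
            if (minLearningRate > learningRate) {
                throw new IllegalStateException("minLearningRate must not exceed learningRate");
            }
            return new Config(layerSize, windowSize, learningRate, minLearningRate);
        }
    }

    public static void main(String[] args) {
        Config config = new Builder()
                .layerSize(150)
                .windowSize(7)
                .learningRate(0.05)
                .build();
        System.out.println(config.layerSize + " dimensions, window " + config.windowSize);
    }
}
```

Deferring validation to `build()` lets options be set in any order, which matches how the methods on this page can be called.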
Copyright © 2018. All rights reserved.