public static class SequenceVectors.Builder<T extends SequenceElement> extends Object
Modifier and Type | Field and Description |
---|---|
protected int | batchSize |
protected VectorsConfiguration | configuration |
protected ElementsLearningAlgorithm<T> | elementsLearningAlgorithm |
protected boolean | enableScavenger |
protected WordVectors | existingVectors |
protected boolean | hugeModelExpected |
protected SequenceVectors<T> | intersectVectors |
protected int | iterations |
protected SequenceIterator<T> | iterator |
protected int | layerSize |
protected double | learningRate |
protected int | learningRateDecayWords |
protected boolean | lockFactor |
protected WeightLookupTable<T> | lookupTable |
protected double | minLearningRate |
protected int | minWordFrequency |
protected ModelUtils<T> | modelUtils |
protected double | negative |
protected int | numEpochs |
protected boolean | preciseMode: Experimental field. |
protected boolean | preciseWeightInit |
protected boolean | resetModel |
protected double | sampling |
protected long | seed |
protected SequenceLearningAlgorithm<T> | sequenceLearningAlgorithm |
protected String | STOP |
protected Collection<String> | stopWords |
protected boolean | trainElementsVectors |
protected boolean | trainSequenceVectors |
protected String | UNK |
protected T | unknownElement |
protected boolean | useAdaGrad |
protected boolean | useHierarchicSoftmax |
protected boolean | useUnknown |
protected int[] | variableWindows |
protected Set<VectorsListener<T>> | vectorsListeners |
protected VocabCache<T> | vocabCache |
protected int | vocabLimit |
protected int | window |
protected int | workers |
Constructor and Description |
---|
Builder() |
Builder(@NonNull VectorsConfiguration configuration) |
Modifier and Type | Method and Description |
---|---|
SequenceVectors.Builder<T> | batchSize(int batchSize): Defines the batchSize option, which applies only if iterations > 1. |
SequenceVectors<T> | build(): Builds a SequenceVectors instance with the defined settings/options. |
SequenceVectors.Builder<T> | elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm): Sets a specific LearningAlgorithm as the Elements Learning Algorithm. |
SequenceVectors.Builder<T> | elementsLearningAlgorithm(@NonNull String algoName): Sets a specific LearningAlgorithm as the Elements Learning Algorithm. |
SequenceVectors.Builder<T> | enableScavenger(boolean reallyEnable): Enables/disables periodic vocabulary truncation during construction. Default value: disabled. |
SequenceVectors.Builder<T> | epochs(int numEpochs): Defines how many iterations should be done over the whole training corpus during modelling. |
SequenceVectors.Builder<T> | intersectModel(@NonNull SequenceVectors<T> intersectVectors, boolean lockFactor) |
SequenceVectors.Builder<T> | iterate(@NonNull SequenceIterator<T> iterator): Defines the SequenceIterator to be used for model building. |
SequenceVectors.Builder<T> | iterations(int iterations): Defines how many iterations should be done over batched sequences. |
SequenceVectors.Builder<T> | layerSize(int layerSize): Defines the number of dimensions for output vectors. |
SequenceVectors.Builder<T> | learningRate(double learningRate): Defines the initial learning rate. |
SequenceVectors.Builder | limitVocabularySize(int limit): Sets the vocabulary limit during construction. |
SequenceVectors.Builder<T> | lookupTable(@NonNull WeightLookupTable<T> lookupTable): You can pass an externally built WeightLookupTable containing model weights and vocabulary. |
SequenceVectors.Builder<T> | minLearningRate(double minLearningRate): Defines the minimum learning rate after decay is applied. |
SequenceVectors.Builder<T> | minWordFrequency(int minWordFrequency): Defines the minimal element frequency for elements found in the training corpus. |
SequenceVectors.Builder<T> | modelUtils(@NonNull ModelUtils<T> modelUtils): ModelUtils implementation that will be used to access the model. |
SequenceVectors.Builder<T> | negativeSample(double negative): Defines the negative sampling value for the skip-gram algorithm. |
protected void | presetTables(): Creates a new WeightLookupTable. |
SequenceVectors.Builder<T> | resetModel(boolean reallyReset): Defines whether the whole model should be reset before training. |
SequenceVectors.Builder<T> | sampling(double sampling): Defines the sub-sampling threshold. |
SequenceVectors.Builder<T> | seed(long randomSeed): Sets the seed for the random number generator. |
SequenceVectors.Builder<T> | sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm): Sets a specific LearningAlgorithm as the Sequence Learning Algorithm. |
SequenceVectors.Builder<T> | sequenceLearningAlgorithm(@NonNull String algoName): Sets a specific LearningAlgorithm as the Sequence Learning Algorithm. |
SequenceVectors.Builder<T> | setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners): Sets VectorsListeners for this SequenceVectors model. |
SequenceVectors.Builder<T> | stopWords(@NonNull Collection<T> stopList): You can provide a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering. |
SequenceVectors.Builder<T> | stopWords(@NonNull List<String> stopList): You can provide a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering. |
SequenceVectors.Builder<T> | trainElementsRepresentation(boolean trainElements) |
SequenceVectors.Builder<T> | trainSequencesRepresentation(boolean trainSequences) |
SequenceVectors.Builder<T> | unknownElement(T element): Allows you to specify the SequenceElement that will be used as the UNK element, if UNK is used. |
SequenceVectors.Builder<T> | useAdaGrad(boolean reallyUse): Deprecated. |
protected SequenceVectors.Builder<T> | useExistingWordVectors(@NonNull WordVectors vec): Allows you to use a pre-built WordVectors model (e.g. |
SequenceVectors.Builder<T> | useHierarchicSoftmax(boolean reallyUse): Enables/disables hierarchic softmax. |
SequenceVectors.Builder<T> | usePreciseMode(boolean reallyUse) |
SequenceVectors.Builder<T> | usePreciseWeightInit(boolean reallyUse): If set to true, initial weights for elements/sequences will be derived from the elements themselves. |
SequenceVectors.Builder<T> | useUnknown(boolean reallyUse): Allows you to specify whether the UNK word should be used internally. |
SequenceVectors.Builder<T> | useVariableWindow(int... windows): Allows you to use a variable window size. |
SequenceVectors.Builder<T> | vocabCache(@NonNull VocabCache<T> vocabCache): You can pass an externally built VocabCache object containing the vocabulary. |
SequenceVectors.Builder<T> | windowSize(int windowSize): Sets the window size for skip-gram training. |
SequenceVectors.Builder<T> | workers(int numWorkers): Sets the number of worker threads to be used in calculations. |
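The learningRate and minLearningRate options describe a rate that starts at an initial value and decays down to a floor. The sketch below shows one way such a schedule can behave, assuming a word2vec-style linear decay over the number of processed words. This page does not confirm the exact schedule SequenceVectors uses, and the class LearningRateDecaySketch, the method decayedRate, and all numeric values are hypothetical names and settings chosen for illustration only.

```java
/** Sketch of a word2vec-style linear learning-rate decay (an assumption, not confirmed library code). */
final class LearningRateDecaySketch {

    /**
     * Scales learningRate by the fraction of the corpus still to be processed,
     * never returning less than minLearningRate.
     */
    static double decayedRate(double learningRate, double minLearningRate,
                              long wordsProcessed, long totalWords) {
        if (totalWords <= 0) {
            throw new IllegalArgumentException("totalWords must be positive");
        }
        // Clamp progress to [0, 1] so that overshooting the corpus size is harmless.
        double progress = Math.min(1.0, Math.max(0.0, (double) wordsProcessed / totalWords));
        return Math.max(minLearningRate, learningRate * (1.0 - progress));
    }

    public static void main(String[] args) {
        double learningRate = 0.025;     // hypothetical value passed to learningRate(...)
        double minLearningRate = 0.0001; // hypothetical value passed to minLearningRate(...)
        long totalWords = 1_000_000L;
        for (long seen = 0; seen <= totalWords; seen += 250_000L) {
            System.out.printf("%d words -> rate %.6f%n", seen,
                    decayedRate(learningRate, minLearningRate, seen, totalWords));
        }
    }
}
```

Under this assumption, the floor only matters near the end of training, when the linearly scaled rate would otherwise fall below minLearningRate.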
protected VocabCache<T> vocabCache
protected WeightLookupTable<T> lookupTable
protected SequenceIterator<T> iterator
protected ModelUtils<T> modelUtils
protected WordVectors existingVectors
protected SequenceVectors<T> intersectVectors
protected boolean lockFactor
protected double sampling
protected double negative
protected double learningRate
protected double minLearningRate
protected int minWordFrequency
protected int iterations
protected int numEpochs
protected int layerSize
protected int window
protected boolean hugeModelExpected
protected int batchSize
protected int learningRateDecayWords
protected long seed
protected boolean useAdaGrad
protected boolean resetModel
protected int workers
protected boolean useUnknown
protected boolean useHierarchicSoftmax
protected int[] variableWindows
protected boolean trainSequenceVectors
protected boolean trainElementsVectors
protected boolean preciseWeightInit
protected Collection<String> stopWords
protected VectorsConfiguration configuration
protected transient T unknownElement
protected String UNK
protected String STOP
protected boolean enableScavenger
protected int vocabLimit
protected boolean preciseMode
protected ElementsLearningAlgorithm<T> elementsLearningAlgorithm
protected SequenceLearningAlgorithm<T> sequenceLearningAlgorithm
protected Set<VectorsListener<T>> vectorsListeners
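The sampling field holds the sub-sampling threshold. As a rough illustration of what such a threshold controls, the sketch below computes the probability of keeping one occurrence of an element, assuming the formula used by the reference word2vec implementation. This page does not state that SequenceVectors applies exactly this formula, and treating a non-positive threshold as "disabled" is likewise an assumption. The class SubSamplingSketch and the method keepProbability are hypothetical names.

```java
/** Sub-sampling sketch based on the reference word2vec formula (assumed, not confirmed for this class). */
final class SubSamplingSketch {

    /**
     * Returns the probability of keeping one occurrence of an element:
     * keep = (sqrt(f / t) + 1) * (t / f), capped at 1,
     * where f is the element's relative frequency and t is the sampling threshold.
     */
    static double keepProbability(long elementCount, long totalCount, double sampling) {
        if (elementCount <= 0 || totalCount <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        if (sampling <= 0.0) {
            return 1.0; // assumption: a non-positive threshold disables sub-sampling
        }
        double frequency = (double) elementCount / totalCount;
        double keep = (Math.sqrt(frequency / sampling) + 1.0) * (sampling / frequency);
        return Math.min(1.0, keep);
    }

    public static void main(String[] args) {
        long total = 1_000_000L;
        double sampling = 1e-3; // hypothetical value passed to sampling(...)
        System.out.println("frequent element: " + keepProbability(10_000L, total, sampling));
        System.out.println("rare element:     " + keepProbability(10L, total, sampling));
    }
}
```

With this formula, elements whose relative frequency is well below the threshold are always kept, while very frequent elements are dropped more often.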
public Builder()
public Builder(@NonNull VectorsConfiguration configuration)
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec)
vec - existing WordVectors model
public SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator)
iterator - SequenceIterator to be used for model building
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull String algoName)
algoName - fully qualified class name
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm)
algorithm - SequenceLearningAlgorithm implementation
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull String algoName)
algoName - fully qualified class name
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm)
algorithm - ElementsLearningAlgorithm implementation
public SequenceVectors.Builder<T> batchSize(int batchSize)
batchSize - batch size; applies only if iterations > 1
public SequenceVectors.Builder<T> iterations(int iterations)
iterations - number of iterations over batched sequences
public SequenceVectors.Builder<T> epochs(int numEpochs)
numEpochs - number of iterations over the whole training corpus
public SequenceVectors.Builder<T> workers(int numWorkers)
numWorkers - number of worker threads to be used in calculations
public SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse)
reallyUse - true to enable hierarchic softmax, false to disable it
@Deprecated public SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse)
reallyUse
public SequenceVectors.Builder<T> layerSize(int layerSize)
layerSize - number of dimensions for output vectors
public SequenceVectors.Builder<T> learningRate(double learningRate)
learningRate - initial learning rate
public SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency)
minWordFrequency - minimal element frequency for elements found in the training corpus
public SequenceVectors.Builder limitVocabularySize(int limit)
limit - vocabulary limit during construction
public SequenceVectors.Builder<T> minLearningRate(double minLearningRate)
minLearningRate - minimum learning rate after decay is applied
public SequenceVectors.Builder<T> resetModel(boolean reallyReset)
reallyReset - whether the whole model should be reset before training
public SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache)
vocabCache - externally built VocabCache containing the vocabulary
public SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable)
lookupTable - externally built WeightLookupTable containing model weights and vocabulary
public SequenceVectors.Builder<T> sampling(double sampling)
sampling - sub-sampling threshold
public SequenceVectors.Builder<T> negativeSample(double negative)
negative - negative sampling value for the skip-gram algorithm
public SequenceVectors.Builder<T> stopWords(@NonNull List<String> stopList)
stopList - objects to be ignored and excluded from the model
public SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
trainElements
public SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
public SequenceVectors.Builder<T> stopWords(@NonNull Collection<T> stopList)
stopList - objects to be ignored and excluded from the model
public SequenceVectors.Builder<T> windowSize(int windowSize)
windowSize - window size for skip-gram training
public SequenceVectors.Builder<T> seed(long randomSeed)
randomSeed - seed for the random number generator
public SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils)
modelUtils - model utils to be used
public SequenceVectors.Builder<T> useUnknown(boolean reallyUse)
reallyUse - whether the UNK word should be used internally
public SequenceVectors.Builder<T> unknownElement(@NonNull T element)
element - SequenceElement to be used as the UNK element, if UNK is used
public SequenceVectors.Builder<T> useVariableWindow(int... windows)
windows - variable window sizes
public SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse)
reallyUse - if true, initial weights for elements/sequences are derived from the elements themselves
public SequenceVectors.Builder<T> usePreciseMode(boolean reallyUse)
protected void presetTables()
public SequenceVectors.Builder<T> setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners)
listeners - VectorsListeners for this SequenceVectors model
public SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable)
reallyEnable - true to enable periodic vocabulary truncation during construction (default: disabled)
public SequenceVectors.Builder<T> intersectModel(@NonNull SequenceVectors<T> intersectVectors, boolean lockFactor)
public SequenceVectors<T> build()
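Several of the options above (stopWords, minWordFrequency, limitVocabularySize) shape which elements end up in the vocabulary. The sketch below illustrates how such filters could combine on plain strings. It is a simplified model under stated assumptions, not the library's vocabulary construction: it assumes that elements with a count below minWordFrequency are dropped, that the limit keeps the most frequent elements, and that a limit of 0 means "no limit". The class VocabularyFilterSketch and the method buildVocabulary are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Simplified vocabulary filtering sketch; the filtering rules are assumptions, not confirmed library behavior. */
final class VocabularyFilterSketch {

    static List<String> buildVocabulary(List<String> tokens, Set<String> stopWords,
                                        int minWordFrequency, int vocabLimit) {
        // Count every token that is not a stop word.
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts.entrySet().stream()
                // Assumption: elements seen fewer than minWordFrequency times are dropped.
                .filter(entry -> entry.getValue() >= minWordFrequency)
                // Most frequent first; ties are broken alphabetically for a stable order.
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                // Assumption: a limit of 0 (or less) means the vocabulary is not truncated.
                .limit(vocabLimit > 0 ? vocabLimit : Long.MAX_VALUE)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "cat", "the", "dog", "cat", "the", "bird");
        Set<String> stopWords = Collections.singleton("the");
        System.out.println(buildVocabulary(tokens, stopWords, 1, 2));
    }
}
```

Breaking ties alphabetically is a choice made here only to keep the output deterministic; the page does not say how ties are handled.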
Copyright © 2022. All rights reserved.