Class SequenceVectors.Builder<T extends SequenceElement>
- java.lang.Object
-
- org.deeplearning4j.models.sequencevectors.SequenceVectors.Builder<T>
-
- Direct Known Subclasses:
Node2Vec.Builder, Word2Vec.Builder
- Enclosing class:
- SequenceVectors<T extends SequenceElement>
public static class SequenceVectors.Builder<T extends SequenceElement> extends Object
-
-
Field Summary
-
Constructor Summary
Builder()
Builder(@NonNull VectorsConfiguration configuration)
-
Method Summary
Modifier and Type, Method - Description
SequenceVectors.Builder<T> batchSize(int batchSize) - Defines the batchSize option; viable only if iterations > 1.
SequenceVectors<T> build() - Builds a SequenceVectors instance with the defined settings/options.
SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull String algoName) - Sets the specified LearningAlgorithm as the elements learning algorithm.
SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm) - Sets the specified LearningAlgorithm as the elements learning algorithm.
SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable) - Enables/disables periodic vocabulary truncation during construction. Default value: disabled.
SequenceVectors.Builder<T> epochs(int numEpochs) - Defines how many iterations should be done over the whole training corpus during modelling.
SequenceVectors.Builder<T> intersectModel(@NonNull SequenceVectors<T> intersectVectors, boolean lockFactor)
SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator) - Defines the SequenceIterator to be used for model building.
SequenceVectors.Builder<T> iterations(int iterations) - Defines how many iterations should be done over batched sequences.
SequenceVectors.Builder<T> layerSize(int layerSize) - Defines the number of dimensions for the output vectors.
SequenceVectors.Builder<T> learningRate(double learningRate) - Defines the initial learning rate.
SequenceVectors.Builder limitVocabularySize(int limit) - Sets the vocabulary limit during construction.
SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable) - Accepts an externally built WeightLookupTable containing model weights and vocabulary.
SequenceVectors.Builder<T> minLearningRate(double minLearningRate) - Defines the minimum learning rate after decay is applied.
SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency) - Defines the minimal frequency for elements found in the training corpus.
SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils) - Sets the ModelUtils implementation that will be used to access the model.
SequenceVectors.Builder<T> negativeSample(double negative) - Defines the negative sampling value for the skip-gram algorithm.
protected void presetTables() - Creates a new WeightLookupTable and VocabCache if none were set.
SequenceVectors.Builder<T> resetModel(boolean reallyReset) - Defines whether the whole model should be reset before training.
SequenceVectors.Builder<T> sampling(double sampling) - Defines the sub-sampling threshold.
SequenceVectors.Builder<T> seed(long randomSeed) - Sets the seed for the random number generator.
SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull String algoName) - Sets the specified LearningAlgorithm as the sequence learning algorithm.
SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm) - Sets the specified LearningAlgorithm as the sequence learning algorithm.
SequenceVectors.Builder<T> setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners) - Sets the VectorsListeners for this SequenceVectors model.
SequenceVectors.Builder<T> stopWords(@NonNull Collection<T> stopList) - Accepts a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
SequenceVectors.Builder<T> stopWords(@NonNull List<String> stopList) - Accepts a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
SequenceVectors.Builder<T> unknownElement(T element) - Specifies the SequenceElement that will be used as the UNK element, if UNK is used.
SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse) - Deprecated.
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec) - Allows a pre-built WordVectors model (e.g. SkipGram) to be used for DBOW sequence learning.
SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse) - Enables/disables hierarchic softmax.
SequenceVectors.Builder<T> usePreciseMode(boolean reallyUse)
SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse) - If set to true, initial weights for elements/sequences will be derived from the elements themselves.
SequenceVectors.Builder<T> useUnknown(boolean reallyUse) - Specifies whether the UNK word should be used internally.
SequenceVectors.Builder<T> useVariableWindow(int... windows) - Allows a variable window size to be used.
SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache) - Accepts an externally built VocabCache object containing the vocabulary.
SequenceVectors.Builder<T> windowSize(int windowSize) - Sets the window size for skip-gram training.
SequenceVectors.Builder<T> workers(int numWorkers) - Sets the number of worker threads to be used in calculations.
-
-
-
Field Detail
-
vocabCache
protected VocabCache<T extends SequenceElement> vocabCache
-
lookupTable
protected WeightLookupTable<T extends SequenceElement> lookupTable
-
iterator
protected SequenceIterator<T extends SequenceElement> iterator
-
modelUtils
protected ModelUtils<T extends SequenceElement> modelUtils
-
existingVectors
protected WordVectors existingVectors
-
intersectVectors
protected SequenceVectors<T extends SequenceElement> intersectVectors
-
lockFactor
protected boolean lockFactor
-
sampling
protected double sampling
-
negative
protected double negative
-
learningRate
protected double learningRate
-
minLearningRate
protected double minLearningRate
-
minWordFrequency
protected int minWordFrequency
-
iterations
protected int iterations
-
numEpochs
protected int numEpochs
-
layerSize
protected int layerSize
-
window
protected int window
-
hugeModelExpected
protected boolean hugeModelExpected
-
batchSize
protected int batchSize
-
learningRateDecayWords
protected int learningRateDecayWords
-
seed
protected long seed
-
useAdaGrad
protected boolean useAdaGrad
-
resetModel
protected boolean resetModel
-
workers
protected int workers
-
useUnknown
protected boolean useUnknown
-
useHierarchicSoftmax
protected boolean useHierarchicSoftmax
-
variableWindows
protected int[] variableWindows
-
trainSequenceVectors
protected boolean trainSequenceVectors
-
trainElementsVectors
protected boolean trainElementsVectors
-
preciseWeightInit
protected boolean preciseWeightInit
-
stopWords
protected Collection<String> stopWords
-
configuration
protected VectorsConfiguration configuration
-
unknownElement
protected transient T extends SequenceElement unknownElement
-
UNK
protected String UNK
-
STOP
protected String STOP
-
enableScavenger
protected boolean enableScavenger
-
vocabLimit
protected int vocabLimit
-
preciseMode
protected boolean preciseMode
Experimental field. Switches on precise mode for batch operations.
-
elementsLearningAlgorithm
protected ElementsLearningAlgorithm<T extends SequenceElement> elementsLearningAlgorithm
-
sequenceLearningAlgorithm
protected SequenceLearningAlgorithm<T extends SequenceElement> sequenceLearningAlgorithm
-
vectorsListeners
protected Set<VectorsListener<T extends SequenceElement>> vectorsListeners
-
-
Constructor Detail
-
Builder
public Builder()
-
Builder
public Builder(@NonNull VectorsConfiguration configuration)
-
-
Method Detail
-
useExistingWordVectors
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec)
Allows a pre-built WordVectors model (e.g. SkipGram) to be used for DBOW sequence learning. The existing model will be transferred into the new model before training starts. PLEASE NOTE: The existing model has no effect on elements learning algorithms; only sequence learning is affected. PLEASE NOTE: A non-normalized model is recommended here.
- Parameters:
vec - existing WordVectors model
- Returns:
this builder
-
iterate
public SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator)
Defines the SequenceIterator to be used for model building.
- Parameters:
iterator - the SequenceIterator to use for model building
- Returns:
this builder
-
sequenceLearningAlgorithm
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull String algoName)
Sets the specified LearningAlgorithm as the sequence learning algorithm.
- Parameters:
algoName - fully qualified class name
- Returns:
this builder
-
sequenceLearningAlgorithm
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm)
Sets the specified LearningAlgorithm as the sequence learning algorithm.
- Parameters:
algorithm - SequenceLearningAlgorithm implementation
- Returns:
this builder
-
elementsLearningAlgorithm
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull String algoName)
Sets the specified LearningAlgorithm as the elements learning algorithm.
- Parameters:
algoName - fully qualified class name
- Returns:
this builder
-
elementsLearningAlgorithm
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm)
Sets the specified LearningAlgorithm as the elements learning algorithm.
- Parameters:
algorithm - ElementsLearningAlgorithm implementation
- Returns:
this builder
-
batchSize
public SequenceVectors.Builder<T> batchSize(int batchSize)
Defines the batchSize option; viable only if iterations > 1.
- Parameters:
batchSize - the batch size
- Returns:
this builder
-
iterations
public SequenceVectors.Builder<T> iterations(int iterations)
Defines how many iterations should be done over batched sequences.
- Parameters:
iterations - number of iterations over each batch of sequences
- Returns:
this builder
-
epochs
public SequenceVectors.Builder<T> epochs(int numEpochs)
Defines how many iterations should be done over the whole training corpus during modelling.
- Parameters:
numEpochs - number of passes over the whole training corpus
- Returns:
this builder
-
workers
public SequenceVectors.Builder<T> workers(int numWorkers)
Sets the number of worker threads to be used in calculations.
- Parameters:
numWorkers - number of worker threads
- Returns:
this builder
-
useHierarchicSoftmax
public SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse)
Enables/disables hierarchic softmax.
- Parameters:
reallyUse - true to enable hierarchic softmax, false to disable it
- Returns:
this builder
-
useAdaGrad
@Deprecated public SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse)
Deprecated.
Defines whether adaptive gradients (AdaGrad) should be used in calculations.
- Parameters:
reallyUse - true to use adaptive gradients
- Returns:
this builder
-
layerSize
public SequenceVectors.Builder<T> layerSize(int layerSize)
Defines the number of dimensions for the output vectors. Please note: this option has effect only if lookupTable wasn't defined during the building process.
- Parameters:
layerSize - number of dimensions for the output vectors
- Returns:
this builder
-
learningRate
public SequenceVectors.Builder<T> learningRate(double learningRate)
Defines the initial learning rate. Default value is 0.025.
- Parameters:
learningRate - the initial learning rate
- Returns:
this builder
-
minWordFrequency
public SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency)
Defines the minimal frequency for elements found in the training corpus. All elements with a frequency below this threshold will be removed before training. Please note: this method has effect only if the vocabulary is built internally.
- Parameters:
minWordFrequency - the minimal element frequency
- Returns:
this builder
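The thresholding described for minWordFrequency can be illustrated with a minimal plain-Java sketch. It does not use or reproduce the library's internal vocabulary construction; the class MinFrequencyFilter and its methods are hypothetical names introduced only for illustration, and elements are represented by their String labels.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: plain-Java frequency filtering, not library code.
final class MinFrequencyFilter {

    // Counts how often each element label occurs across all sequences.
    static Map<String, Integer> countFrequencies(List<List<String>> sequences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> sequence : sequences) {
            for (String label : sequence) {
                counts.merge(label, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Keeps only elements whose frequency is at least minWordFrequency;
    // elements below the threshold are dropped.
    static Map<String, Integer> filter(Map<String, Integer> counts, int minWordFrequency) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minWordFrequency) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

With the sequences [a, b, a] and [a, c, b] and a threshold of 2, this sketch keeps a and b and drops c.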
-
limitVocabularySize
public SequenceVectors.Builder limitVocabularySize(int limit)
Sets the vocabulary limit during construction. Default value: 0, which means no limit.
- Parameters:
limit - the vocabulary limit; 0 means no limit
- Returns:
this builder
-
minLearningRate
public SequenceVectors.Builder<T> minLearningRate(double minLearningRate)
Defines the minimum learning rate after decay is applied. Default value is 0.01.
- Parameters:
minLearningRate - the minimum learning rate
- Returns:
this builder
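How an initial learning rate and a minimum learning rate interact under decay can be sketched as follows. This assumes the linear schedule of the classic word2vec reference implementation, which is not confirmed by this page as the exact schedule used here; the class LearningRateDecay and its parameter names are hypothetical.

```java
// Illustrative sketch only: a linear decay schedule in the style of classic
// word2vec. The schedule actually used by the library is an assumption here.
final class LearningRateDecay {

    // Decays the rate linearly with training progress and never lets it
    // fall below minLearningRate.
    static double rateAt(double learningRate, double minLearningRate,
                         long elementsProcessed, long totalElements) {
        if (totalElements <= 0) {
            throw new IllegalArgumentException("totalElements must be positive");
        }
        double progress = (double) elementsProcessed / (double) totalElements;
        double decayed = learningRate * (1.0 - progress);
        return Math.max(minLearningRate, decayed);
    }
}
```

Under this sketch, rateAt(0.025, 0.0001, 500, 1000) yields half of the initial rate, and once the decayed value drops below the floor the method returns minLearningRate.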
-
resetModel
public SequenceVectors.Builder<T> resetModel(boolean reallyReset)
Defines whether the whole model should be reset before training. If set to true, the vocabulary and WeightLookupTable will be reset before training and built from scratch.
- Parameters:
reallyReset - true to reset the model before training
- Returns:
this builder
-
vocabCache
public SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache)
Accepts an externally built VocabCache object containing the vocabulary.
- Parameters:
vocabCache - externally built VocabCache
- Returns:
this builder
-
lookupTable
public SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable)
Accepts an externally built WeightLookupTable containing model weights and vocabulary.
- Parameters:
lookupTable - externally built WeightLookupTable
- Returns:
this builder
-
sampling
public SequenceVectors.Builder<T> sampling(double sampling)
Defines the sub-sampling threshold.
- Parameters:
sampling - the sub-sampling threshold
- Returns:
this builder
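As a rough illustration of what a sub-sampling threshold does, the sketch below computes the probability of keeping an element using the formula from the classic word2vec reference implementation. Whether this class applies exactly this formula is an assumption, as is treating a non-positive threshold as "disabled"; SubSampling and keepProbability are hypothetical names.

```java
// Illustrative sketch only: keep-probability in the style of classic word2vec.
// The formula the library actually applies is an assumption here.
final class SubSampling {

    // Returns the probability of keeping one occurrence of an element.
    // Frequent elements get a probability below 1.0; rare ones are always kept.
    static double keepProbability(long elementCount, long totalCount, double sampling) {
        if (elementCount <= 0 || totalCount <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        if (sampling <= 0.0) {
            return 1.0; // assumption: non-positive threshold disables sub-sampling
        }
        double relativeFrequency = (double) elementCount / (double) totalCount;
        double p = (Math.sqrt(relativeFrequency / sampling) + 1.0) * sampling / relativeFrequency;
        return Math.min(1.0, p);
    }
}
```

In this sketch, an element that makes up 10% of the corpus is kept with probability of roughly 0.11 at a threshold of 1e-3, while an element seen once in a million is always kept.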
-
negativeSample
public SequenceVectors.Builder<T> negativeSample(double negative)
Defines the negative sampling value for the skip-gram algorithm.
- Parameters:
negative - the negative sampling value
- Returns:
this builder
-
stopWords
public SequenceVectors.Builder<T> stopWords(@NonNull List<String> stopList)
Accepts a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
- Parameters:
stopList - labels of the elements to exclude
- Returns:
this builder
-
trainElementsRepresentation
public SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
- Parameters:
trainElements - whether element representations should be trained
- Returns:
this builder
-
trainSequencesRepresentation
public SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
-
stopWords
public SequenceVectors.Builder<T> stopWords(@NonNull Collection<T> stopList)
Accepts a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
- Parameters:
stopList - elements to exclude
- Returns:
this builder
-
windowSize
public SequenceVectors.Builder<T> windowSize(int windowSize)
Sets the window size for skip-gram training.
- Parameters:
windowSize - the window size
- Returns:
this builder
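To make the role of a window size concrete, the following sketch lists which positions in a sequence count as context for a given center position under a fixed, symmetric window. This is a simplification: skip-gram implementations often shrink the window randomly per position, and nothing here is taken from the library's code. ContextWindow and contextIndices are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: fixed symmetric context window, not library code.
final class ContextWindow {

    // Returns the indices within windowSize positions of center,
    // clipped to the sequence bounds and excluding center itself.
    static List<Integer> contextIndices(int sequenceLength, int center, int windowSize) {
        List<Integer> indices = new ArrayList<>();
        int start = Math.max(0, center - windowSize);
        int end = Math.min(sequenceLength - 1, center + windowSize);
        for (int i = start; i <= end; i++) {
            if (i != center) {
                indices.add(i);
            }
        }
        return indices;
    }
}
```

For a sequence of length 5, center position 2 and window size 1, the context positions are 1 and 3; at the sequence edges the window is simply truncated.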
-
seed
public SequenceVectors.Builder<T> seed(long randomSeed)
Sets the seed for the random number generator. Please note: this has effect only if the vocabulary and WeightLookupTable are built internally.
- Parameters:
randomSeed - seed for the random number generator
- Returns:
this builder
-
modelUtils
public SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils)
Sets the ModelUtils implementation that will be used to access the model. Methods such as similarity, wordsNearest and accuracy are provided by the user-defined ModelUtils.
- Parameters:
modelUtils - model utils to be used
- Returns:
this builder
-
useUnknown
public SequenceVectors.Builder<T> useUnknown(boolean reallyUse)
Specifies whether the UNK word should be used internally.
- Parameters:
reallyUse - true to use the UNK word internally
- Returns:
this builder
-
unknownElement
public SequenceVectors.Builder<T> unknownElement(@NonNull T element)
Specifies the SequenceElement that will be used as the UNK element, if UNK is used.
- Parameters:
element - the SequenceElement to use as the UNK element
- Returns:
this builder
-
useVariableWindow
public SequenceVectors.Builder<T> useVariableWindow(int... windows)
Allows a variable window size to be used. In this case, every batch is processed using one of the predefined window sizes.
- Parameters:
windows - the predefined window sizes
- Returns:
this builder
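The idea of processing each batch with one of several predefined window sizes can be sketched as below. How the window is chosen per batch is not stated on this page; uniform random selection is an assumption made for illustration, and VariableWindow and pickWindow are hypothetical names.

```java
import java.util.Random;

// Illustrative sketch only: picks one predefined window size per batch.
// Uniform random selection is an assumption, not documented behaviour.
final class VariableWindow {

    // Returns one of the predefined window sizes, chosen uniformly at random.
    static int pickWindow(int[] windows, Random random) {
        if (windows == null || windows.length == 0) {
            throw new IllegalArgumentException("at least one window size is required");
        }
        return windows[random.nextInt(windows.length)];
    }
}
```

Calling pickWindow(new int[] {2, 5, 10}, new Random(42L)) once per batch would give each batch a window of 2, 5 or 10.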
-
usePreciseWeightInit
public SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse)
If set to true, initial weights for elements/sequences will be derived from the elements themselves. However, this implies an additional cycle through the input iterator. Default value: FALSE.
- Parameters:
reallyUse - true to derive initial weights from the elements themselves
- Returns:
this builder
-
usePreciseMode
public SequenceVectors.Builder<T> usePreciseMode(boolean reallyUse)
-
presetTables
protected void presetTables()
Creates a new WeightLookupTable and VocabCache if none were set.
-
setVectorsListeners
public SequenceVectors.Builder<T> setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners)
Sets the VectorsListeners for this SequenceVectors model.
- Parameters:
listeners - the VectorsListeners to set
- Returns:
this builder
-
enableScavenger
public SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable)
Enables/disables periodic vocabulary truncation during construction. Default value: disabled.
- Parameters:
reallyEnable - true to enable periodic vocabulary truncation
- Returns:
this builder
-
intersectModel
public SequenceVectors.Builder<T> intersectModel(@NonNull SequenceVectors<T> intersectVectors, boolean lockFactor)
-
build
public SequenceVectors<T> build()
Builds a SequenceVectors instance with the defined settings/options.
- Returns:
the configured SequenceVectors instance
-
-