Class SequenceVectors.Builder<T extends SequenceElement>
- java.lang.Object
-
- org.deeplearning4j.models.sequencevectors.SequenceVectors.Builder<T>
-
- Direct Known Subclasses:
Node2Vec.Builder, Word2Vec.Builder
- Enclosing class:
- SequenceVectors<T extends SequenceElement>
public static class SequenceVectors.Builder<T extends SequenceElement> extends Object
-
-
Field Summary
-
Constructor Summary
Constructors
Builder()
Builder(@NonNull VectorsConfiguration configuration)
-
Method Summary
All Methods, Instance Methods, Concrete Methods, Deprecated Methods
Modifier and Type, Method: Description
SequenceVectors.Builder<T> batchSize(int batchSize): Defines the batchSize option, applicable only if iterations > 1.
SequenceVectors<T> build(): Builds a SequenceVectors instance with the defined settings/options.
SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull String algoName): Sets a specific LearningAlgorithm as the Elements Learning Algorithm.
SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm): Sets a specific LearningAlgorithm as the Elements Learning Algorithm.
SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable): Enables/disables periodic vocab truncation during construction. Default value: disabled.
SequenceVectors.Builder<T> epochs(int numEpochs): Defines how many iterations should be done over the whole training corpus during modelling.
SequenceVectors.Builder<T> intersectModel(@NonNull SequenceVectors<T> intersectVectors, boolean lockFactor)
SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator): Defines the SequenceIterator to be used for model building.
SequenceVectors.Builder<T> iterations(int iterations): Defines how many iterations should be done over batched sequences.
SequenceVectors.Builder<T> layerSize(int layerSize): Defines the number of dimensions for the output vectors.
SequenceVectors.Builder<T> learningRate(double learningRate): Defines the initial learning rate.
SequenceVectors.Builder limitVocabularySize(int limit): Sets the vocabulary limit during construction.
SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable): You can pass an externally built WeightLookupTable containing model weights and vocabulary.
SequenceVectors.Builder<T> minLearningRate(double minLearningRate): Defines the minimum learning rate after decay is applied.
SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency): Defines the minimal element frequency for elements found in the training corpus.
SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils): ModelUtils implementation that will be used to access the model.
SequenceVectors.Builder<T> negativeSample(double negative): Defines the negative sampling value for the skip-gram algorithm.
protected void presetTables(): Creates a new WeightLookupTable and VocabCache if none were set.
SequenceVectors.Builder<T> resetModel(boolean reallyReset): Defines whether the whole model should be reset before training.
SequenceVectors.Builder<T> sampling(double sampling): Defines the sub-sampling threshold.
SequenceVectors.Builder<T> seed(long randomSeed): Sets the seed for the random number generator.
SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull String algoName): Sets a specific LearningAlgorithm as the Sequence Learning Algorithm.
SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm): Sets a specific LearningAlgorithm as the Sequence Learning Algorithm.
SequenceVectors.Builder<T> setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners): Sets the VectorsListeners for this SequenceVectors model.
SequenceVectors.Builder<T> stopWords(@NonNull Collection<T> stopList): You can provide a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
SequenceVectors.Builder<T> stopWords(@NonNull List<String> stopList): You can provide a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
SequenceVectors.Builder<T> unknownElement(T element): Allows you to specify the SequenceElement that will be used as the UNK element, if UNK is used.
SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse): Deprecated.
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec): Allows you to use a pre-built WordVectors model (e.g. SkipGram) for DBOW sequence learning.
SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse): Enables/disables hierarchic softmax.
SequenceVectors.Builder<T> usePreciseMode(boolean reallyUse)
SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse): If set to true, initial weights for elements/sequences will be derived from the elements themselves.
SequenceVectors.Builder<T> useUnknown(boolean reallyUse): Allows you to specify whether the UNK word should be used internally.
SequenceVectors.Builder<T> useVariableWindow(int... windows): Allows the use of a variable window size.
SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache): You can pass an externally built vocabCache object containing the vocabulary.
SequenceVectors.Builder<T> windowSize(int windowSize): Sets the window size for skip-gram training.
SequenceVectors.Builder<T> workers(int numWorkers): Sets the number of worker threads to be used in calculations.
-
-
-
Field Detail
-
vocabCache
protected VocabCache<T extends SequenceElement> vocabCache
-
lookupTable
protected WeightLookupTable<T extends SequenceElement> lookupTable
-
iterator
protected SequenceIterator<T extends SequenceElement> iterator
-
modelUtils
protected ModelUtils<T extends SequenceElement> modelUtils
-
existingVectors
protected WordVectors existingVectors
-
intersectVectors
protected SequenceVectors<T extends SequenceElement> intersectVectors
-
lockFactor
protected boolean lockFactor
-
sampling
protected double sampling
-
negative
protected double negative
-
learningRate
protected double learningRate
-
minLearningRate
protected double minLearningRate
-
minWordFrequency
protected int minWordFrequency
-
iterations
protected int iterations
-
numEpochs
protected int numEpochs
-
layerSize
protected int layerSize
-
window
protected int window
-
hugeModelExpected
protected boolean hugeModelExpected
-
batchSize
protected int batchSize
-
learningRateDecayWords
protected int learningRateDecayWords
-
seed
protected long seed
-
useAdaGrad
protected boolean useAdaGrad
-
resetModel
protected boolean resetModel
-
workers
protected int workers
-
useUnknown
protected boolean useUnknown
-
useHierarchicSoftmax
protected boolean useHierarchicSoftmax
-
variableWindows
protected int[] variableWindows
-
trainSequenceVectors
protected boolean trainSequenceVectors
-
trainElementsVectors
protected boolean trainElementsVectors
-
preciseWeightInit
protected boolean preciseWeightInit
-
stopWords
protected Collection<String> stopWords
-
configuration
protected VectorsConfiguration configuration
-
unknownElement
protected transient T extends SequenceElement unknownElement
-
UNK
protected String UNK
-
STOP
protected String STOP
-
enableScavenger
protected boolean enableScavenger
-
vocabLimit
protected int vocabLimit
-
preciseMode
protected boolean preciseMode
Experimental field. Switches on precise mode for batch operations.
-
elementsLearningAlgorithm
protected ElementsLearningAlgorithm<T extends SequenceElement> elementsLearningAlgorithm
-
sequenceLearningAlgorithm
protected SequenceLearningAlgorithm<T extends SequenceElement> sequenceLearningAlgorithm
-
vectorsListeners
protected Set<VectorsListener<T extends SequenceElement>> vectorsListeners
-
-
Constructor Detail
-
Builder
public Builder()
-
Builder
public Builder(@NonNull VectorsConfiguration configuration)
-
-
Method Detail
-
useExistingWordVectors
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec)
Allows you to use a pre-built WordVectors model (e.g. SkipGram) for DBOW sequence learning. The existing model will be transferred into the new model before training starts. PLEASE NOTE: The existing model has no effect on elements learning algorithms. Only sequence learning is affected. PLEASE NOTE: A non-normalized model is recommended here.
- Parameters:
vec - existing WordVectors model
-
iterate
public SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator)
Defines the SequenceIterator to be used for model building.
- Parameters:
iterator
-
sequenceLearningAlgorithm
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull String algoName)
Sets a specific LearningAlgorithm as the Sequence Learning Algorithm.
- Parameters:
algoName - fully qualified class name
-
sequenceLearningAlgorithm
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm)
Sets a specific LearningAlgorithm as the Sequence Learning Algorithm.
- Parameters:
algorithm - SequenceLearningAlgorithm implementation
-
elementsLearningAlgorithm
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull String algoName)
Sets a specific LearningAlgorithm as the Elements Learning Algorithm.
- Parameters:
algoName - fully qualified class name
-
elementsLearningAlgorithm
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm)
Sets a specific LearningAlgorithm as the Elements Learning Algorithm.
- Parameters:
algorithm - ElementsLearningAlgorithm implementation
-
batchSize
public SequenceVectors.Builder<T> batchSize(int batchSize)
Defines the batchSize option, applicable only if iterations > 1.
- Parameters:
batchSize
-
iterations
public SequenceVectors.Builder<T> iterations(int iterations)
Defines how many iterations should be done over batched sequences.
- Parameters:
iterations
-
epochs
public SequenceVectors.Builder<T> epochs(int numEpochs)
Defines how many iterations (epochs) should be done over the whole training corpus during modelling.
- Parameters:
numEpochs
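The relationship between batchSize, iterations, and epochs can be illustrated with a small counting sketch. It assumes a loop nesting inferred from the descriptions on this page (epochs over the whole corpus, iterations over each batch); the class TrainingLoopSketch and its method are hypothetical and are not part of DL4J.

```java
/** Hypothetical illustration, not DL4J code: counts how often a batch is processed. */
final class TrainingLoopSketch {

    /**
     * Assumed nesting: each epoch walks the whole corpus in batches,
     * and each batch is processed `iterations` times.
     */
    static int countBatchPasses(int totalSequences, int batchSize, int iterations, int epochs) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int passes = 0;
        for (int epoch = 0; epoch < epochs; epoch++) {
            // One pass over the whole training corpus.
            for (int start = 0; start < totalSequences; start += batchSize) {
                // Repeated passes over the current batch.
                for (int iteration = 0; iteration < iterations; iteration++) {
                    passes++;
                }
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        // 10 sequences in batches of 4 give 3 batches per epoch.
        System.out.println(countBatchPasses(10, 4, 2, 3));
    }
}
```

Under this assumption, raising epochs multiplies the work over the whole corpus, while raising iterations multiplies the work per batch.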
-
workers
public SequenceVectors.Builder<T> workers(int numWorkers)
Sets the number of worker threads to be used in calculations.
- Parameters:
numWorkers
-
useHierarchicSoftmax
public SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse)
Enables/disables hierarchic softmax.
- Parameters:
reallyUse
-
useAdaGrad
@Deprecated public SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse)
Deprecated. Defines whether Adaptive Gradients should be used in calculations.
- Parameters:
reallyUse
-
layerSize
public SequenceVectors.Builder<T> layerSize(int layerSize)
Defines the number of dimensions for the output vectors. Please note: this option has effect only if lookupTable wasn't defined during the building process.
- Parameters:
layerSize
-
learningRate
public SequenceVectors.Builder<T> learningRate(double learningRate)
Defines the initial learning rate. Default value is 0.025.
- Parameters:
learningRate
-
minWordFrequency
public SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency)
Defines the minimal element frequency for elements found in the training corpus. All elements with frequency below this threshold will be removed before training. Please note: this method has effect only if the vocabulary is built internally.
- Parameters:
minWordFrequency
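As a rough illustration of the threshold described above, the following sketch counts elements and drops those seen fewer than minWordFrequency times. MinFrequencyFilterSketch is a hypothetical helper written for this page; it does not show how DL4J builds its vocabulary internally.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration, not DL4J code: frequency-based vocabulary pruning. */
final class MinFrequencyFilterSketch {

    /** Counts each element and removes those with a count below minWordFrequency. */
    static Map<String, Integer> filterByFrequency(List<String> corpus, int minWordFrequency) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String element : corpus) {
            counts.merge(element, 1, Integer::sum);
        }
        // Elements strictly below the threshold are removed; equal counts are kept.
        counts.values().removeIf(count -> count < minWordFrequency);
        return counts;
    }

    public static void main(String[] args) {
        List<String> corpus = java.util.Arrays.asList("a", "b", "a", "c", "a", "b");
        System.out.println(filterByFrequency(corpus, 2));
    }
}
```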
-
limitVocabularySize
public SequenceVectors.Builder limitVocabularySize(int limit)
Sets the vocabulary limit during construction. Default value: 0, which means no limit.
- Parameters:
limit
-
minLearningRate
public SequenceVectors.Builder<T> minLearningRate(double minLearningRate)
Defines the minimum learning rate after decay is applied. Default value is 0.01.
- Parameters:
minLearningRate
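How learningRate and minLearningRate interact can be sketched as a decay with a floor. The linear schedule below is an assumption chosen for illustration (this page does not state which decay schedule is used), and LearningRateDecaySketch is a hypothetical helper, not a DL4J class.

```java
/** Hypothetical illustration, not DL4J code: linear learning-rate decay with a floor. */
final class LearningRateDecaySketch {

    /**
     * Decays the rate linearly with training progress (assumed schedule)
     * and never returns less than minLearningRate.
     */
    static double decayedRate(double learningRate, double minLearningRate,
                              long elementsProcessed, long totalElements) {
        if (totalElements <= 0) {
            throw new IllegalArgumentException("totalElements must be positive");
        }
        double progress = Math.min(1.0, (double) elementsProcessed / totalElements);
        double decayed = learningRate * (1.0 - progress);
        return Math.max(minLearningRate, decayed);
    }

    public static void main(String[] args) {
        // Uses the default values documented on this page: 0.025 and 0.01.
        for (long done = 0; done <= 1_000; done += 250) {
            System.out.println(done + " -> " + decayedRate(0.025, 0.01, done, 1_000));
        }
    }
}
```

With these values the rate starts at the initial learning rate and stops shrinking once it reaches the minimum.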
-
resetModel
public SequenceVectors.Builder<T> resetModel(boolean reallyReset)
Defines whether the whole model should be reset before training. If set to true, the vocabulary and WeightLookupTable will be reset before training and built from scratch.
- Parameters:
reallyReset
-
vocabCache
public SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache)
You can pass an externally built vocabCache object containing the vocabulary.
- Parameters:
vocabCache
-
lookupTable
public SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable)
You can pass an externally built WeightLookupTable containing model weights and vocabulary.
- Parameters:
lookupTable
-
sampling
public SequenceVectors.Builder<T> sampling(double sampling)
Defines the sub-sampling threshold.
- Parameters:
sampling
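To give a feel for what a sub-sampling threshold does, the sketch below uses the keep-probability formula from the original word2vec C tool. Whether this class applies exactly the same formula is an assumption, not something this page confirms; SubSamplingSketch and its parameter names are hypothetical.

```java
/** Hypothetical illustration, not DL4J code: word2vec-style sub-sampling. */
final class SubSamplingSketch {

    /**
     * Probability of keeping an element, given its relative corpus frequency
     * (count / total) and the sub-sampling threshold. Formula assumed from the
     * original word2vec C tool: (sqrt(f / t) + 1) * t / f, capped at 1.
     */
    static double keepProbability(double relativeFrequency, double threshold) {
        if (threshold <= 0.0 || relativeFrequency <= 0.0) {
            return 1.0; // treated here as "sub-sampling disabled": always keep
        }
        double probability =
                (Math.sqrt(relativeFrequency / threshold) + 1.0) * (threshold / relativeFrequency);
        return Math.min(1.0, probability);
    }

    public static void main(String[] args) {
        System.out.println(keepProbability(0.01, 0.001));   // frequent element, often dropped
        System.out.println(keepProbability(0.0001, 0.001)); // rare element, always kept
    }
}
```

Under this formula, elements whose frequency is well above the threshold are discarded more often, while rare elements are always kept.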
-
negativeSample
public SequenceVectors.Builder<T> negativeSample(double negative)
Defines the negative sampling value for the skip-gram algorithm.
- Parameters:
negative
-
stopWords
public SequenceVectors.Builder<T> stopWords(@NonNull List<String> stopList)
You can provide a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
- Parameters:
stopList
-
trainElementsRepresentation
public SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
- Parameters:
trainElements
-
trainSequencesRepresentation
public SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
-
stopWords
public SequenceVectors.Builder<T> stopWords(@NonNull Collection<T> stopList)
You can provide a collection of objects to be ignored and excluded from the model. Please note: object labels and hashCode will be used for filtering.
- Parameters:
stopList
-
windowSize
public SequenceVectors.Builder<T> windowSize(int windowSize)
Sets the window size for skip-gram training.
- Parameters:
windowSize
-
seed
public SequenceVectors.Builder<T> seed(long randomSeed)
Sets the seed for the random number generator. Please note: this has effect only if the vocabulary and WeightLookupTable are built internally.
- Parameters:
randomSeed
-
modelUtils
public SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils)
ModelUtils implementation that will be used to access the model. Methods such as similarity, wordsNearest, and accuracy are provided by the user-defined ModelUtils.
- Parameters:
modelUtils - model utils to be used
-
useUnknown
public SequenceVectors.Builder<T> useUnknown(boolean reallyUse)
Allows you to specify whether the UNK word should be used internally.
- Parameters:
reallyUse
-
unknownElement
public SequenceVectors.Builder<T> unknownElement(@NonNull T element)
Allows you to specify the SequenceElement that will be used as the UNK element, if UNK is used.
- Parameters:
element
-
useVariableWindow
public SequenceVectors.Builder<T> useVariableWindow(int... windows)
Allows the use of a variable window size. In this case, every batch is processed using one of the predefined window sizes.
- Parameters:
windows
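The idea of processing each batch with one of several predefined window sizes can be sketched as follows. The uniform random choice is an assumption made for illustration, since this page does not say how a window is picked; VariableWindowSketch is a hypothetical helper, not a DL4J class.

```java
import java.util.Random;

/** Hypothetical illustration, not DL4J code: choosing a window size per batch. */
final class VariableWindowSketch {

    /** Picks one of the predefined window sizes uniformly at random (assumed policy). */
    static int pickWindow(int[] windows, Random random) {
        if (windows == null || windows.length == 0) {
            throw new IllegalArgumentException("at least one window size is required");
        }
        return windows[random.nextInt(windows.length)];
    }

    public static void main(String[] args) {
        int[] windows = {5, 7, 9};       // same shape as the varargs passed to useVariableWindow
        Random random = new Random(42L); // fixed seed so repeated runs pick the same sizes
        for (int batch = 0; batch < 5; batch++) {
            System.out.println("batch " + batch + " uses window " + pickWindow(windows, random));
        }
    }
}
```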
-
usePreciseWeightInit
public SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse)
If set to true, initial weights for elements/sequences will be derived from the elements themselves. However, this implies an additional cycle through the input iterator. Default value: FALSE.
- Parameters:
reallyUse
-
usePreciseMode
public SequenceVectors.Builder<T> usePreciseMode(boolean reallyUse)
-
presetTables
protected void presetTables()
Creates a new WeightLookupTable and VocabCache if none were set.
-
setVectorsListeners
public SequenceVectors.Builder<T> setVectorsListeners(@NonNull Collection<VectorsListener<T>> listeners)
Sets the VectorsListeners for this SequenceVectors model.
- Parameters:
listeners
-
enableScavenger
public SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable)
Enables/disables periodic vocab truncation during construction. Default value: disabled.
- Parameters:
reallyEnable
-
intersectModel
public SequenceVectors.Builder<T> intersectModel(@NonNull SequenceVectors<T> intersectVectors, boolean lockFactor)
-
build
public SequenceVectors<T> build()
Builds a SequenceVectors instance with the defined settings/options.
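Because nearly every setter on this page returns SequenceVectors.Builder<T>, calls can be chained and finished with build(). The sketch below shows that fluent pattern in a self-contained form. MiniVectorsBuilder and MiniVectorsConfig are hypothetical stand-ins that borrow a few method names from this page; they are not DL4J classes, and apart from the documented learning rate of 0.025 their default values are illustrative assumptions.

```java
/** Hypothetical stand-in for a built model configuration; not a DL4J class. */
final class MiniVectorsConfig {
    final int layerSize;
    final int windowSize;
    final double learningRate;

    MiniVectorsConfig(int layerSize, int windowSize, double learningRate) {
        this.layerSize = layerSize;
        this.windowSize = windowSize;
        this.learningRate = learningRate;
    }
}

/** Hypothetical fluent builder mirroring a few setter names from this page. */
final class MiniVectorsBuilder {
    private int layerSize = 100;          // illustrative default (assumption)
    private int windowSize = 5;           // illustrative default (assumption)
    private double learningRate = 0.025;  // default documented for learningRate

    MiniVectorsBuilder layerSize(int layerSize) {
        this.layerSize = layerSize;
        return this; // returning the builder is what enables chaining
    }

    MiniVectorsBuilder windowSize(int windowSize) {
        this.windowSize = windowSize;
        return this;
    }

    MiniVectorsBuilder learningRate(double learningRate) {
        this.learningRate = learningRate;
        return this;
    }

    /** Freezes the collected settings into an immutable result. */
    MiniVectorsConfig build() {
        return new MiniVectorsConfig(layerSize, windowSize, learningRate);
    }

    public static void main(String[] args) {
        MiniVectorsConfig config = new MiniVectorsBuilder()
                .layerSize(150)
                .windowSize(7)
                .build();
        System.out.println(config.layerSize + " " + config.windowSize + " " + config.learningRate);
    }
}
```

Settings that are never called keep their defaults, which is why only the options that differ need to appear in the chain.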
-
-