Package org.deeplearning4j.iterator
Class CnnSentenceDataSetIterator.Builder
java.lang.Object
  org.deeplearning4j.iterator.CnnSentenceDataSetIterator.Builder

Enclosing class: CnnSentenceDataSetIterator
public static class CnnSentenceDataSetIterator.Builder extends Object
-
-
Constructor Summary
Builder()
Deprecated. Due to an old default that will be changed in the future.
-
Builder(@NonNull CnnSentenceDataSetIterator.Format format)
-
Method Summary
CnnSentenceDataSetIterator build()
-
CnnSentenceDataSetIterator.Builder dataSetPreProcessor(org.nd4j.linalg.dataset.api.DataSetPreProcessor dataSetPreProcessor)
Optional DataSetPreProcessor
-
CnnSentenceDataSetIterator.Builder maxSentenceLength(int maxSentenceLength)
Maximum sentence/document length.
-
CnnSentenceDataSetIterator.Builder minibatchSize(int minibatchSize)
Minibatch size to use for the DataSetIterator
-
CnnSentenceDataSetIterator.Builder sentenceProvider(LabeledSentenceProvider labeledSentenceProvider)
Specify how the (labelled) sentences / documents should be provided
-
CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareDocumentIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareSentenceIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
CnnSentenceDataSetIterator.Builder sentencesAlongHeight(boolean sentencesAlongHeight)
If true (default): output features data with shape [minibatchSize, 1, maxSentenceLength, wordVectorSize]
If false: output features with shape [minibatchSize, 1, wordVectorSize, maxSentenceLength]
-
CnnSentenceDataSetIterator.Builder tokenizerFactory(TokenizerFactory tokenizerFactory)
The TokenizerFactory that should be used.
-
CnnSentenceDataSetIterator.Builder unknownWordHandling(CnnSentenceDataSetIterator.UnknownWordHandling unknownWordHandling)
Specify how unknown words (those that don't have a word vector in the provided WordVectors instance) should be handled.
-
CnnSentenceDataSetIterator.Builder useNormalizedWordVectors(boolean useNormalizedWordVectors)
Whether normalized word vectors should be used.
-
CnnSentenceDataSetIterator.Builder wordVectors(WordVectors wordVectors)
Provide the WordVectors instance that should be used for training
-
-
-
Constructor Detail
-
Builder
@Deprecated public Builder()
Deprecated. Due to an old default that will be changed in the future. Use Builder(Format) to specify the CnnSentenceDataSetIterator.Format of the activations.
-
Builder
public Builder(@NonNull CnnSentenceDataSetIterator.Format format)
Parameters:
format - The format to use for the features, i.e., for 1D or 2D CNNs
-
-
Method Detail
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabeledSentenceProvider labeledSentenceProvider)
Specify how the (labelled) sentences / documents should be provided
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareDocumentIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareSentenceIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
wordVectors
public CnnSentenceDataSetIterator.Builder wordVectors(WordVectors wordVectors)
Provide the WordVectors instance that should be used for training
-
tokenizerFactory
public CnnSentenceDataSetIterator.Builder tokenizerFactory(TokenizerFactory tokenizerFactory)
The TokenizerFactory that should be used. Defaults to DefaultTokenizerFactory.
-
unknownWordHandling
public CnnSentenceDataSetIterator.Builder unknownWordHandling(CnnSentenceDataSetIterator.UnknownWordHandling unknownWordHandling)
Specify how unknown words (those that don't have a word vector in the provided WordVectors instance) should be handled. Default: remove/ignore unknown words.
-
minibatchSize
public CnnSentenceDataSetIterator.Builder minibatchSize(int minibatchSize)
Minibatch size to use for the DataSetIterator
-
useNormalizedWordVectors
public CnnSentenceDataSetIterator.Builder useNormalizedWordVectors(boolean useNormalizedWordVectors)
Whether normalized word vectors should be used. Default: true
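This page does not define "normalized". A minimal sketch, assuming it means each word vector is scaled to unit L2 length, is shown here in plain Java. The class and method names (UnitNormSketch, l2Normalize) are hypothetical and are not part of the library.

```java
import java.util.Arrays;

// Illustrative only: assumes "normalized" means unit L2 norm.
// UnitNormSketch and l2Normalize are hypothetical names, not library API.
final class UnitNormSketch {

    /** Returns a copy of v scaled to unit L2 norm; an all-zero vector is returned unchanged. */
    static double[] l2Normalize(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        double[] out = Arrays.copyOf(v, v.length);
        if (norm == 0.0) {
            return out; // avoid division by zero
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= norm;
        }
        return out;
    }
}
```

Under that assumption, the vector {3.0, 4.0} has norm 5 and becomes {0.6, 0.8}.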
-
maxSentenceLength
public CnnSentenceDataSetIterator.Builder maxSentenceLength(int maxSentenceLength)
Maximum sentence/document length. If sentences exceed this, they will be truncated to this length by taking the first 'maxSentenceLength' known words.
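The truncation rule described above can be sketched in plain Java as follows. This is an illustration of "take the first maxSentenceLength known words", not the library's implementation. The class name, the method name, and the use of a Set<String> to stand in for the vocabulary of the WordVectors instance are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: TruncationSketch and firstKnownTokens are hypothetical names.
// A Set<String> stands in for "words that have a vector in the WordVectors instance".
final class TruncationSketch {

    /** Keeps tokens that are in the vocabulary, in order, stopping once maxSentenceLength are kept. */
    static List<String> firstKnownTokens(List<String> tokens, Set<String> vocabulary, int maxSentenceLength) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (kept.size() >= maxSentenceLength) {
                break; // sentence is truncated here
            }
            if (vocabulary.contains(token)) {
                kept.add(token); // unknown tokens are skipped and do not count toward the limit
            }
        }
        return kept;
    }
}
```

In this sketch unknown words are dropped before counting, which matches the documented default of removing unknown words; a different unknownWordHandling setting may behave differently.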
-
sentencesAlongHeight
public CnnSentenceDataSetIterator.Builder sentencesAlongHeight(boolean sentencesAlongHeight)
If true (default): output features data with shape [minibatchSize, 1, maxSentenceLength, wordVectorSize]
If false: output features with shape [minibatchSize, 1, wordVectorSize, maxSentenceLength]
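As a quick way to see how the flag swaps the last two dimensions, the following plain-Java sketch computes the documented feature shape for either setting. FeatureShapeSketch and featureShape are hypothetical names used only for illustration, and the sketch simply restates the two shapes given above.

```java
// Illustrative only: FeatureShapeSketch and featureShape are hypothetical names.
// The returned arrays restate the shapes documented for sentencesAlongHeight.
final class FeatureShapeSketch {

    /** Returns the documented 4D feature shape for the given flag and sizes. */
    static long[] featureShape(boolean sentencesAlongHeight,
                               int minibatchSize,
                               int maxSentenceLength,
                               int wordVectorSize) {
        return sentencesAlongHeight
                ? new long[]{minibatchSize, 1, maxSentenceLength, wordVectorSize}
                : new long[]{minibatchSize, 1, wordVectorSize, maxSentenceLength};
    }
}
```

For example, with a minibatch of 32, a maximum length of 256 and 300-dimensional word vectors, the true setting gives [32, 1, 256, 300] and the false setting gives [32, 1, 300, 256].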
-
dataSetPreProcessor
public CnnSentenceDataSetIterator.Builder dataSetPreProcessor(org.nd4j.linalg.dataset.api.DataSetPreProcessor dataSetPreProcessor)
Optional DataSetPreProcessor
-
build
public CnnSentenceDataSetIterator build()
-
-