Package org.deeplearning4j.iterator
Class CnnSentenceDataSetIterator.Builder
- java.lang.Object
  - org.deeplearning4j.iterator.CnnSentenceDataSetIterator.Builder
- Enclosing class: CnnSentenceDataSetIterator
public static class CnnSentenceDataSetIterator.Builder extends Object
-
-
Constructor Summary
Constructors
Builder()
    Deprecated. Due to old default, that will be changed in the future.
Builder(@NonNull CnnSentenceDataSetIterator.Format format)
-
Method Summary
Modifier and Type, Method, Description
CnnSentenceDataSetIterator build()
CnnSentenceDataSetIterator.Builder dataSetPreProcessor(org.nd4j.linalg.dataset.api.DataSetPreProcessor dataSetPreProcessor)
    Optional DataSetPreProcessor
CnnSentenceDataSetIterator.Builder maxSentenceLength(int maxSentenceLength)
    Maximum sentence/document length.
CnnSentenceDataSetIterator.Builder minibatchSize(int minibatchSize)
    Minibatch size to use for the DataSetIterator
CnnSentenceDataSetIterator.Builder sentenceProvider(LabeledSentenceProvider labeledSentenceProvider)
    Specify how the (labelled) sentences / documents should be provided
CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareDocumentIterator iterator, @NonNull List<String> labels)
    Specify how the (labelled) sentences / documents should be provided
CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareIterator iterator, @NonNull List<String> labels)
    Specify how the (labelled) sentences / documents should be provided
CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareSentenceIterator iterator, @NonNull List<String> labels)
    Specify how the (labelled) sentences / documents should be provided
CnnSentenceDataSetIterator.Builder sentencesAlongHeight(boolean sentencesAlongHeight)
    If true (default): output features data with shape [minibatchSize, 1, maxSentenceLength, wordVectorSize]
    If false: output features with shape [minibatchSize, 1, wordVectorSize, maxSentenceLength]
CnnSentenceDataSetIterator.Builder tokenizerFactory(TokenizerFactory tokenizerFactory)
    The TokenizerFactory that should be used.
CnnSentenceDataSetIterator.Builder unknownWordHandling(CnnSentenceDataSetIterator.UnknownWordHandling unknownWordHandling)
    Specify how unknown words (those that don't have a word vector in the provided WordVectors instance) should be handled.
CnnSentenceDataSetIterator.Builder useNormalizedWordVectors(boolean useNormalizedWordVectors)
    Whether normalized word vectors should be used.
CnnSentenceDataSetIterator.Builder wordVectors(WordVectors wordVectors)
    Provide the WordVectors instance that should be used for training
-
-
-
Constructor Detail
-
Builder
@Deprecated public Builder()
Deprecated. Due to old default, that will be changed in the future. Use Builder(Format) to specify the CnnSentenceDataSetIterator.Format of the activations
-
Builder
public Builder(@NonNull CnnSentenceDataSetIterator.Format format)
Parameters:
format - The format to use for the features, i.e., for 1D or 2D CNNs
-
-
Method Detail
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabeledSentenceProvider labeledSentenceProvider)
Specify how the (labelled) sentences / documents should be provided
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareDocumentIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
sentenceProvider
public CnnSentenceDataSetIterator.Builder sentenceProvider(LabelAwareSentenceIterator iterator, @NonNull List<String> labels)
Specify how the (labelled) sentences / documents should be provided
-
wordVectors
public CnnSentenceDataSetIterator.Builder wordVectors(WordVectors wordVectors)
Provide the WordVectors instance that should be used for training
-
tokenizerFactory
public CnnSentenceDataSetIterator.Builder tokenizerFactory(TokenizerFactory tokenizerFactory)
The TokenizerFactory that should be used. Defaults to DefaultTokenizerFactory
-
unknownWordHandling
public CnnSentenceDataSetIterator.Builder unknownWordHandling(CnnSentenceDataSetIterator.UnknownWordHandling unknownWordHandling)
Specify how unknown words (those that don't have a word vector in the provided WordVectors instance) should be handled. Default: remove/ignore unknown words.
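The difference between removing unknown words and keeping a stand-in for them can be illustrated with plain Java collections. This is a minimal sketch, not the iterator's implementation: the Handling enum, its constant names, and the PLACEHOLDER token are hypothetical and are not taken from CnnSentenceDataSetIterator.UnknownWordHandling. The vocabulary set stands in for "words that have a vector in the provided WordVectors instance".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class UnknownWordSketch {
    // Hypothetical modes for illustration only; not the library's enum constants.
    enum Handling { REMOVE, USE_PLACEHOLDER }

    // Hypothetical stand-in token for words without a vector.
    static final String PLACEHOLDER = "<UNK>";

    // Returns the tokens that would be mapped to vectors under the given mode.
    static List<String> apply(List<String> tokens, Set<String> vocabulary, Handling handling) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (vocabulary.contains(token)) {
                out.add(token);                 // known word: keep as-is
            } else if (handling == Handling.USE_PLACEHOLDER) {
                out.add(PLACEHOLDER);           // unknown word: keep its position
            }
            // Handling.REMOVE: unknown word is dropped, so the sentence gets shorter
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> vocabulary = Set.of("the", "cat", "sat");
        List<String> tokens = List.of("the", "cat", "zzz", "sat");
        System.out.println(apply(tokens, vocabulary, Handling.REMOVE));
        System.out.println(apply(tokens, vocabulary, Handling.USE_PLACEHOLDER));
    }
}
```

With removal, the sentence length seen by the network depends on how many words are in the vocabulary; with a placeholder, word positions are preserved.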
-
minibatchSize
public CnnSentenceDataSetIterator.Builder minibatchSize(int minibatchSize)
Minibatch size to use for the DataSetIterator
-
useNormalizedWordVectors
public CnnSentenceDataSetIterator.Builder useNormalizedWordVectors(boolean useNormalizedWordVectors)
Whether normalized word vectors should be used. Default: true
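The text above does not say which normalization is meant. Assuming it refers to scaling each word vector to unit L2 length (a common convention for word embeddings, but an assumption here), the operation can be sketched as follows. The class and method names are hypothetical.

```java
final class NormalizeSketch {
    // Scales v to unit L2 length; assumes "normalized" means unit L2 norm.
    static double[] toUnitLength(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm == 0.0) {
            return v.clone();                   // zero vector: nothing to scale
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] unit = toUnitLength(new double[] {3.0, 4.0});
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```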
-
maxSentenceLength
public CnnSentenceDataSetIterator.Builder maxSentenceLength(int maxSentenceLength)
Maximum sentence/document length. If sentences exceed this, they will be truncated to this length by taking the first 'maxSentenceLength' known words.
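The truncation rule described above ("the first 'maxSentenceLength' known words") can be sketched in plain Java. This is an illustration under the assumption that unknown words are skipped before counting, as with the default unknown-word handling; the class name, method name, and vocabulary set are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class TruncationSketch {
    // Keeps at most maxSentenceLength tokens, counting only tokens found in the vocabulary.
    static List<String> firstKnownTokens(List<String> tokens, Set<String> vocabulary, int maxSentenceLength) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (kept.size() >= maxSentenceLength) {
                break;                          // limit reached: the rest is truncated
            }
            if (vocabulary.contains(token)) {
                kept.add(token);                // only known words count toward the limit
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> vocabulary = Set.of("the", "cat", "sat", "mat");
        List<String> tokens = List.of("the", "cat", "zzz", "sat", "on", "the", "mat");
        System.out.println(firstKnownTokens(tokens, vocabulary, 3));
    }
}
```

Note that the unknown token "zzz" does not consume one of the three slots in this sketch, so the kept tokens are "the", "cat", "sat".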
-
sentencesAlongHeight
public CnnSentenceDataSetIterator.Builder sentencesAlongHeight(boolean sentencesAlongHeight)
If true (default): output features data with shape [minibatchSize, 1, maxSentenceLength, wordVectorSize]
If false: output features with shape [minibatchSize, 1, wordVectorSize, maxSentenceLength]
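The two layouts differ only in the order of the last two dimensions. The following sketch computes the documented shape for each setting; it mirrors the shapes stated above and makes no claim about how the iterator builds its arrays. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class FeatureShapeSketch {
    // Returns the documented feature shape for the given sentencesAlongHeight setting.
    static long[] featureShape(int minibatchSize, int maxSentenceLength,
                               int wordVectorSize, boolean sentencesAlongHeight) {
        return sentencesAlongHeight
                // words run along the height dimension, vector elements along the width
                ? new long[] {minibatchSize, 1, maxSentenceLength, wordVectorSize}
                // vector elements run along the height dimension, words along the width
                : new long[] {minibatchSize, 1, wordVectorSize, maxSentenceLength};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(featureShape(32, 256, 300, true)));
        System.out.println(Arrays.toString(featureShape(32, 256, 300, false)));
    }
}
```

The choice matters for the kernel sizes of the first convolution layer: a kernel that spans the full word vector must cover the width when sentencesAlongHeight is true, and the height when it is false.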
-
dataSetPreProcessor
public CnnSentenceDataSetIterator.Builder dataSetPreProcessor(org.nd4j.linalg.dataset.api.DataSetPreProcessor dataSetPreProcessor)
Optional DataSetPreProcessor
-
build
public CnnSentenceDataSetIterator build()
-
-