Class EmnistDataSetIterator

  • All Implemented Interfaces:
    Serializable, Iterator<org.nd4j.linalg.dataset.DataSet>, org.nd4j.linalg.dataset.api.iterator.DataSetIterator

    public class EmnistDataSetIterator
    extends org.nd4j.linalg.dataset.api.iterator.BaseDatasetIterator
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int batch  
      protected EMnistSet dataSet  
      protected int numExamples  
      protected org.nd4j.linalg.dataset.api.DataSetPreProcessor preProcessor  
      • Fields inherited from class org.nd4j.linalg.dataset.api.iterator.BaseDatasetIterator

        fetcher, topLevelDir
    • Constructor Summary

      Constructors 
      Constructor Description
      EmnistDataSetIterator​(EMnistSet dataSet, int batch, boolean train)
      Create an EMNIST iterator with randomly shuffled data, using a randomly chosen RNG seed
      EmnistDataSetIterator​(EMnistSet dataSet, int batch, boolean binarize, boolean train, boolean shuffle, long rngSeed)
      Create an EMNIST iterator (test or train set), with optional shuffling and binarization.
      EmnistDataSetIterator​(EMnistSet dataSet, int batch, boolean binarize, boolean train, boolean shuffle, long rngSeed, File topLevelDir)
      Create an EMNIST iterator (test or train set), with optional shuffling and binarization.
      EmnistDataSetIterator​(EMnistSet dataSet, int batchSize, boolean train, long seed)
      Create an EMNIST iterator with randomly shuffled data based on a specified RNG seed
    • Field Detail

      • batch

        protected int batch
      • numExamples

        protected int numExamples
      • preProcessor

        protected org.nd4j.linalg.dataset.api.DataSetPreProcessor preProcessor
    • Constructor Detail

      • EmnistDataSetIterator

        public EmnistDataSetIterator​(EMnistSet dataSet,
                                     int batch,
                                     boolean train)
                              throws IOException
        Create an EMNIST iterator with randomly shuffled data, using a randomly chosen RNG seed
        Parameters:
        dataSet - Dataset (subset) to return
        batch - Batch size
        train - If true: use training set. If false: use test set
        Throws:
        IOException - If an error occurs when loading/downloading the dataset
      • EmnistDataSetIterator

        public EmnistDataSetIterator​(EMnistSet dataSet,
                                     int batchSize,
                                     boolean train,
                                     long seed)
                              throws IOException
        Create an EMNIST iterator with randomly shuffled data based on a specified RNG seed
        Parameters:
        dataSet - Dataset (subset) to return
        batchSize - Batch size
        train - If true: use training set. If false: use test set
        seed - Random number generator seed
        Throws:
        IOException - If an error occurs when loading/downloading the dataset
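
        The effect of passing a fixed seed can be illustrated without the library. The sketch below is a minimal stand-alone example, assuming that "shuffled data based on a specified RNG seed" means the example order is fully determined by the seed. The class SeededShuffleSketch and the method shuffledOrder are hypothetical names used only for illustration; they are not part of EmnistDataSetIterator, and the iterator's own shuffling may be implemented differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SeededShuffleSketch {

    // Returns the indices 0..n-1 in an order determined only by the seed.
    // Hypothetical helper: illustrates seeded shuffling, not the library's code.
    static List<Integer> shuffledOrder(int n, long seed) {
        List<Integer> order = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed));
        return order;
    }

    public static void main(String[] args) {
        List<Integer> first = shuffledOrder(10, 123L);
        List<Integer> second = shuffledOrder(10, 123L);
        // Same seed, same order, so two runs see the examples identically.
        System.out.println(first.equals(second)); // prints true
    }
}
```

        Fixing the seed in this way is what makes training runs repeatable; the three-argument constructor, which picks its own seed, gives a different order on each run.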
      • EmnistDataSetIterator

        public EmnistDataSetIterator​(EMnistSet dataSet,
                                     int batch,
                                     boolean binarize,
                                     boolean train,
                                     boolean shuffle,
                                     long rngSeed,
                                     File topLevelDir)
                              throws IOException
        Create an EMNIST iterator (test or train set), with optional shuffling and binarization.
        Parameters:
        dataSet - Dataset (subset) to return
        batch - Size of each minibatch
        binarize - Whether to binarize the data (if false: normalize in range 0 to 1)
        train - If true: use training set. If false: use test set
        shuffle - Whether to shuffle the examples
        rngSeed - Random number generator seed to use when shuffling examples
        Throws:
        IOException - If an error occurs when loading/downloading the dataset
      • EmnistDataSetIterator

        public EmnistDataSetIterator​(EMnistSet dataSet,
                                     int batch,
                                     boolean binarize,
                                     boolean train,
                                     boolean shuffle,
                                     long rngSeed)
                              throws IOException
        Create an EMNIST iterator (test or train set), with optional shuffling and binarization.
        Parameters:
        dataSet - Dataset (subset) to return
        batch - Size of each minibatch
        binarize - Whether to binarize the data (if false: normalize in range 0 to 1)
        train - If true: use training set. If false: use test set
        shuffle - Whether to shuffle the examples
        rngSeed - Random number generator seed to use when shuffling examples
        Throws:
        IOException - If an error occurs when loading/downloading the dataset
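
        The difference between the two settings of the binarize parameter can be sketched on a handful of raw pixel values. This is a minimal illustration, assuming 8-bit grayscale input (0 to 255). The class PixelScalingSketch, its methods, and the THRESHOLD value are hypothetical; the cutoff the library actually applies when binarizing is not stated in this documentation.

```java
import java.util.Arrays;

final class PixelScalingSketch {

    // Hypothetical cutoff chosen for illustration only.
    static final int THRESHOLD = 127;

    // binarize = true: each raw pixel (0..255) becomes exactly 0.0 or 1.0.
    static float[] binarize(int[] rawPixels) {
        float[] out = new float[rawPixels.length];
        for (int i = 0; i < rawPixels.length; i++) {
            out[i] = rawPixels[i] > THRESHOLD ? 1.0f : 0.0f;
        }
        return out;
    }

    // binarize = false: each raw pixel (0..255) is scaled linearly into 0 to 1.
    static float[] normalize(int[] rawPixels) {
        float[] out = new float[rawPixels.length];
        for (int i = 0; i < rawPixels.length; i++) {
            out[i] = rawPixels[i] / 255.0f;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] raw = {0, 64, 200, 255};
        System.out.println(Arrays.toString(binarize(raw)));  // prints [0.0, 0.0, 1.0, 1.0]
        System.out.println(Arrays.toString(normalize(raw))); // values between 0 and 1
    }
}
```

        Binarized input discards stroke intensity, while normalized input keeps it; which is preferable depends on the model being trained.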
    • Method Detail

      • numExamplesTrain

        public static int numExamplesTrain​(EMnistSet dataSet)
        Get the number of training examples for the specified subset
        Parameters:
        dataSet - Subset to get
        Returns:
        Number of training examples for the specified subset
      • numExamplesTest

        public static int numExamplesTest​(EMnistSet dataSet)
        Get the number of test examples for the specified subset
        Parameters:
        dataSet - Subset to get
        Returns:
        Number of test examples for the specified subset
      • numLabels

        public static int numLabels​(EMnistSet dataSet)
        Get the number of labels for the specified subset
        Parameters:
        dataSet - Subset to get
        Returns:
        Number of labels for the specified subset
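
        One common use of the example counts above is to work out how many minibatches a pass over the data takes. The sketch below is a stand-alone illustration, assuming that a final partial minibatch is returned rather than dropped; that behaviour is an assumption, not something this documentation states. BatchCountSketch and batchesPerEpoch are hypothetical names, and the count of 1000 is a made-up figure, not the size of any EMNIST subset.

```java
final class BatchCountSketch {

    // Number of minibatches needed to cover numExamples,
    // counting a final partial minibatch if one is left over.
    static int batchesPerEpoch(int numExamples, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int full = numExamples / batchSize;
        int remainder = numExamples % batchSize;
        return remainder == 0 ? full : full + 1;
    }

    public static void main(String[] args) {
        // Hypothetical count; in practice it would come from numExamplesTrain(dataSet).
        int numExamples = 1000;
        System.out.println(batchesPerEpoch(numExamples, 128)); // prints 8
    }
}
```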
      • getLabelsArrays

        public char[] getLabelsArrays()
        Get the labels as a character array
        Returns:
        Labels
      • getLabels

        public List<String> getLabels()
        Get the labels as a List
        Specified by:
        getLabels in interface org.nd4j.linalg.dataset.api.iterator.DataSetIterator
        Overrides:
        getLabels in class org.nd4j.linalg.dataset.api.iterator.BaseDatasetIterator
        Returns:
        Labels
      • getLabelsArray

        public static char[] getLabelsArray​(EMnistSet dataSet)
        Get the label assignments for the given set as a character array.
        Parameters:
        dataSet - DataSet to get the label assignment for
        Returns:
        Label assignment for the given dataset
      • getLabels

        public static List<String> getLabels​(EMnistSet dataSet)
        Get the label assignments for the given set as a List
        Parameters:
        dataSet - DataSet to get the label assignment for
        Returns:
        Label assignment for the given dataset
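
        The label accessors above return the assignment either as a char[] or as a List<String>. The sketch below shows how the two shapes relate and how a character array can turn a class index into a readable label. It assumes that position i in the array corresponds to class index i in a model's output, which this documentation does not state explicitly. LabelLookupSketch and its methods are hypothetical, and the three-letter label array is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LabelLookupSketch {

    // Converts a character label array into a list of one-character strings.
    static List<String> toLabelList(char[] labels) {
        List<String> out = new ArrayList<>(labels.length);
        for (char c : labels) {
            out.add(String.valueOf(c));
        }
        return out;
    }

    // Returns the label at the index of the highest score (first one wins on ties).
    static char predictedLabel(double[] scores, char[] labels) {
        if (scores.length == 0 || scores.length != labels.length) {
            throw new IllegalArgumentException("scores and labels must be non-empty and the same length");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return labels[best];
    }

    public static void main(String[] args) {
        // Hypothetical labels; in practice they would come from getLabelsArray(dataSet).
        char[] labels = {'a', 'b', 'c'};
        double[] scores = {0.1, 0.7, 0.2};
        System.out.println(toLabelList(labels));            // prints [a, b, c]
        System.out.println(predictedLabel(scores, labels)); // prints b
    }
}
```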
      • isBalanced

        public static boolean isBalanced​(EMnistSet dataSet)
        Are the labels balanced in the training set (that is: is the number of examples for each label equal)?
        Parameters:
        dataSet - Set to get balanced value for
        Returns:
        True if balanced dataset, false otherwise
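
        The definition of "balanced" given above (every label has the same number of examples) can be checked directly on a list of example labels. The sketch below is a stand-alone illustration of that definition only; how the library method determines its answer for a given subset is not described here. BalanceCheckSketch is a hypothetical class, and treating an empty input as balanced is a choice made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class BalanceCheckSketch {

    // True when every label that occurs has the same number of examples.
    // An empty input is treated as balanced (a choice made for this sketch).
    static boolean isBalanced(char[] exampleLabels) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char label : exampleLabels) {
            counts.merge(label, 1, Integer::sum);
        }
        int expected = -1;
        for (int count : counts.values()) {
            if (expected < 0) {
                expected = count;          // first count seen sets the reference
            } else if (count != expected) {
                return false;              // any differing count means unbalanced
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced(new char[] {'a', 'b', 'a', 'b'})); // prints true
        System.out.println(isBalanced(new char[] {'a', 'a', 'b'}));      // prints false
    }
}
```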