Class KFoldIterator

    • Field Detail

      • allData

        protected DataSet allData
      • k

        protected int k
      • N

        protected int N
      • intervalBoundaries

        protected int[] intervalBoundaries
      • kCursor

        protected int kCursor
    • Constructor Detail

      • KFoldIterator

        public KFoldIterator(DataSet allData)
        Creates a k-fold cross-validation iterator over the given dataset with the default k = 10 train-test splits. The N samples are split into k batches. The first (N % k) batches contain (N / k) + 1 samples, and the remaining batches contain (N / k) samples, where N / k is integer division. If N is a multiple of k, every batch contains (N / k) samples.
        Parameters:
        allData - DataSet to split into k folds
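        The batch-size rule described for this constructor can be checked with a small standalone sketch. The class FoldSizes and its method foldSizes are hypothetical helpers written for illustration; they are not part of KFoldIterator, and the argument check is an assumption of this sketch rather than documented behavior.

```java
import java.util.Arrays;

// Hypothetical helper, not part of KFoldIterator: reproduces the documented fold-size rule.
class FoldSizes {

    // Returns the size of each of the k folds when n samples are split as documented.
    static int[] foldSizes(int n, int k) {
        // Assumption of this sketch: reject splits that would leave a fold empty.
        if (k < 1 || n < k) {
            throw new IllegalArgumentException("need 1 <= k <= n");
        }
        int base = n / k;       // integer division: minimum fold size
        int remainder = n % k;  // number of leading folds that get one extra sample
        int[] sizes = new int[k];
        for (int i = 0; i < k; i++) {
            sizes[i] = (i < remainder) ? base + 1 : base;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // N = 23, k = 10: the first 3 folds hold 3 samples, the other 7 hold 2.
        System.out.println(Arrays.toString(foldSizes(23, 10))); // [3, 3, 3, 2, 2, 2, 2, 2, 2, 2]
        // N = 20, k = 10: N is a multiple of k, so every fold holds 2 samples.
        System.out.println(Arrays.toString(foldSizes(20, 10))); // [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
    }
}
```

        The fold sizes always sum to N, so every sample lands in exactly one fold.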
      • KFoldIterator

        public KFoldIterator(int k,
                             DataSet allData)
        Creates a k-fold cross-validation iterator over the given dataset with the given number k of train-test splits. The N samples are split into k batches. The first (N % k) batches contain (N / k) + 1 samples, and the remaining batches contain (N / k) samples, where N / k is integer division. If N is a multiple of k, every batch contains (N / k) samples.
        Parameters:
        k - number of folds (required by this constructor; the single-argument constructor uses k = 10)
        allData - DataSet to split into k folds
    • Method Detail

      • totalExamples

        public int totalExamples()
        Returns total number of examples in the dataset (all k folds)
        Returns:
        total number of examples in the dataset including all k folds
      • resetSupported

        public boolean resetSupported()
        Description copied from interface: DataSetIterator
        Is resetting supported by this DataSetIterator? Many DataSetIterators support resetting, but some do not.
        Specified by:
        resetSupported in interface DataSetIterator
        Returns:
        true if reset method is supported; false otherwise
      • asyncSupported

        public boolean asyncSupported()
        Description copied from interface: DataSetIterator
        Does this DataSetIterator support asynchronous prefetching of multiple DataSet objects? Most DataSetIterators do, but in some cases it may not make sense to wrap this iterator in an iterator that does asynchronous prefetching. For example, it would not make sense to use asynchronous prefetching for the following types of iterators:
        (a) Iterators that store their full contents in memory already
        (b) Iterators that re-use features/labels arrays (as future next() calls will overwrite past contents)
        (c) Iterators that already implement some level of asynchronous prefetching
        (d) Iterators that may return different data depending on when the next() method is called
        Specified by:
        asyncSupported in interface DataSetIterator
        Returns:
        true if asynchronous prefetching from this iterator is OK; false if asynchronous prefetching should not be used with this iterator
      • reset

        public void reset()
        Shuffles the dataset and resets to the first fold
        Specified by:
        reset in interface DataSetIterator
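        The effect of reset() can be illustrated with a minimal sketch over sample indices. The class ShuffleResetSketch, its advance() method, and the seeded Random are assumptions made for this illustration; only the field name kCursor and the "shuffle, then return to the first fold" behavior come from this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for an iterator over folds; not the KFoldIterator implementation.
class ShuffleResetSketch {
    final List<Integer> order = new ArrayList<>(); // sample indices in their current order
    final Random rng;                              // seeded so the sketch is reproducible
    int kCursor = 0;                               // index of the current fold

    ShuffleResetSketch(int n, long seed) {
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        rng = new Random(seed);
    }

    // Hypothetical: move on to the next fold.
    void advance() {
        kCursor++;
    }

    // Mirrors the documented behavior: shuffle the samples, then go back to the first fold.
    void reset() {
        Collections.shuffle(order, rng);
        kCursor = 0;
    }

    public static void main(String[] args) {
        ShuffleResetSketch it = new ShuffleResetSketch(10, 42L);
        it.advance();
        it.advance();
        it.reset();
        System.out.println("cursor after reset: " + it.kCursor);
        System.out.println("sample order after reset: " + it.order);
    }
}
```

        Because the samples are reordered, the folds seen after a reset generally contain different samples than the folds seen before it.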
      • batch

        public int batch()
        The number of examples in every fold is (N / k), unless (N % k) > 0, in which case the first (N % k) folds contain (N / k) + 1 examples.
        Specified by:
        batch in interface DataSetIterator
        Returns:
        number of examples in a fold
      • getLabels

        public List<String> getLabels()
        Description copied from interface: DataSetIterator
        Get dataset iterator class labels, if any. Note that implementations are not required to implement this, and can simply return null
        Specified by:
        getLabels in interface DataSetIterator
      • nextFold

        protected void nextFold()
      • testFold

        public DataSet testFold()
        Returns:
        the held out fold as a dataset
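        How a held-out fold relates to the remaining training samples can be sketched with plain index lists. This sketch assumes that folds are contiguous index ranges delimited by cumulative boundaries, which the intervalBoundaries field suggests but this page does not confirm. The class HeldOutSplit and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a k-fold split over sample indices 0..n-1.
class HeldOutSplit {

    // b[i] is the first index of fold i, and b[k] == n (assumed layout).
    static int[] boundaries(int n, int k) {
        int[] b = new int[k + 1];
        int base = n / k;
        int remainder = n % k;
        for (int i = 0; i < k; i++) {
            // The first (n % k) folds receive one extra sample.
            b[i + 1] = b[i] + base + (i < remainder ? 1 : 0);
        }
        return b;
    }

    // Indices of the held-out (test) fold.
    static List<Integer> testIndices(int n, int k, int fold) {
        int[] b = boundaries(n, k);
        List<Integer> out = new ArrayList<>();
        for (int i = b[fold]; i < b[fold + 1]; i++) {
            out.add(i);
        }
        return out;
    }

    // Indices of every sample outside the held-out fold.
    static List<Integer> trainIndices(int n, int k, int fold) {
        int[] b = boundaries(n, k);
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (i < b[fold] || i >= b[fold + 1]) {
                out.add(i);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 10;
        int k = 3;
        for (int fold = 0; fold < k; fold++) {
            System.out.println("fold " + fold
                    + " test=" + testIndices(n, k, fold)
                    + " train=" + trainIndices(n, k, fold));
        }
    }
}
```

        With N = 10 and k = 3, the boundaries are 0, 4, 7, 10, so fold 1 holds out indices 4 through 6 and trains on the other seven samples.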