Interface RecordReader

    • Field Detail

      • NAME_SPACE

        static final String NAME_SPACE
      • APPEND_LABEL

        static final String APPEND_LABEL
      • LABELS

        static final String LABELS
    • Method Detail

      • batchesSupported

        boolean batchesSupported()
        Returns true if the next(int) signature is supported by this RecordReader implementation.
        Returns:
        true if next(int) is supported, false otherwise
      • next

        List<List<Writable>> next​(int num)
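        Loads a batch of records. This method should be used only if batchesSupported() returns true.
        Parameters:
        num - maximum number of records to load in the batch
        Returns:
        list of records, where each record is a list of Writable values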
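The contract between batchesSupported() and next(int) can be sketched as below. This is only an illustration under stated assumptions: MiniReader and InMemoryReader are hypothetical stand-ins that use List<String> in place of List<Writable>, so the snippet runs without DataVec on the classpath. The helper falls back to single-record next() calls when batches are not supported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for RecordReader (not the DataVec type):
// records are List<String> here rather than List<Writable>.
interface MiniReader {
    boolean batchesSupported();
    List<List<String>> next(int num);
    List<String> next();
    boolean hasNext();
}

// Toy in-memory reader used only to exercise the helper below.
class InMemoryReader implements MiniReader {
    private final List<List<String>> data;
    private final boolean batches;
    private int pos = 0;

    InMemoryReader(List<List<String>> data, boolean batches) {
        this.data = data;
        this.batches = batches;
    }

    public boolean batchesSupported() { return batches; }

    public List<List<String>> next(int num) {
        if (!batches) throw new UnsupportedOperationException("next(int) not supported");
        List<List<String>> out = new ArrayList<>();
        while (out.size() < num && hasNext()) out.add(next());
        return out;
    }

    public List<String> next() { return data.get(pos++); }

    public boolean hasNext() { return pos < data.size(); }
}

class BatchReadSketch {
    // Reads up to num records, calling next(int) only when the reader supports it.
    static List<List<String>> readBatch(MiniReader reader, int num) {
        if (reader.batchesSupported()) {
            return reader.next(num);
        }
        List<List<String>> batch = new ArrayList<>();
        while (batch.size() < num && reader.hasNext()) {
            batch.add(reader.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "a"), Arrays.asList("2", "b"), Arrays.asList("3", "c"));
        MiniReader reader = new InMemoryReader(rows, false);
        System.out.println(readBatch(reader, 2)); // prints [[1, a], [2, b]]
        System.out.println(readBatch(reader, 2)); // prints [[3, c]]
    }
}
```

Checking batchesSupported() first keeps the calling code independent of which reader implementation it is given.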
      • hasNext

        boolean hasNext()
        Checks whether there are any more records.
        Returns:
        true if more records are available, false otherwise
      • getLabels

        List<String> getLabels()
        Gets the list of label strings.
        Returns:
        list of label strings
      • reset

        void reset()
        Resets the record reader iterator.
      • resetSupported

        boolean resetSupported()
        Returns:
        True if the record reader can be reset, false otherwise. Some record readers cannot be reset, for example those backed by a non-resettable input split (such as certain types of streams).
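A caller that makes several passes over the data would typically guard reset() with resetSupported(). The following is a minimal sketch of that pattern; Rewindable and ListBackedReader are hypothetical stand-ins (with List<String> records) rather than DataVec types, and throwing UnsupportedOperationException from reset() is an assumption made for this example only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in exposing only the methods relevant to resetting.
interface Rewindable {
    boolean hasNext();
    List<String> next();
    void reset();
    boolean resetSupported();
}

// Toy reader; the resettable flag mimics a non-resettable input such as a stream.
class ListBackedReader implements Rewindable {
    private final List<List<String>> data;
    private final boolean resettable;
    private int pos = 0;

    ListBackedReader(List<List<String>> data, boolean resettable) {
        this.data = data;
        this.resettable = resettable;
    }

    public boolean hasNext() { return pos < data.size(); }

    public List<String> next() { return data.get(pos++); }

    public void reset() {
        // Assumption for this sketch: an unsupported reset fails loudly.
        if (!resettable) throw new UnsupportedOperationException("reset not supported");
        pos = 0;
    }

    public boolean resetSupported() { return resettable; }
}

class ResetSketch {
    // Attempts `epochs` full passes; stops after the first pass if the reader
    // cannot be reset. Returns the number of passes actually completed.
    static int passes(Rewindable reader, int epochs) {
        int completed = 0;
        for (int e = 0; e < epochs; e++) {
            if (e > 0) {
                if (!reader.resetSupported()) break;
                reader.reset();
            }
            while (reader.hasNext()) reader.next();
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(Arrays.asList("1"), Arrays.asList("2"));
        System.out.println(passes(new ListBackedReader(rows, true), 3));  // prints 3
        System.out.println(passes(new ListBackedReader(rows, false), 3)); // prints 1
    }
}
```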
      • record

        List<Writable> record​(URI uri,
                              DataInputStream dataInputStream)
                       throws IOException
        Loads the record from the given DataInputStream. Unlike next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
        Throws:
        IOException - if an error occurs while reading from the input stream
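The two requirements stated for record(URI, DataInputStream), no change to reader state and no closing of the stream, can be illustrated with a stateless static method. This is a sketch only: it parses a single comma-separated line, returns List<String> instead of List<Writable>, and the URI used in main is a hypothetical location that is never opened.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class StreamRecordSketch {
    // Stateless load in the spirit of record(URI, DataInputStream): reads the
    // stream to its end, keeps no reader state, and leaves closing to the caller.
    // The uri only identifies the source; this sketch does not open it.
    static List<String> record(URI uri, DataInputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8).trim();
        return Arrays.asList(text.split(","));
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "5.1,3.5,setosa\n".getBytes(StandardCharsets.UTF_8);
        URI uri = URI.create("file:///tmp/example.csv"); // hypothetical location
        // The caller owns the stream, so the caller closes it.
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            System.out.println(record(uri, in)); // prints [5.1, 3.5, setosa]
        }
    }
}
```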
      • nextRecord

        Record nextRecord()
        Similar to next(), but returns a Record object that may include metadata such as the source of the data.
        Returns:
        next record
      • loadFromMetaData

        Record loadFromMetaData​(RecordMetaData recordMetaData)
                         throws IOException
        Loads a single record from the given RecordMetaData instance.
        Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using loadFromMetaData(List).
        Parameters:
        recordMetaData - Metadata for the record that we want to load from
        Returns:
        Single record for the given RecordMetaData instance
        Throws:
        IOException - if an I/O error occurs during loading
      • loadFromMetaData

        List<Record> loadFromMetaData​(List<RecordMetaData> recordMetaDatas)
                               throws IOException
        Loads multiple records from the given list of RecordMetaData instances.
        Parameters:
        recordMetaDatas - Metadata for the records that we want to load from
        Returns:
        Multiple records for the given RecordMetaData instances
        Throws:
        IOException - if an I/O error occurs during loading
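A typical use of nextRecord() together with loadFromMetaData(List) is to remember the metadata of interesting records during a first pass and reload them later in one call. The sketch below shows that workflow with hypothetical stand-ins: LineMeta (for RecordMetaData) stores only a line index, LineRecord (for Record) pairs values with that metadata, and the scans counter is an invented device to make the "load multiple records at once" advice visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for RecordMetaData: only remembers a line index.
class LineMeta {
    final int lineIndex;
    LineMeta(int lineIndex) { this.lineIndex = lineIndex; }
}

// Hypothetical stand-in for Record: the values plus where they came from.
class LineRecord {
    final List<String> values;
    final LineMeta meta;
    LineRecord(List<String> values, LineMeta meta) {
        this.values = values;
        this.meta = meta;
    }
}

// Toy reader over in-memory CSV lines; counts how often the source is scanned.
class LineReader {
    private final List<String> lines;
    private int pos = 0;
    int scans = 0;

    LineReader(List<String> lines) { this.lines = lines; }

    boolean hasNext() { return pos < lines.size(); }

    LineRecord nextRecord() { return parse(pos++); }

    // Single-record load: costs one scan of the source per call.
    LineRecord loadFromMetaData(LineMeta meta) {
        return loadFromMetaData(Arrays.asList(meta)).get(0);
    }

    // Multi-record load: one scan of the source serves all requested records.
    List<LineRecord> loadFromMetaData(List<LineMeta> metas) {
        scans++;
        List<LineRecord> out = new ArrayList<>();
        for (LineMeta m : metas) out.add(parse(m.lineIndex));
        return out;
    }

    private LineRecord parse(int index) {
        return new LineRecord(Arrays.asList(lines.get(index).split(",")), new LineMeta(index));
    }
}

class MetaDataReloadSketch {
    public static void main(String[] args) {
        LineReader reader = new LineReader(Arrays.asList("1,a", "2,b", "3,c"));
        List<LineMeta> keep = new ArrayList<>();
        while (reader.hasNext()) {
            LineRecord r = reader.nextRecord();
            if (!r.values.get(1).equals("b")) keep.add(r.meta); // remember records of interest
        }
        List<LineRecord> reloaded = reader.loadFromMetaData(keep); // one call, one scan
        System.out.println(reloaded.get(1).values + " scans=" + reader.scans); // prints [3, c] scans=1
    }
}
```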
      • getListeners

        List<RecordListener> getListeners()
        Get the record listeners for this record reader.
      • setListeners

        void setListeners​(RecordListener... listeners)
        Set the record listeners for this record reader.
      • setListeners

        void setListeners​(Collection<RecordListener> listeners)
        Set the record listeners for this record reader.
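One plausible way for an implementation to keep the two setListeners overloads consistent is to have the varargs form delegate to the Collection form. The sketch below assumes exactly that; SimpleListener and its recordRead(Object) callback are hypothetical stand-ins, since the RecordListener interface itself is not described in this section.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for RecordListener; the callback name is an assumption.
interface SimpleListener {
    void recordRead(Object record);
}

class ListenerHolder {
    private List<SimpleListener> listeners = new ArrayList<>();

    List<SimpleListener> getListeners() { return listeners; }

    // Varargs overload delegates to the Collection overload.
    void setListeners(SimpleListener... listeners) {
        setListeners(Arrays.asList(listeners));
    }

    // Copies the collection so later changes by the caller do not leak in.
    void setListeners(Collection<SimpleListener> listeners) {
        this.listeners = new ArrayList<>(listeners);
    }

    // Notifies every registered listener that a record was read.
    void notifyRead(Object record) {
        for (SimpleListener l : listeners) l.recordRead(record);
    }
}

class ListenerSketch {
    public static void main(String[] args) {
        List<Object> seen = new ArrayList<>();
        SimpleListener collector = record -> seen.add(record);

        ListenerHolder holder = new ListenerHolder();
        holder.setListeners(collector);
        holder.notifyRead("row-1");
        System.out.println(seen + " listeners=" + holder.getListeners().size()); // prints [row-1] listeners=1
    }
}
```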