Class CSVMultiSequenceRecordReader
- java.lang.Object
-
- org.datavec.api.records.reader.BaseRecordReader
-
- org.datavec.api.records.reader.impl.LineRecordReader
-
- org.datavec.api.records.reader.impl.csv.CSVRecordReader
-
- org.datavec.api.records.reader.impl.csv.CSVMultiSequenceRecordReader
-
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader, SequenceRecordReader
public class CSVMultiSequenceRecordReader extends CSVRecordReader implements SequenceRecordReader
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes
Modifier and Type: static class
Class: CSVMultiSequenceRecordReader.Mode
-
Field Summary
-
Fields inherited from class org.datavec.api.records.reader.impl.csv.CSVRecordReader
DEFAULT_DELIMITER, DEFAULT_QUOTE, DELIMITER, QUOTE, SKIP_NUM_LINES, skipNumLines
-
Fields inherited from class org.datavec.api.records.reader.impl.LineRecordReader
charset, conf, initialized, lineIndex, locations, splitIndex
-
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors
CSVMultiSequenceRecordReader(int skipNumLines, char elementDelimiter, char quote, String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)
Create a sequence reader with an explicitly specified number of lines to skip, element delimiter and quote character.
CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode)
Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"').
Note that this constructor cannot be used with CSVMultiSequenceRecordReader.Mode.PAD, as the padding value cannot be specified.
CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)
Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"').
-
Method Summary
All Methods / Instance Methods / Concrete Methods
boolean batchesSupported()
Returns true if the next(int) signature is supported by this RecordReader implementation.
List<SequenceRecord> loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas)
Load multiple sequence records from the given list of RecordMetaData instances.
SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData)
Load a single sequence record from the given RecordMetaData instance.
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List).
SequenceRecord nextSequence()
Similar to SequenceRecordReader.sequenceRecord(), but returns a SequenceRecord object that may include metadata such as the source of the data.
List<List<Writable>> sequenceRecord()
Returns a sequence record.
List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream)
Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
-
Methods inherited from class org.datavec.api.records.reader.impl.csv.CSVRecordReader
hasNext, initialize, loadFromMetaData, loadFromMetaData, next, next, nextRecord, onLocationOpen, parseLine, readStringLine, record, reset
-
Methods inherited from class org.datavec.api.records.reader.impl.LineRecordReader
close, closeIfRequired, getConf, getIterator, getLabels, initialize, resetSupported, setConf
-
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
getListeners, invokeListeners, setListeners, setListeners
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.datavec.api.conf.Configurable
getConf, setConf
-
Methods inherited from interface org.datavec.api.records.reader.RecordReader
getLabels, getListeners, hasNext, initialize, initialize, loadFromMetaData, loadFromMetaData, next, next, nextRecord, record, reset, resetSupported, setListeners, setListeners
-
-
-
-
Constructor Detail
-
CSVMultiSequenceRecordReader
public CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode)
Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"').
Note that this constructor cannot be used with CSVMultiSequenceRecordReader.Mode.PAD, as the padding value cannot be specified.
- Parameters:
sequenceSeparatorRegex - The sequence separator regex. Use "^$" for "sequences are separated by an empty line".
mode - Mode: see the CSVMultiSequenceRecordReader javadoc
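To make the role of the separator regex concrete, the following plain-Java sketch groups lines into sequences, treating any line that fully matches the regex as a boundary. It does not use DataVec and is not the library's implementation: the class name SequenceSplitSketch, the split helper and the use of whole-line matching are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DataVec: illustrates splitting by a separator regex.
final class SequenceSplitSketch {

    // Groups lines into sequences. A line that fully matches separatorRegex
    // closes the current sequence and is not included in any sequence.
    static List<List<String>> split(List<String> lines, String separatorRegex) {
        Pattern separator = Pattern.compile(separatorRegex);
        List<List<String>> sequences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (separator.matcher(line).matches()) {
                // Separator line: finish the current sequence if it has content.
                if (!current.isEmpty()) {
                    sequences.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        // Keep the trailing sequence when the input does not end with a separator.
        if (!current.isEmpty()) {
            sequences.add(current);
        }
        return sequences;
    }

    public static void main(String[] args) {
        // Two sequences separated by an empty line, as matched by "^$".
        List<String> lines = Arrays.asList("a,b,c", "d,e,f", "", "g,h,i");
        System.out.println(split(lines, "^$"));
    }
}
```

With "^$" only empty lines act as boundaries; a regex such as "^---$" would instead treat lines consisting of three dashes as boundaries.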
-
CSVMultiSequenceRecordReader
public CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)
Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"').
- Parameters:
sequenceSeparatorRegex - The sequence separator regex. Use "^$" for "sequences are separated by an empty line".
mode - Mode: see the CSVMultiSequenceRecordReader javadoc
padValue - Padding value for padding short sequences. Only used/allowable with CSVMultiSequenceRecordReader.Mode.PAD; should be null otherwise.
-
CSVMultiSequenceRecordReader
public CSVMultiSequenceRecordReader(int skipNumLines, char elementDelimiter, char quote, String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)
Create a sequence reader with an explicitly specified number of lines to skip, element delimiter and quote character.
- Parameters:
skipNumLines - Number of lines to skip
elementDelimiter - Delimiter for elements, i.e., ',' if lines are comma separated
quote - Quote character for elements, e.g. '"'
sequenceSeparatorRegex - The sequence separator regex. Use "^$" for "sequences are separated by an empty line".
mode - Mode: see the CSVMultiSequenceRecordReader javadoc
padValue - Padding value for padding short sequences. Only used/allowable with CSVMultiSequenceRecordReader.Mode.PAD; should be null otherwise.
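As a rough illustration of what padding short sequences means, the plain-Java sketch below extends every sequence to the length of the longest one using a pad value. It is independent of DataVec: the class PadSketch, the padToLongest helper, the use of String in place of Writable, and the choice to append padding at the end are all assumptions for illustration, not a description of the library's behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DataVec: illustrates padding short sequences.
final class PadSketch {

    // Returns copies of the input sequences, each extended at the end with
    // padValue until it is as long as the longest input sequence.
    static List<List<String>> padToLongest(List<List<String>> sequences, String padValue) {
        int maxLength = 0;
        for (List<String> sequence : sequences) {
            maxLength = Math.max(maxLength, sequence.size());
        }
        List<List<String>> padded = new ArrayList<>();
        for (List<String> sequence : sequences) {
            List<String> copy = new ArrayList<>(sequence);
            while (copy.size() < maxLength) {
                copy.add(padValue);
            }
            padded.add(copy);
        }
        return padded;
    }

    public static void main(String[] args) {
        List<List<String>> sequences = Arrays.asList(
                Arrays.asList("1", "2", "3"),
                Arrays.asList("4"));
        // The second sequence is shorter, so it receives two pad values.
        System.out.println(padToLongest(sequences, "0"));
    }
}
```

The sketch also shows why a pad value has to be supplied when padding is requested: without one there is nothing to fill the shorter sequences with.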
-
-
Method Detail
-
sequenceRecord
public List<List<Writable>> sequenceRecord()
Description copied from interface: SequenceRecordReader
Returns a sequence record.
- Specified by:
sequenceRecord in interface SequenceRecordReader
- Returns:
a sequence of records
-
nextSequence
public SequenceRecord nextSequence()
Description copied from interface: SequenceRecordReader
Similar to SequenceRecordReader.sequenceRecord(), but returns a SequenceRecord object that may include metadata such as the source of the data.
- Specified by:
nextSequence in interface SequenceRecordReader
- Returns:
next sequence record
-
sequenceRecord
public List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: SequenceRecordReader
Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Specified by:
sequenceRecord in interface SequenceRecordReader
- Throws:
IOException - if an error occurs during reading from the input stream
-
loadSequenceFromMetaData
public SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: SequenceRecordReader
Load a single sequence record from the given RecordMetaData instance.
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List).
- Specified by:
loadSequenceFromMetaData in interface SequenceRecordReader
- Parameters:
recordMetaData - Metadata for the sequence record that we want to load
- Returns:
Single sequence record for the given RecordMetaData instance
- Throws:
IOException - If an I/O error occurs during loading
-
loadSequenceFromMetaData
public List<SequenceRecord> loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: SequenceRecordReader
Load multiple sequence records from the given list of RecordMetaData instances.
- Specified by:
loadSequenceFromMetaData in interface SequenceRecordReader
- Parameters:
recordMetaDatas - Metadata for the records that we want to load
- Returns:
Multiple sequence records for the given RecordMetaData instances
- Throws:
IOException - If an I/O error occurs during loading
-
batchesSupported
public boolean batchesSupported()
Description copied from interface: RecordReader
Returns true if the next(int) signature is supported by this RecordReader implementation.
- Specified by:
batchesSupported in interface RecordReader
- Overrides:
batchesSupported in class CSVRecordReader
- Returns:
true if next(int) is supported by this reader, false otherwise
-