Class RegexLineRecordReader
- java.lang.Object
  - org.datavec.api.records.reader.BaseRecordReader
    - org.datavec.api.records.reader.impl.LineRecordReader
      - org.datavec.api.records.reader.impl.regex.RegexLineRecordReader
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader
public class RegexLineRecordReader extends LineRecordReader
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type    Field             Description
static String        SKIP_NUM_LINES
-
Fields inherited from class org.datavec.api.records.reader.impl.LineRecordReader
charset, conf, initialized, lineIndex, locations, splitIndex
-
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors
Constructor                                              Description
RegexLineRecordReader(String regex, int skipNumLines)
-
Method Summary
All Methods (Instance Methods, Concrete Methods)
void initialize(Configuration conf, InputSplit split)
    Called once at initialization.
List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas)
    Load multiple records from the given list of RecordMetaData instances.
Record loadFromMetaData(RecordMetaData recordMetaData)
    Load a single record from the given RecordMetaData instance.
    Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
List<Writable> next()
    Get the next record.
Record nextRecord()
    Similar to RecordReader.next(), but returns a Record object, which may include metadata such as the source of the data.
List<Writable> record(URI uri, DataInputStream dataInputStream)
    Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
void reset()
    Reset the record reader iterator.
-
Methods inherited from class org.datavec.api.records.reader.impl.LineRecordReader
close, closeIfRequired, getConf, getIterator, getLabels, hasNext, initialize, onLocationOpen, resetSupported, setConf
-
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
batchesSupported, getListeners, invokeListeners, next, setListeners, setListeners
-
Field Detail
-
SKIP_NUM_LINES
public static final String SKIP_NUM_LINES
-
-
Constructor Detail
-
RegexLineRecordReader
public RegexLineRecordReader(String regex, int skipNumLines)
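This page does not describe the two constructor arguments. Judging only by their names, regex is presumably applied to each line, and skipNumLines is presumably the number of leading lines (for example, a header) to skip. The sketch below illustrates that idea with java.util.regex alone. RegexLineSketch and parse are hypothetical names, not DataVec API, and the choices to emit one field per capture group and to reject non-matching lines are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of DataVec: shows how a regex with capture
// groups can turn each text line into a list of fields.
class RegexLineSketch {

    // Parses every line after the first skipNumLines lines into its capture groups.
    static List<List<String>> parse(List<String> lines, String regex, int skipNumLines) {
        Pattern pattern = Pattern.compile(regex);
        List<List<String>> records = new ArrayList<>();
        for (int i = skipNumLines; i < lines.size(); i++) {
            Matcher m = pattern.matcher(lines.get(i));
            if (!m.matches()) {
                // Assumption: a line that does not match the whole regex is an error.
                throw new IllegalStateException("Line " + i + " does not match regex: " + lines.get(i));
            }
            List<String> fields = new ArrayList<>();
            for (int g = 1; g <= m.groupCount(); g++) {
                fields.add(m.group(g)); // group 0 is the whole match, so start at 1
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "timestamp level message",      // header line, skipped below
                "2016-01-01 INFO started",
                "2016-01-02 WARN low-disk");
        System.out.println(parse(lines, "(\\S+) (\\S+) (.*)", 1));
    }
}
```

With a real reader, the same regex string and skip count would be passed to the constructor shown above, and each field would be wrapped in a Writable instead of a String.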
-
-
Method Detail
-
initialize
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Specified by:
initialize in interface RecordReader
- Overrides:
initialize in class LineRecordReader
- Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
next
public List<Writable> next()
Description copied from interface: RecordReader
Get the next record.
- Specified by:
next in interface RecordReader
- Overrides:
next in class LineRecordReader
- Returns:
-
record
public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: RecordReader
Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Specified by:
record in interface RecordReader
- Overrides:
record in class LineRecordReader
- Throws:
IOException - if an error occurs during reading from the input stream
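The contract above (read from a caller-supplied stream, keep no reader state, leave the stream open) can be sketched in plain Java as follows. StreamRecordSketch and readOneLine are hypothetical names, the UTF-8 charset is an assumption, and this is not how DataVec itself implements record.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, NOT part of DataVec: reads one line from a stream
// that the caller owns, without keeping state and without closing it.
class StreamRecordSketch {

    static String readOneLine(DataInputStream in) throws IOException {
        // Assumption: the text is UTF-8 encoded.
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        // Deliberately no close() and no try-with-resources: closing the
        // reader would also close the caller's DataInputStream.
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "2016-01-01 INFO started\n".getBytes(StandardCharsets.UTF_8);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(readOneLine(in));
    }
}
```

One caveat of this sketch: BufferedReader may read ahead past the first line, so the position of the underlying stream after the call should not be relied on.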
-
reset
public void reset()
Description copied from interface: RecordReader
Reset the record reader iterator.
- Specified by:
reset in interface RecordReader
- Overrides:
reset in class LineRecordReader
-
nextRecord
public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object, which may include metadata such as the source of the data.
- Specified by:
nextRecord in interface RecordReader
- Overrides:
nextRecord in class LineRecordReader
- Returns:
next record
-
loadFromMetaData
public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance.
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
- Specified by:
loadFromMetaData in interface RecordReader
- Overrides:
loadFromMetaData in class LineRecordReader
- Parameters:
recordMetaData - Metadata for the record that we want to load from
- Returns:
Single record for the given RecordMetaData instance
- Throws:
IOException - If an I/O error occurs during loading
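The efficiency note above can be illustrated without DataVec. Text that must be scanned line by line has no random access, so fetching N records one at a time costs up to N scans, while a batch request can be served in a single forward pass. BatchLoadSketch and loadLines are hypothetical names, and identifying a record by its zero-based line index is an assumption made for this sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, NOT part of DataVec: serves many "load line i"
// requests with one forward scan over the text.
class BatchLoadSketch {

    // Returns the lines at the requested zero-based indices, in file order.
    static Map<Integer, String> loadLines(Iterator<String> lines, Collection<Integer> wanted) {
        TreeSet<Integer> remaining = new TreeSet<>(wanted);
        Map<Integer, String> found = new LinkedHashMap<>();
        int index = 0;
        // Stop early once every requested index has been seen.
        while (lines.hasNext() && !remaining.isEmpty()) {
            String line = lines.next();
            if (remaining.remove(index)) {
                found.put(index, line);
            }
            index++;
        }
        return found;
    }

    public static void main(String[] args) {
        Iterator<String> lines = Arrays.asList("a", "b", "c", "d").iterator();
        System.out.println(loadLines(lines, Arrays.asList(3, 1)));
    }
}
```

Indices beyond the end of the input are simply absent from the result; a real implementation might prefer to report them as errors.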
-
loadFromMetaData
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
- Specified by:
loadFromMetaData in interface RecordReader
- Overrides:
loadFromMetaData in class LineRecordReader
- Parameters:
recordMetaDatas - Metadata for the records that we want to load from
- Returns:
Multiple records for the given RecordMetaData instances
- Throws:
IOException - If an I/O error occurs during loading
-
-