Class LineRecordReader
- java.lang.Object
  - org.datavec.api.records.reader.BaseRecordReader
    - org.datavec.api.records.reader.impl.LineRecordReader
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader
- Direct Known Subclasses:
CSVRecordReader, JacksonLineRecordReader, RegexLineRecordReader, SVMLightRecordReader
public class LineRecordReader extends BaseRecordReader
Reads files line by line.
- Author:
Adam Gibson
- See Also:
Serialized Form
-
-
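As a rough illustration of what reading "line by line" involves when a split spans several locations, the sketch below walks a list of in-memory sources and hands back one line per call. It uses only the JDK. `SimpleLineReader`, its constructor, `readAll`, and the in-memory sources are hypothetical stand-ins and not DataVec's implementation; only the names `splitIndex` and `lineIndex` are borrowed from the field list of this class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in: iterates line by line over several sources (not DataVec code). */
public class SimpleLineReader {
    private final List<List<String>> sources; // each inner list plays the role of one location
    private Iterator<String> iter;            // iterator over the current source
    private int splitIndex = 0;               // index of the source currently being read
    private int lineIndex = 0;                // number of lines returned so far

    public SimpleLineReader(List<List<String>> sources) {
        this.sources = sources;
        this.iter = sources.isEmpty() ? null : sources.get(0).iterator();
    }

    /** True if any line remains in the current source or a later one. */
    public boolean hasNext() {
        if (iter == null) return false;
        while (!iter.hasNext() && splitIndex < sources.size() - 1) {
            splitIndex++;                              // current source exhausted: open the next one
            iter = sources.get(splitIndex).iterator();
        }
        return iter.hasNext();
    }

    /** Returns the next line and advances the line counter. */
    public String next() {
        if (!hasNext()) throw new NoSuchElementException("No more lines");
        lineIndex++;
        return iter.next();
    }

    public int getLineIndex() { return lineIndex; }

    /** Usage helper: drains all sources into a single list. */
    public static List<String> readAll(List<List<String>> sources) {
        SimpleLineReader reader = new SimpleLineReader(sources);
        List<String> out = new ArrayList<>();
        while (reader.hasNext()) out.add(reader.next());
        return out;
    }
}
```

The loop in `readAll` mirrors the call pattern implied by the Method Summary: initialize once, then alternate `hasNext()` and `next()` until no records remain. Empty sources are skipped rather than ending iteration, which is a design choice of this sketch.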
Field Summary
Fields
Modifier and Type         Field
protected String          charset
protected Configuration   conf
protected boolean         initialized
protected int             lineIndex
protected URI[]           locations
protected int             splitIndex
-
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors
LineRecordReader()
-
Method Summary
void close()
protected void closeIfRequired(Iterator<String> iterator)
Configuration getConf()
    Return the configuration used by this object.
protected Iterator<String> getIterator(int location)
List<String> getLabels()
    List of label strings.
boolean hasNext()
    Whether there are any more records.
void initialize(Configuration conf, InputSplit split)
    Called once at initialization.
void initialize(InputSplit split)
    Called once at initialization.
List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas)
    Load multiple records from the given list of RecordMetaData instances.
Record loadFromMetaData(RecordMetaData recordMetaData)
    Load a single record from the given RecordMetaData instance.
    Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
List<Writable> next()
    Get the next record.
Record nextRecord()
    Similar to RecordReader.next(), but returns a Record object, which may include metadata such as the source of the data.
protected void onLocationOpen(URI location)
List<Writable> record(URI uri, DataInputStream dataInputStream)
    Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
void reset()
    Reset record reader iterator.
boolean resetSupported()
void setConf(Configuration conf)
    Set the configuration to be used by this object.
-
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
batchesSupported, getListeners, invokeListeners, next, setListeners, setListeners
-
-
-
-
Field Detail
-
locations
protected URI[] locations
-
splitIndex
protected int splitIndex
-
lineIndex
protected int lineIndex
-
conf
protected Configuration conf
-
initialized
protected boolean initialized
-
charset
protected String charset
-
-
Method Detail
-
initialize
public void initialize(InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Specified by:
initialize in interface RecordReader
- Overrides:
initialize in class BaseRecordReader
- Parameters:
split
- the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
initialize
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Parameters:
conf
- a configuration for initialization
split
- the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
next
public List<Writable> next()
Description copied from interface: RecordReader
Get the next record.
- Returns:
- the next record
-
hasNext
public boolean hasNext()
Description copied from interface: RecordReader
Whether there are any more records.
- Returns:
- true if there are more records, false otherwise
-
onLocationOpen
protected void onLocationOpen(URI location)
-
close
public void close() throws IOException
- Throws:
IOException
-
setConf
public void setConf(Configuration conf)
Description copied from interface: Configurable
Set the configuration to be used by this object.
-
getConf
public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.
-
getLabels
public List<String> getLabels()
Description copied from interface: RecordReader
List of label strings.
- Returns:
- the list of label strings
-
reset
public void reset()
Description copied from interface: RecordReader
Reset record reader iterator
-
resetSupported
public boolean resetSupported()
- Returns:
- True if the record reader can be reset, false otherwise. Note that some record readers cannot be reset - for example, if they are backed by a non-resettable input split (such as certain types of streams)
-
record
public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: RecordReader
Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Throws:
IOException
- if an error occurs during reading from the input stream
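The "should not close the DataInputStream" rule can be sketched with plain JDK classes. In the example below, `StreamLineLoader`, `record(DataInputStream)`, and `demo` are hypothetical names, and treating one line as one record and decoding as UTF-8 are assumptions made for illustration. It is not DataVec's code and it returns `List<String>` rather than `List<Writable>`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the record(URI, DataInputStream) contract (not DataVec code). */
public class StreamLineLoader {

    /**
     * Reads a single line from the stream and returns it as a one-element record.
     * The stream is deliberately left open: closing it is the caller's job.
     */
    public static List<String> record(DataInputStream in) throws IOException {
        // No try-with-resources here: closing the reader would also close 'in'.
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line = reader.readLine();
        if (line == null) throw new IOException("Stream contained no data");
        return Collections.singletonList(line);
    }

    /** Usage example: the caller opens and closes the stream around the call. */
    public static List<String> demo(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return record(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The helper keeps no fields, so calling it does not disturb any iteration state, which matches the "internal state is not modified" part of the description. One caveat of this sketch: `BufferedReader` may read ahead, so the stream position after the call is not guaranteed to sit at the start of the second line.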
-
nextRecord
public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object, which may include metadata such as the source of the data.
- Returns:
- next record
-
loadFromMetaData
public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance.
Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
- Parameters:
recordMetaData
- Metadata for the record that we want to load from
- Returns:
- Single record for the given RecordMetaData instance
- Throws:
IOException
- If I/O error occurs during loading
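The efficiency note for non-splittable text can be made concrete with a small JDK-only sketch: to fetch several lines by index, a single scan collects them all, whereas fetching them one at a time would rescan from the start for each request. `BatchLineLoader` and `loadLines` are hypothetical names, and identifying a record by a zero-based line index is an assumption for illustration; this is not how DataVec represents RecordMetaData.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in: fetch several lines by index in one scan (not DataVec code). */
public class BatchLineLoader {

    /**
     * Scans the text once and returns the requested lines keyed by zero-based line index.
     * Indices past the end of the text are simply absent from the result.
     */
    public static Map<Integer, String> loadLines(String text, Set<Integer> wanted) {
        Map<Integer, String> found = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            int index = 0;
            // Stop early once every requested line has been collected.
            while (found.size() < wanted.size() && (line = reader.readLine()) != null) {
                if (wanted.contains(index)) found.put(index, line);
                index++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory reader
        }
        return found;
    }
}
```

For k requested lines in a text of n lines, this does at most one pass of n reads, while k separate single-line lookups would need up to k passes.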
-
loadFromMetaData
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
- Parameters:
recordMetaDatas
- Metadata for the records that we want to load from
- Returns:
- Multiple records for the given RecordMetaData instances
- Throws:
IOException
- If I/O error occurs during loading
-
-