public abstract class AbstractTableInputFormat<T> extends org.apache.flink.api.common.io.RichInputFormat<T,TableInputSplit>
InputFormat to read data from HBase tables.

Modifier and Type | Field and Description |
---|---|
protected byte[] | currentRow |
protected boolean | endReached |
protected static org.slf4j.Logger | LOG |
protected org.apache.hadoop.hbase.client.ResultScanner | resultScanner: HBase iterator wrapper. |
protected org.apache.hadoop.hbase.client.Scan | scan |
protected long | scannedRows |
protected org.apache.hadoop.hbase.client.HTable | table |

Constructor and Description |
---|
AbstractTableInputFormat() |

Modifier and Type | Method and Description |
---|---|
void | close() |
void | closeInputFormat() |
abstract void | configure(org.apache.flink.configuration.Configuration parameters): Creates a Scan object and opens the HTable connection. |
TableInputSplit[] | createInputSplits(int minNumSplits) |
org.apache.flink.core.io.InputSplitAssigner | getInputSplitAssigner(TableInputSplit[] inputSplits) |
protected abstract org.apache.hadoop.hbase.client.Scan | getScanner(): Returns an instance of Scan that retrieves the required subset of records from the HBase table. |
org.apache.flink.api.common.io.statistics.BaseStatistics | getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics cachedStatistics) |
protected abstract String | getTableName(): What table is to be read. |
protected boolean | includeRegionInScan(byte[] startKey, byte[] endKey): Test if the given region is to be included in the scan while splitting the regions of a table. |
protected abstract T | mapResultToOutType(org.apache.hadoop.hbase.client.Result r): HBase returns an instance of Result. |
T | nextRecord(T reuse) |
void | open(TableInputSplit split) |
boolean | reachedEnd() |

protected static final org.slf4j.Logger LOG
protected boolean endReached
protected transient org.apache.hadoop.hbase.client.HTable table
protected transient org.apache.hadoop.hbase.client.Scan scan
protected org.apache.hadoop.hbase.client.ResultScanner resultScanner
protected byte[] currentRow
protected long scannedRows
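protected abstract org.apache.hadoop.hbase.client.Scan getScanner()
Returns an instance of Scan that retrieves the required subset of records from the HBase table.
protected abstract String getTableName()
Returns the name of the table to read. Each instance of a TableInputFormat derivative can read from only a single table.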
protected abstract T mapResultToOutType(org.apache.hadoop.hbase.client.Result r)
HBase returns an instance of Result. This method maps the returned Result instance into the output type T.
Parameters:
r - The Result instance from HBase that needs to be converted
Returns:
An instance of T that contains the data of Result.
public abstract void configure(org.apache.flink.configuration.Configuration parameters)
Creates a Scan object and opens the HTable connection. Both are created here because createInputSplits needs them, and createInputSplits is called before the openInputFormat method. The connection is opened in this method and closed in closeInputFormat().
Parameters:
parameters - The configuration that is to be used
See Also:
Configuration
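
The call order described for configure can be illustrated with a small, self-contained sketch. It does not use the Flink or HBase APIs: LifecycleSketchFormat, mapRow, and the in-memory row list are hypothetical stand-ins for the input format, mapResultToOutType, and the HTable connection, and the whole flow runs in a single process, which a real job would not do. It is only meant to show why the connection has to exist before createInputSplits runs and where it is released.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table input format; this is NOT the Flink or HBase API.
abstract class LifecycleSketchFormat<T> {
    final List<String> calls = new ArrayList<>(); // records the order of lifecycle calls
    private List<String[]> connection;            // stand-in for the HTable connection
    private int position;
    private int end;

    // Stand-in for mapResultToOutType: converts one raw row into the output type.
    protected abstract T mapRow(String[] rawRow);

    // Stand-in for configure: opens the "connection" so that split creation can use it.
    void configure(List<String[]> tableRows) {
        calls.add("configure");
        connection = tableRows;
    }

    // Needs the open connection; each split is a [start, end) range of row indexes.
    int[][] createInputSplits(int minNumSplits) {
        calls.add("createInputSplits");
        if (connection == null) {
            throw new IllegalStateException("configure must run first");
        }
        int n = connection.size();
        int splits = Math.max(1, Math.min(minNumSplits, n));
        int[][] result = new int[splits][];
        for (int i = 0; i < splits; i++) {
            result[i] = new int[] {i * n / splits, (i + 1) * n / splits};
        }
        return result;
    }

    void open(int[] split) {
        calls.add("open");
        position = split[0];
        end = split[1];
    }

    boolean reachedEnd() {
        return position >= end;
    }

    T nextRecord() {
        return mapRow(connection.get(position++));
    }

    void close() {
        calls.add("close");
    }

    // Releases the "connection" that configure opened.
    void closeInputFormat() {
        calls.add("closeInputFormat");
        connection = null;
    }
}

class LifecycleSketch {
    // Drives the format in the order the configure description implies.
    static List<String> readAll(LifecycleSketchFormat<String> format, List<String[]> rows) {
        List<String> out = new ArrayList<>();
        format.configure(rows);
        for (int[] split : format.createInputSplits(2)) {
            format.open(split);
            while (!format.reachedEnd()) {
                out.add(format.nextRecord());
            }
            format.close();
        }
        format.closeInputFormat();
        return out;
    }

    // A minimal "subclass" that maps a two-column row to a key=value string.
    static LifecycleSketchFormat<String> newFormat() {
        return new LifecycleSketchFormat<String>() {
            @Override
            protected String mapRow(String[] rawRow) {
                return rawRow[0] + "=" + rawRow[1];
            }
        };
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"r1", "a"});
        rows.add(new String[] {"r2", "b"});
        rows.add(new String[] {"r3", "c"});
        LifecycleSketchFormat<String> format = newFormat();
        System.out.println(readAll(format, rows));
        System.out.println(format.calls);
    }
}
```

With three rows and two splits, the recorded calls are configure, createInputSplits, open, close, open, close, closeInputFormat, in that order.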
public void open(TableInputSplit split) throws IOException
Throws:
IOException
public T nextRecord(T reuse) throws IOException
Throws:
IOException
public boolean reachedEnd() throws IOException
Throws:
IOException
public void close() throws IOException
Throws:
IOException
public void closeInputFormat() throws IOException
Overrides:
closeInputFormat in class org.apache.flink.api.common.io.RichInputFormat<T,TableInputSplit>
Throws:
IOException
public TableInputSplit[] createInputSplits(int minNumSplits) throws IOException
Throws:
IOException
protected boolean includeRegionInScan(byte[] startKey, byte[] endKey)
Test if the given region is to be included in the scan while splitting the regions of a table.
Parameters:
startKey - Start key of the region
endKey - End key of the region
public org.apache.flink.core.io.InputSplitAssigner getInputSplitAssigner(TableInputSplit[] inputSplits)
public org.apache.flink.api.common.io.statistics.BaseStatistics getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics cachedStatistics)
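
As an illustration of the region test performed by includeRegionInScan, the following sketch keeps only regions whose key range overlaps a wanted range. It is a standalone approximation in plain JDK code: RegionFilterSketch, wantedStart, and wantedStop are hypothetical names, and a real subclass would override includeRegionInScan directly and would more likely compare keys with HBase's own byte utilities. It assumes that region ranges are start-inclusive and end-exclusive and that an empty key marks an unbounded side, as is the HBase convention for the first and last region of a table.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink or HBase: decides whether a region's key range
// [startKey, endKey) overlaps a wanted key range [wantedStart, wantedStop).
class RegionFilterSketch {
    private final byte[] wantedStart;
    private final byte[] wantedStop;

    RegionFilterSketch(String wantedStart, String wantedStop) {
        this.wantedStart = wantedStart.getBytes(StandardCharsets.UTF_8);
        this.wantedStop = wantedStop.getBytes(StandardCharsets.UTF_8);
    }

    // Same shape as includeRegionInScan(byte[] startKey, byte[] endKey).
    // An empty key is treated as unbounded on that side (assumption, see the text above).
    boolean includeRegionInScan(byte[] startKey, byte[] endKey) {
        boolean startsBeforeWantedStop =
                startKey.length == 0 || compareUnsigned(startKey, wantedStop) < 0;
        boolean endsAfterWantedStart =
                endKey.length == 0 || compareUnsigned(endKey, wantedStart) > 0;
        return startsBeforeWantedStop && endsAfterWantedStart;
    }

    // Lexicographic comparison that treats each byte as an unsigned value.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

For a wanted range from "m" to "t", a region spanning "g" to "p" is included, while a region that ends at "m" or starts at "t" is skipped because the end of each range is exclusive.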
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.