@InterfaceAudience.Public
@InterfaceStability.Stable
public class Scan extends OperationWithAttributes
All operations are identical to Get with the exception of instantiation. Rather than specifying a single row, an optional startRow and stopRow may be defined. If rows are not specified, the Scanner will iterate over all rows.
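The method details on this page describe startRow as inclusive and stopRow as exclusive, and note that appending a trailing 0 byte flips either bound. The following is a minimal, stdlib-only sketch of that behaviour. It is not HBase code: the class `RowRangeSketch` and its methods are hypothetical, and it assumes that row keys are ordered byte by byte as unsigned values and that an empty bound means "unbounded".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative model of Scan row-range semantics; hypothetical, not part of HBase. */
class RowRangeSketch {

    /** Compares two row keys byte by byte, treating each byte as unsigned (assumed ordering). */
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length; // a shorter key sorts before any key it is a prefix of
    }

    /** Returns a copy of row with one 0x00 byte appended: the smallest key that sorts after row. */
    static byte[] appendZeroByte(byte[] row) {
        return Arrays.copyOf(row, row.length + 1); // copyOf pads the new slot with zero
    }

    /** True if row lies in [startRow, stopRow); an empty bound is treated as unbounded (assumption). */
    static boolean inRange(byte[] row, byte[] startRow, byte[] stopRow) {
        boolean atOrAfterStart = startRow.length == 0 || compareRows(row, startRow) >= 0;
        boolean beforeStop = stopRow.length == 0 || compareRows(row, stopRow) < 0;
        return atOrAfterStart && beforeStop;
    }

    public static void main(String[] args) {
        byte[] a = "row-a".getBytes(StandardCharsets.UTF_8);
        byte[] c = "row-c".getBytes(StandardCharsets.UTF_8);

        System.out.println(inRange(a, a, c));                 // start row itself is included
        System.out.println(inRange(c, a, c));                 // stop row itself is excluded
        System.out.println(inRange(c, a, appendZeroByte(c))); // trailing 0 byte makes stop inclusive
        System.out.println(inRange(a, appendZeroByte(a), c)); // trailing 0 byte makes start exclusive
    }
}
```

With a real Scan, the same trick would amount to passing the extended key to setStartRow or setStopRow.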
To scan everything for each row, instantiate a Scan object.
To modify scanner caching for just this scan, use setCaching. If caching is NOT set, the caching value of the hosting HTable is used; see HTable.setScannerCaching(int). In addition to row caching, it is possible to specify a maximum result size using setMaxResultSize(long). When both are used, a single server request is limited by either the number of rows or the maximum result size, whichever limit is reached first.
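The interaction of the two limits can be pictured with a small stdlib-only model. This is a sketch, not the server's implementation: the class `ScanLimitSketch` is hypothetical, and it assumes that the size check happens before each row is added, so the row that crosses the byte limit is still returned.

```java
/** Illustrative model of "rows or bytes, whichever limit comes first"; hypothetical, not HBase code. */
class ScanLimitSketch {

    /**
     * Returns how many rows one server response would carry in this model.
     *
     * @param rowSizes      size in bytes of each upcoming row, in scan order
     * @param caching       maximum number of rows per response (as set via setCaching)
     * @param maxResultSize maximum number of bytes per response (as set via setMaxResultSize)
     */
    static int rowsInOneResponse(long[] rowSizes, int caching, long maxResultSize) {
        int rows = 0;
        long bytes = 0;
        // Stop as soon as either limit is reached; the size is checked before adding a row.
        while (rows < rowSizes.length && rows < caching && bytes < maxResultSize) {
            bytes += rowSizes[rows];
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        long[] sizes = {400, 400, 400, 400, 400};

        // Row limit is reached first: 2 rows, 800 bytes.
        System.out.println(rowsInOneResponse(sizes, 2, 10_000));

        // Byte limit is reached first: the third row pushes the total past 1000 bytes.
        System.out.println(rowsInOneResponse(sizes, 100, 1_000));
    }
}
```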
To further define the scope of what to get when scanning, call the additional methods outlined below.
To get all columns from specific families, execute addFamily for each family to retrieve.
To get specific columns, execute addColumn for each column to retrieve.
To only retrieve columns within a specific range of version timestamps, execute setTimeRange.
To only retrieve columns with a specific timestamp, execute setTimeStamp.
To limit the number of versions of each column to be returned, execute setMaxVersions.
To limit the maximum number of values returned for each call to next(), execute setBatch.
To add a filter, execute setFilter.
Expert: To explicitly disable server-side block caching for this scan, execute setCacheBlocks(boolean).
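The method details for addFamily and addColumn state that each call overrides earlier calls of the other kind for the same family. One way to picture this is a family map in which a missing qualifier set means "all columns". The sketch below is a stdlib-only model under that assumption; the class `FamilyMapSketch`, its `selects` helper, and the null-means-all convention are illustrative and not taken from the HBase source.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative model of column selection on a scan; hypothetical, not HBase code. */
class FamilyMapSketch {

    /** Orders byte arrays by content so they can be used as map keys and set members. */
    static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    /** family -> requested qualifiers; a null value stands for "all columns" (assumption). */
    final Map<byte[], NavigableSet<byte[]>> familyMap = new TreeMap<>(BYTES);

    /** Requests every column of the family, discarding qualifiers added earlier. */
    FamilyMapSketch addFamily(byte[] family) {
        familyMap.put(family, null);
        return this;
    }

    /** Requests one column; an earlier whole-family request is narrowed to explicit columns. */
    FamilyMapSketch addColumn(byte[] family, byte[] qualifier) {
        NavigableSet<byte[]> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            qualifiers = new TreeSet<>(BYTES);
        }
        qualifiers.add(qualifier);
        familyMap.put(family, qualifiers);
        return this;
    }

    /** True if the column would be returned under the current selection. */
    boolean selects(byte[] family, byte[] qualifier) {
        if (familyMap.isEmpty()) {
            return true; // nothing specified: scan everything
        }
        if (!familyMap.containsKey(family)) {
            return false;
        }
        NavigableSet<byte[]> qualifiers = familyMap.get(family);
        return qualifiers == null || qualifiers.contains(qualifier);
    }

    public static void main(String[] args) {
        byte[] cf = "cf".getBytes(StandardCharsets.UTF_8);
        byte[] q1 = "q1".getBytes(StandardCharsets.UTF_8);
        byte[] q2 = "q2".getBytes(StandardCharsets.UTF_8);

        FamilyMapSketch scan = new FamilyMapSketch().addFamily(cf).addColumn(cf, q1);
        System.out.println(scan.selects(cf, q1)); // explicitly requested
        System.out.println(scan.selects(cf, q2)); // whole-family request was overridden
    }
}
```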
Modifier and Type | Field and Description |
---|---|
static String | SCAN_ATTRIBUTES_METRICS_DATA |
static String | SCAN_ATTRIBUTES_METRICS_ENABLE |
static String | SCAN_ATTRIBUTES_TABLE_NAME |
Fields inherited from class OperationWithAttributes: ID_ATRIBUTE
Constructor | Description |
---|---|
Scan() | Create a Scan operation across all rows. |
Scan(byte[] startRow) | Create a Scan operation starting at the specified row. |
Scan(byte[] startRow, byte[] stopRow) | Create a Scan operation for the range of rows specified. |
Scan(byte[] startRow, Filter filter) | |
Scan(Get get) | Builds a scan object with the same specs as get. |
Scan(Scan scan) | Creates a new instance of this class while copying all values. |
Modifier and Type | Method | Description |
---|---|---|
Scan | addColumn(byte[] family, byte[] qualifier) | Get the column from the specified family with the specified qualifier. |
Scan | addFamily(byte[] family) | Get all columns from the specified family. |
boolean | doLoadColumnFamiliesOnDemand() | Get the logical value indicating whether on-demand CF loading should be allowed. |
int | getBatch() | |
boolean | getCacheBlocks() | Get whether blocks should be cached for this Scan. |
int | getCaching() | |
byte[][] | getFamilies() | |
Map<byte[],NavigableSet<byte[]>> | getFamilyMap() | Getting the familyMap |
Filter | getFilter() | |
Map<String,Object> | getFingerprint() | Compile the table and column family (i.e. |
IsolationLevel | getIsolationLevel() | |
Boolean | getLoadColumnFamiliesOnDemandValue() | Get the raw loadColumnFamiliesOnDemand setting; if it's not set, can be null. |
long | getMaxResultSize() | |
int | getMaxResultsPerColumnFamily() | |
int | getMaxVersions() | |
int | getRowOffsetPerColumnFamily() | Method for retrieving the scan's offset per row per column family (#kvs to be skipped) |
byte[] | getStartRow() | |
byte[] | getStopRow() | |
TimeRange | getTimeRange() | |
boolean | hasFamilies() | |
boolean | hasFilter() | |
boolean | isGetScan() | |
boolean | isRaw() | |
boolean | isSmall() | Get whether this scan is a small scan |
int | numFamilies() | |
void | setBatch(int batch) | Set the maximum number of values to return for each call to next() |
void | setCacheBlocks(boolean cacheBlocks) | Set whether blocks should be cached for this Scan. |
void | setCaching(int caching) | Set the number of rows for caching that will be passed to scanners. |
Scan | setFamilyMap(Map<byte[],NavigableSet<byte[]>> familyMap) | Setting the familyMap |
Scan | setFilter(Filter filter) | Apply the specified server-side filter when performing the Scan. |
void | setIsolationLevel(IsolationLevel level) | |
void | setLoadColumnFamiliesOnDemand(boolean value) | Set the value indicating whether loading CFs on demand should be allowed (cluster default is false). |
void | setMaxResultSize(long maxResultSize) | Set the maximum result size. |
void | setMaxResultsPerColumnFamily(int limit) | Set the maximum number of values to return per row per Column Family |
Scan | setMaxVersions() | Get all available versions. |
Scan | setMaxVersions(int maxVersions) | Get up to the specified number of versions of each column. |
void | setRaw(boolean raw) | Enable/disable "raw" mode for this scan. |
void | setRowOffsetPerColumnFamily(int offset) | Set offset for the row per Column Family. |
void | setSmall(boolean small) | Set whether this scan is a small scan |
Scan | setStartRow(byte[] startRow) | Set the start row of the scan. |
Scan | setStopRow(byte[] stopRow) | Set the stop row. |
Scan | setTimeRange(long minStamp, long maxStamp) | Get versions of columns only within the specified timestamp range, [minStamp, maxStamp). |
Scan | setTimeStamp(long timestamp) | Get versions of columns with the specified timestamp. |
Map<String,Object> | toMap(int maxCols) | Compile the details beyond the scope of getFingerprint (row, columns, timestamps, etc.) into a Map along with the fingerprinted information. |
Methods inherited from class OperationWithAttributes: getAttribute, getAttributeSize, getAttributesMap, getId, setAttribute, setId
public static final String SCAN_ATTRIBUTES_METRICS_ENABLE
public static final String SCAN_ATTRIBUTES_METRICS_DATA
public static final String SCAN_ATTRIBUTES_TABLE_NAME
public Scan()
public Scan(byte[] startRow, Filter filter)
public Scan(byte[] startRow)
If the specified row does not exist, the Scanner will start from the next closest row after the specified row.
startRow - row to start scanner at or after
public Scan(byte[] startRow, byte[] stopRow)
startRow - row to start scanner at or after (inclusive)
stopRow - row to stop scanner before (exclusive)
public Scan(Scan scan) throws IOException
scan - The scan instance to copy from.
IOException - When copying the values fails.
public Scan(Get get)
get - get to model scan after
public boolean isGetScan()
public Scan addFamily(byte[] family)
Overrides previous calls to addColumn for this family.
family - family name
public Scan addColumn(byte[] family, byte[] qualifier)
Overrides previous calls to addFamily for this family.
family - family name
qualifier - column qualifier
public Scan setTimeRange(long minStamp, long maxStamp) throws IOException
minStamp - minimum timestamp value, inclusive
maxStamp - maximum timestamp value, exclusive
IOException - if invalid time range
See also: setMaxVersions(), setMaxVersions(int)
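The half-open interval [minStamp, maxStamp) used by setTimeRange can be illustrated with a few lines of plain Java. This is a sketch only: `TimeRangeSketch` is a hypothetical class, "invalid time range" is assumed here to mean maxStamp smaller than minStamp, and modelling a single timestamp as the window [timestamp, timestamp + 1) is likewise an assumption rather than something this page states.

```java
/** Illustrative model of a half-open timestamp window; hypothetical, not HBase code. */
class TimeRangeSketch {
    final long minStamp; // inclusive lower bound
    final long maxStamp; // exclusive upper bound

    TimeRangeSketch(long minStamp, long maxStamp) {
        // Assumed validity rule; the real method is documented to throw IOException instead.
        if (maxStamp < minStamp) {
            throw new IllegalArgumentException("maxStamp is smaller than minStamp");
        }
        this.minStamp = minStamp;
        this.maxStamp = maxStamp;
    }

    /** Window matching exactly one timestamp (ignores overflow at Long.MAX_VALUE). */
    static TimeRangeSketch exactly(long timestamp) {
        return new TimeRangeSketch(timestamp, timestamp + 1);
    }

    /** True if the cell timestamp falls in [minStamp, maxStamp). */
    boolean contains(long cellTimestamp) {
        return cellTimestamp >= minStamp && cellTimestamp < maxStamp;
    }

    public static void main(String[] args) {
        TimeRangeSketch range = new TimeRangeSketch(100L, 200L);
        System.out.println(range.contains(100L)); // lower bound is included
        System.out.println(range.contains(200L)); // upper bound is excluded
        System.out.println(TimeRangeSketch.exactly(150L).contains(150L));
    }
}
```

To include cells stamped exactly at an upper bound T in such a window, the window would need to end at T + 1.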
public Scan setTimeStamp(long timestamp)
timestamp - version timestamp
See also: setMaxVersions(), setMaxVersions(int)
public Scan setStartRow(byte[] startRow)
startRow - row to start scan on (inclusive)
Note: To make startRow exclusive, add a trailing 0 byte.
public Scan setStopRow(byte[] stopRow)
stopRow - row to end at (exclusive)
Note: To make stopRow inclusive, add a trailing 0 byte.
public Scan setMaxVersions()
public Scan setMaxVersions(int maxVersions)
maxVersions - maximum versions for each column
public void setBatch(int batch)
batch - the maximum number of values
public void setMaxResultsPerColumnFamily(int limit)
limit - the maximum number of values returned / row / CF
public void setRowOffsetPerColumnFamily(int offset)
offset - the number of kvs that will be skipped.
public void setCaching(int caching)
If not set, the default setting from HTable.getScannerCaching() will apply. Higher caching values will enable faster scanners but will use more memory.
caching - the number of rows for caching
public long getMaxResultSize()
See also: setMaxResultSize(long)
public void setMaxResultSize(long maxResultSize)
maxResultSize - The maximum result size in bytes.
public Scan setFilter(Filter filter)
filter - filter to run on the server
public Scan setFamilyMap(Map<byte[],NavigableSet<byte[]>> familyMap)
familyMap - map of family to qualifier
public Map<byte[],NavigableSet<byte[]>> getFamilyMap()
public int numFamilies()
public boolean hasFamilies()
public byte[][] getFamilies()
public byte[] getStartRow()
public byte[] getStopRow()
public int getMaxVersions()
public int getBatch()
public int getMaxResultsPerColumnFamily()
public int getRowOffsetPerColumnFamily()
public int getCaching()
public TimeRange getTimeRange()
public Filter getFilter()
public boolean hasFilter()
public void setCacheBlocks(boolean cacheBlocks)
This is true by default. When true, default settings of the table and family are used (this will never override caching blocks if the block cache is disabled for that family or entirely).
cacheBlocks - if false, default settings are overridden and blocks will not be cached
public boolean getCacheBlocks()
public void setLoadColumnFamiliesOnDemand(boolean value)
public Boolean getLoadColumnFamiliesOnDemandValue()
public boolean doLoadColumnFamiliesOnDemand()
public Map<String,Object> getFingerprint()
getFingerprint in class Operation
public Map<String,Object> toMap(int maxCols)
public void setRaw(boolean raw)
raw - True/False to enable/disable "raw" mode.
public boolean isRaw()
public void setIsolationLevel(IsolationLevel level)
public IsolationLevel getIsolationLevel()
public void setSmall(boolean small)
A small scan should use pread, while a big scan can use seek + read. Seek + read is fast but can cause two problems: (1) resource contention and (2) too much network I/O. See [89-fb] "Using pread for non-compaction read request", https://issues.apache.org/jira/browse/HBASE-7266. In addition, when this is set to true, openScanner, next, and closeScanner are performed in one RPC call, which gives better performance for small scans [HBASE-9488]. Generally, if the scan range is within one data block (64KB), it can be considered a small scan.
small -
public boolean isSmall()
Copyright © 2013 The Apache Software Foundation. All Rights Reserved.