org.apache.accumulo.core.iterators.user
public class WholeRowIterator extends Object implements SortedKeyValueIterator<Key,Value>
One caveat is that when seeking in the WholeRowIterator using a range that starts at a non-inclusive first key in a row (e.g. seek(new Range(new Key(new Text("row")),false,...),...)), this iterator will skip to the next row. This prevents repeated scanning of the same row when the system automatically creates ranges of that form, which happens when the client calls continueScan or when the tablet server continues a scan after swapping out sources.
To regain the original key/value pairs of the row, call the decodeRow function on the key/value pair that this iterator returned.
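The round trip between a row's key/value pairs and the single encoded pair can be illustrated with a small stand-alone sketch. It is only a model: the class name RowCodecSketch is hypothetical, keys and values are plain strings rather than Key and Value objects, and the length-prefixed layout is an assumption made for illustration, not Accumulo's actual serialization. The real encodeRow and decodeRow also declare IOException, which this sketch wraps in an unchecked exception.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of whole-row encoding; NOT the Accumulo implementation.
final class RowCodecSketch {

    // Packs parallel lists of keys and values into one byte array
    // (assumed layout: entry count, then each key followed by its value).
    static byte[] encodeRow(List<String> keys, List<String> values) {
        if (keys.size() != values.size()) {
            throw new IllegalArgumentException("keys and values must have the same size");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(keys.size());
            for (int i = 0; i < keys.size(); i++) {
                out.writeUTF(keys.get(i));
                out.writeUTF(values.get(i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the sorted key/value pairs of the row from the encoded bytes.
    static SortedMap<String, String> decodeRow(byte[] rowValue) {
        SortedMap<String, String> row = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(rowValue))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                row.put(key, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return row;
    }

    public static void main(String[] args) {
        byte[] encoded = encodeRow(Arrays.asList("cf1:q1", "cf2:q1"), Arrays.asList("v1", "v2"));
        // The decoded map holds the same pairs that went into the encoded row.
        System.out.println(decodeRow(encoded));
    }
}
```

With the real class, the same pattern applies to each entry returned by a scan that has this iterator configured: pass the entry's key and value to decodeRow to obtain the row's original pairs.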
See Also: RowFilter
Constructor and Description |
---|
WholeRowIterator() |
Modifier and Type | Method and Description |
---|---|
static SortedMap<Key,Value> | decodeRow(Key rowKey, Value rowValue) |
SortedKeyValueIterator<Key,Value> | deepCopy(IteratorEnvironment env): Creates a deep copy of this iterator as though seek had not yet been called. |
static Value | encodeRow(List<Key> keys, List<Value> values) |
protected boolean | filter(org.apache.hadoop.io.Text currentRow, List<Key> keys, List<Value> values) |
Key | getTopKey(): Returns top key. |
Value | getTopValue(): Returns top value. |
boolean | hasTop(): Returns true if the iterator has more elements. |
void | init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options, IteratorEnvironment env): Initializes the iterator. |
void | next(): Advances to the next K,V pair. |
void | seek(Range range, Collection<ByteSequence> columnFamilies, boolean inclusive): Seeks to the first key in the Range, restricting the resulting K,V pairs to those with the specified columns. |
public static final SortedMap<Key,Value> decodeRow(Key rowKey, Value rowValue) throws IOException
Throws:
IOException
public static final Value encodeRow(List<Key> keys, List<Value> values) throws IOException
Throws:
IOException
protected boolean filter(org.apache.hadoop.io.Text currentRow, List<Key> keys, List<Value> values)
Parameters:
currentRow - All keys have this in their row portion (do not modify!).
keys - One key for each key in the row, ordered as they are given by the source iterator (do not modify!).
values - One value for each key in keys, ordered to correspond to the ordering in keys (do not modify!).
public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env)
Specified by: deepCopy in interface SortedKeyValueIterator<Key,Value>
Parameters:
env - IteratorEnvironment environment in which iterator is being run.
public Key getTopKey()
Specified by: getTopKey in interface SortedKeyValueIterator<Key,Value>
public Value getTopValue()
Specified by: getTopValue in interface SortedKeyValueIterator<Key,Value>
public boolean hasTop()
Specified by: hasTop in interface SortedKeyValueIterator<Key,Value>
public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options, IteratorEnvironment env) throws IOException
Specified by: init in interface SortedKeyValueIterator<Key,Value>
Parameters:
source - SortedKeyValueIterator source to read data from.
options - Map map of string option names to option values.
env - IteratorEnvironment environment in which iterator is being run.
Throws:
IOException - unused.
public void next() throws IOException
Specified by: next in interface SortedKeyValueIterator<Key,Value>
Throws:
IOException - if an I/O error occurs.
public void seek(Range range, Collection<ByteSequence> columnFamilies, boolean inclusive) throws IOException
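Description copied from interface: SortedKeyValueIterator
Seek may be called multiple times with different parameters after SortedKeyValueIterator.init(org.apache.accumulo.core.iterators.SortedKeyValueIterator, java.util.Map, org.apache.accumulo.core.iterators.IteratorEnvironment) is called.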
Iterators that examine groups of adjacent key/value pairs (e.g. rows) to determine their top key and value should be sure that they properly handle a seek
to a key in the middle of such a group (e.g. the middle of a row). Even if the client always seeks to a range containing an entire group (a,c), the tablet
server could send back a batch of entries corresponding to (a,b], then reseek the iterator to range (b,c) when the scan is continued.
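The reseek behaviour described above, together with the WholeRowIterator caveat about a non-inclusive first key of a row, can be modelled in a few lines. This is a simplified sketch under stated assumptions: WholeRowSeekSketch, Cell and rowsAfterSeek are hypothetical names, a Cell stands in for a Key with only a row and a column, and an empty column represents the first possible key in a row. It does not use or reproduce the Accumulo classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of seeking a row-grouping iterator; NOT the Accumulo implementation.
final class WholeRowSeekSketch {

    // Stand-in for a Key: sorted by row, then by column.
    static final class Cell implements Comparable<Cell> {
        final String row;
        final String column;

        Cell(String row, String column) {
            this.row = row;
            this.column = column;
        }

        @Override
        public int compareTo(Cell other) {
            int byRow = row.compareTo(other.row);
            return byRow != 0 ? byRow : column.compareTo(other.column);
        }
    }

    // Lists the rows emitted after seeking to a range that starts at 'start'.
    // A non-inclusive start at the first key of a row is treated as
    // "this row was already returned", so the whole row is skipped.
    static List<String> rowsAfterSeek(SortedSet<Cell> cells, Cell start, boolean startInclusive) {
        boolean skipStartRow = !startInclusive && start.column.isEmpty();
        List<String> rows = new ArrayList<>();
        for (Cell cell : cells) {
            int cmp = cell.compareTo(start);
            if (cmp < 0 || (cmp == 0 && !startInclusive)) {
                continue; // before the start of the range
            }
            if (skipStartRow && cell.row.equals(start.row)) {
                continue; // row already returned by an earlier batch
            }
            if (rows.isEmpty() || !rows.get(rows.size() - 1).equals(cell.row)) {
                rows.add(cell.row); // first cell seen for a new row
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        SortedSet<Cell> cells = new TreeSet<>();
        cells.add(new Cell("a", "x"));
        cells.add(new Cell("a", "y"));
        cells.add(new Cell("b", "x"));
        cells.add(new Cell("c", "x"));

        // Inclusive start at the first key of row "a": row "a" is emitted.
        System.out.println(rowsAfterSeek(cells, new Cell("a", ""), true));  // prints [a, b, c]
        // Non-inclusive start at the first key of row "a": row "a" is skipped.
        System.out.println(rowsAfterSeek(cells, new Cell("a", ""), false)); // prints [b, c]
    }
}
```

The design point the sketch makes is that an iterator which groups rows has to decide what a start key inside or at the edge of a group means; here a non-inclusive row-start key is read as a continuation marker rather than as a request for the remainder of that row.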
columnFamilies is used, at the lowest level, to determine which data blocks inside of an RFile need to be opened for this iterator. This set of data blocks is also the set of locality groups defined for the given table. If no columnFamilies are provided, the data blocks for all locality groups inside of the correct RFile will be opened and seeked in an attempt to find the correct start key, regardless of the startKey in the range.
In an Accumulo instance in which multiple locality groups exist for a table, it is important to ensure that columnFamilies is properly set to the minimum required column families to ensure that data from separate locality groups is not inadvertently read.
Specified by: seek in interface SortedKeyValueIterator<Key,Value>
Parameters:
range - Range of keys to iterate over.
columnFamilies - Collection of column families to include or exclude.
inclusive - boolean that indicates whether to include (true) or exclude (false) column families.
Throws:
IOException - if an I/O error occurs.
Copyright © 2011-2015 The Apache Software Foundation. All Rights Reserved.