org.apache.hadoop.hbase.client
Class HTableUtil

java.lang.Object
  extended by org.apache.hadoop.hbase.client.HTableUtil

@InterfaceAudience.Public
@InterfaceStability.Stable
public class HTableUtil
extends Object

Utility class for HTable.


Constructor Summary
HTableUtil()
           
 
Method Summary
static void bucketRsBatch(HTable htable, List<Row> rows)
          Processes a List of Rows (Put, Delete) and writes them to an HTable instance in RegionServer buckets via the htable.batch method.
static void bucketRsPut(HTable htable, List<Put> puts)
          Processes a List of Puts and writes them to an HTable instance in RegionServer buckets via the htable.put method.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTableUtil

public HTableUtil()

Method Detail

bucketRsPut

public static void bucketRsPut(HTable htable,
                               List<Put> puts)
                        throws IOException
Processes a List of Puts and writes them to an HTable instance in RegionServer buckets via the htable.put method. This will utilize the writeBuffer, thus the writeBuffer flush frequency may be tuned accordingly via htable.setWriteBufferSize.

The benefit of submitting Puts in this manner is to minimize the number of RegionServer RPCs in each flush.

Assumption #1: Regions have been pre-created for the table. If they haven't, then all of the Puts will go to the same region, defeating the purpose of this utility method. See the Apache HBase book for an explanation of how to do this.
Assumption #2: Row-keys are not monotonically increasing. See the Apache HBase book for an explanation of this problem.
Assumption #3: The input list of Puts is big enough to be useful (in the thousands or more). The intent of this method is to process larger chunks of data.
Assumption #4: htable.setAutoFlush(false) has been set. This is a requirement to use the writeBuffer.

Parameters:
htable - HTable instance for target HBase table
puts - List of Put instances
Throws:
IOException - if a remote or network exception occurs
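A minimal usage sketch under the four assumptions above. The table name "mytable", column family "cf", and qualifier "q" are hypothetical; this assumes a running HBase cluster, a pre-split table, and the 0.94-era client API on the classpath, so it is illustrative rather than directly runnable here.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTableUtil;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BucketRsPutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable htable = new HTable(conf, "mytable");   // hypothetical, pre-split table (Assumption #1)
    try {
      htable.setAutoFlush(false);                  // required to use the writeBuffer (Assumption #4)
      htable.setWriteBufferSize(4 * 1024 * 1024);  // tunes writeBuffer flush frequency

      List<Put> puts = new ArrayList<Put>();
      for (int i = 0; i < 10000; i++) {            // larger chunks of data (Assumption #3)
        // Real row-keys should be salted or hashed so they are not
        // monotonically increasing (Assumption #2).
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
        puts.add(put);
      }

      // Writes the Puts in RegionServer buckets, minimizing RPCs per flush.
      HTableUtil.bucketRsPut(htable, puts);
    } finally {
      htable.close();
    }
  }
}
```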

bucketRsBatch

public static void bucketRsBatch(HTable htable,
                                 List<Row> rows)
                          throws IOException
Processes a List of Rows (Put, Delete) and writes them to an HTable instance in RegionServer buckets via the htable.batch method.

The benefit of submitting Rows in this manner is to minimize the number of RegionServer RPCs: one batch RPC is issued per RegionServer.

Assumption #1: Regions have been pre-created for the table. If they haven't, then all of the Puts will go to the same region, defeating the purpose of this utility method. See the Apache HBase book for an explanation of how to do this.
Assumption #2: Row-keys are not monotonically increasing. See the Apache HBase book for an explanation of this problem.
Assumption #3: The input list of Rows is big enough to be useful (in the thousands or more). The intent of this method is to process larger chunks of data.

This method accepts a List of Row objects because the underlying htable.batch method does.

Parameters:
htable - HTable instance for target HBase table
rows - List of Row instances
Throws:
IOException - if a remote or network exception occurs
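The per-RegionServer bucketing both methods perform can be sketched in plain Java. Here getServerFor is a hypothetical stand-in for the client's real region-location lookup, which maps each row-key to the RegionServer hosting its region; this is only the grouping idea, not the HTableUtil implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BucketingSketch {

  // Hypothetical stand-in for the region-location lookup: maps a row-key
  // to the name of the server hosting its region. Here we pretend three
  // servers split the key space by hash.
  static String getServerFor(String rowKey) {
    return "server-" + (Math.abs(rowKey.hashCode()) % 3);
  }

  // Group row-keys into one bucket per server; each bucket would then be
  // submitted as a single batch -- the core idea behind bucketRsBatch.
  static Map<String, List<String>> bucket(List<String> rowKeys) {
    Map<String, List<String>> buckets = new HashMap<String, List<String>>();
    for (String key : rowKeys) {
      String server = getServerFor(key);
      List<String> bucket = buckets.get(server);
      if (bucket == null) {
        bucket = new ArrayList<String>();
        buckets.put(server, bucket);
      }
      bucket.add(key);
    }
    return buckets;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<String>();
    for (int i = 0; i < 1000; i++) {
      keys.add("row-" + i);
    }
    Map<String, List<String>> buckets = bucket(keys);
    int total = 0;
    for (List<String> b : buckets.values()) {
      total += b.size();
    }
    System.out.println(buckets.size() + " buckets, " + total + " rows");
  }
}
```

With pre-created regions spread across servers, each bucket becomes one RPC instead of one RPC per row; with a single region (Assumption #1 violated), everything lands in one bucket and the grouping buys nothing.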


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.