A simple abstraction over the HBaseContext.foreachPartition method.
It lets a user take an RDD, generate checkAndPuts from its values, and send them to HBase. The complexity of managing the Connection is removed from the developer.
Original RDD with data to iterate over
The name of the table to put into
Function to convert a value in the RDD to an HBase checkAndPut
Whether autoFlush should be turned on
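The check-then-write behaviour that a checkAndPut implies can be modelled without a cluster. The sketch below is an illustration only: `CheckAndPutSketch`, `CellKey`, `CheckAndPut`, and `applyAll` are hypothetical names, and an in-memory map stands in for the HBase table. It is not the HBaseContext implementation.

```scala
import scala.collection.mutable

object CheckAndPutSketch {
  // Hypothetical stand-in for one cell address: (row, column).
  type CellKey = (String, String)

  // Hypothetical record: the cell to check, the value it must currently hold
  // (None means "the cell must be absent"), and the new value to write.
  final case class CheckAndPut(key: CellKey, expected: Option[String], newValue: String)

  // Converts every value with f, applies the write only when the current cell
  // content matches the expectation, and reports success per value.
  def applyAll[T](values: Seq[T], table: mutable.Map[CellKey, String])(
      f: T => CheckAndPut): Seq[Boolean] =
    values.map { v =>
      val cap = f(v)
      val matches = table.get(cap.key) == cap.expected
      if (matches) table(cap.key) = cap.newValue
      matches
    }
}
```

In this model a record whose expectation does not match leaves the table untouched, which is the property a conversion function for checkAndPuts relies on.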
A simple abstraction over the HBaseContext.foreachPartition method.
It lets a user take an RDD, generate checkAndDeletes from its values, and send them to HBase. The complexity of managing the Connection is removed from the developer.
Original RDD with data to iterate over
The name of the table to delete from
Function to convert a value in the RDD to an HBase Delete
A simple abstraction over the HBaseContext.foreachPartition method.
It lets a user take an RDD, generate Deletes from its values, and send them to HBase. The complexity of managing the Connection is removed from the developer.
Original RDD with data to iterate over
The name of the table to delete from
Function to convert a value in the RDD to an HBase Delete
The number of Deletes to batch before sending to HBase
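The batch-size parameter can be pictured as grouping the converted operations before each send. The following is a minimal sketch under that assumption; `BatchingSketch`, `sendInBatches`, and the `send` callback are hypothetical and stand in for whatever the library does when it ships a batch to HBase.

```scala
object BatchingSketch {
  // Converts each value to an operation, groups the operations into batches of
  // at most batchSize, hands each batch to `send` in order, and returns the
  // number of batches that were sent.
  def sendInBatches[T, Op](values: Iterator[T], batchSize: Int)(toOp: T => Op)(
      send: Seq[Op] => Unit): Int = {
    require(batchSize > 0, "batchSize must be positive")
    var batches = 0
    values.map(toOp).grouped(batchSize).foreach { batch =>
      send(batch)
      batches += 1
    }
    batches
  }
}
```

With five values and a batch size of two, `send` is called three times, and the last call receives the single leftover operation.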
A simple abstraction over the HBaseContext.mapPartition method.
It lets a user take an RDD and generate a new RDD based on Gets and the results they bring back from HBase.
The name of the table to get from
Original RDD with data to iterate over
Function to convert a value in the RDD to an HBase Get
Function to convert the HBase Result object into whatever the user wants to put in the resulting RDD
New RDD that is created by the Gets to HBase
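The two functions described above compose in a fixed order: each value becomes a Get, the Get produces a Result, and the Result is converted into the output element. A dependency-free sketch of that flow follows; `BulkGetSketch` and its `Get` and `Result` case classes are hypothetical stand-ins, not the HBase classes of the same name, and an immutable map plays the role of the table.

```scala
object BulkGetSketch {
  // Hypothetical stand-ins for an HBase Get and Result.
  final case class Get(row: String)
  final case class Result(row: String, value: Option[String])

  // For every input value: build a Get, look the row up in the in-memory
  // table, wrap the outcome in a Result, and convert it to the output type.
  def bulkGet[T, U](table: Map[String, String], values: Seq[T])(
      makeGet: T => Get,
      convertResult: Result => U): Seq[U] =
    values.map { v =>
      val get = makeGet(v)
      convertResult(Result(get.row, table.get(get.row)))
    }
}
```

The conversion function also sees rows that were not found (here as `None`), so it is the natural place to decide how missing rows appear in the output.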
A simple abstraction over the HBaseContext.foreachPartition method.
It lets a user take an RDD, generate Increments from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer.
Original RDD with data to iterate over
The name of the table to increment
Function to convert a value in the RDD to an HBase Increment
The number of increments to batch before sending to HBase
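As an illustration of what a run of increments does to counter cells, the sketch below applies them to an in-memory counter table, batch by batch. `IncrementSketch`, its `Increment` case class, and `bulkIncrement` are hypothetical names chosen for this example, and treating a missing counter as zero is a modelling assumption.

```scala
import scala.collection.mutable

object IncrementSketch {
  // Hypothetical stand-in for an increment of a single counter cell.
  final case class Increment(row: String, amount: Long)

  // Converts each value to an Increment, applies the increments batch by batch
  // to the in-memory counters, and returns the number of batches processed.
  def bulkIncrement[T](values: Seq[T], counters: mutable.Map[String, Long], batchSize: Int)(
      f: T => Increment): Int = {
    require(batchSize > 0, "batchSize must be positive")
    val batches = values.map(f).grouped(batchSize).toList
    for (batch <- batches; inc <- batch) {
      // Counters that do not exist yet are treated as 0 in this model.
      counters(inc.row) = counters.getOrElse(inc.row, 0L) + inc.amount
    }
    batches.size
  }
}
```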
A simple abstraction over the HBaseContext.foreachPartition method.
It lets a user take an RDD, generate Puts from its values, and send them to HBase. The complexity of managing the HConnection is removed from the developer.
Original RDD with data to iterate over
The name of the table to put into
Function to convert a value in the RDD to an HBase Put
Whether autoFlush should be turned on
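The effect of the autoFlush flag can be pictured with a small write-buffer model: with autoFlush on, every put is sent on its own; with it off, puts accumulate and go out together on flush. This is a simplified sketch with hypothetical names (`AutoFlushSketch`, `BufferedTable`); a real client would typically also flush when its buffer fills, which is not modelled here.

```scala
import scala.collection.mutable.ArrayBuffer

object AutoFlushSketch {
  // Hypothetical stand-in for a table client with a write buffer.
  final class BufferedTable(autoFlush: Boolean) {
    private val buffer = ArrayBuffer.empty[String]
    // Each element is one simulated round trip to the server.
    val sent: ArrayBuffer[List[String]] = ArrayBuffer.empty

    def put(row: String): Unit = {
      buffer += row
      if (autoFlush) flush()
    }

    def flush(): Unit =
      if (buffer.nonEmpty) {
        sent += buffer.toList
        buffer.clear()
      }
  }

  // Converts each value to a row, writes it, and returns the simulated round trips.
  def bulkPut[T](values: Seq[T], autoFlush: Boolean)(f: T => String): List[List[String]] = {
    val table = new BufferedTable(autoFlush)
    values.foreach(v => table.put(f(v)))
    table.flush() // send whatever is still buffered
    table.sent.toList
  }
}
```

In this model, three values produce three round trips with autoFlush on and a single round trip with it off.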
A simple enrichment of the traditional Spark RDD foreachPartition. This function differs from the original in that it offers the developer access to an already connected Connection object.
Note: Do not close the Connection object. All Connection management is handled outside this method.
Original RDD with data to iterate over
Function that is given an iterator over the RDD values and a Connection object to interact with HBase
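The division of labour described in the note, where the caller uses the connection and the framework owns its life cycle, resembles the loan pattern. The sketch below models it with plain collections. `ManagedConnectionSketch` and its `Connection` class are hypothetical stand-ins, and opening one connection per partition is an assumption of the model, not a statement about how HBaseContext manages connections.

```scala
import scala.collection.mutable.ArrayBuffer

object ManagedConnectionSketch {
  // Hypothetical stand-in for a connection; it records what was written.
  final class Connection {
    val written = ArrayBuffer.empty[String]
    var closed = false
    def put(row: String): Unit = {
      require(!closed, "connection already closed")
      written += row
    }
    def close(): Unit = closed = true
  }

  // Lends a connected Connection to f for each partition and closes it
  // afterwards, so f never has to (and should not) close it itself.
  def foreachPartition[T](partitions: Seq[Seq[T]], connect: () => Connection)(
      f: (Iterator[T], Connection) => Unit): Unit =
    partitions.foreach { part =>
      val conn = connect()
      try f(part.iterator, conn)
      finally conn.close()
    }
}
```

Because the wrapper closes the connection in `finally`, a user function that closed it early would break later writes in the same partition, which is why the note asks callers to leave it open.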
A simple enrichment of the traditional Spark Streaming DStream foreach. This function differs from the original in that it offers the developer access to an already connected HConnection object.
Note: Do not close the HConnection object. All HConnection management is handled outside this method.
Original DStream with data to iterate over
Function that is given an iterator over the DStream values and an HConnection object to interact with HBase
An overloaded version of HBaseContext.hbaseRDD that predefines the type of the output RDD.
The name of the table to scan
The HBase Scan object to use to read data from HBase
New RDD with results from scan
This function uses the native HBase TableInputFormat with the given Scan object to generate a new RDD.
The name of the table to scan
The HBase Scan object to use to read data from HBase
Function to convert a Result object from HBase into what the user wants in the final generated RDD
New RDD with results from scan
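The combination of a scan and a conversion function can be sketched on a sorted in-memory table: the scan selects a row range in key order and the function maps each row to the desired output element. Everything named here (`ScanSketch`, its `Scan` case class, the `TreeMap` table) is a hypothetical stand-in, and the inclusive start row and exclusive stop row are modelling assumptions.

```scala
import scala.collection.immutable.TreeMap

object ScanSketch {
  // Hypothetical stand-in for a scan over a row range:
  // startRow is inclusive and stopRow is exclusive in this model.
  final case class Scan(startRow: String, stopRow: String)

  // Reads the selected rows in key order and converts each (row, value) pair.
  def scan[U](table: TreeMap[String, String], s: Scan)(f: ((String, String)) => U): List[U] =
    table.range(s.startRow, s.stopRow).iterator.map(f).toList
}
```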
A simple enrichment of the traditional Spark RDD mapPartition. This function differs from the original in that it offers the developer access to an already connected HConnection object.
Note: Do not close the HConnection object. All HConnection management is handled outside this method.
Note: Make sure to partition correctly to avoid memory issues when getting data from HBase.
Original RDD with data to iterate over
Function that is given an iterator over the RDD values and an HConnection object to interact with HBase
Returns a new RDD generated by the user-defined function, just like a normal mapPartition
A simple abstraction over the bulkCheckDelete method.
It lets a user take a DStream, generate checkAndDeletes from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer.
Original DStream with data to iterate over
The name of the table to delete from
Function to convert a value in the DStream to an HBase Delete
A simple abstraction over the bulkCheckAndPut method.
It lets a user take a DStream, generate checkAndPuts from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer.
Original DStream with data to iterate over
The name of the table to checkAndPut into
Function to convert a value in the DStream to an HBase checkAndPut
Whether autoFlush should be turned on
A simple abstraction over the streamBulkMutation method.
It lets a user take a DStream, generate Deletes from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer.
Original DStream with data to iterate over
The name of the table to delete from
Function to convert a value in the DStream to an HBase Delete
The number of Deletes to batch before sending to HBase
A simple abstraction over the HBaseContext.streamMap method.
It lets a user take a DStream and generate a new DStream based on Gets and the results they bring back from HBase.
The name of the table to get from
Original DStream with data to iterate over
Function to convert a value in the DStream to an HBase Get
Function to convert the HBase Result object into whatever the user wants to put in the resulting DStream
New DStream that is created by the Gets to HBase
A simple abstraction over the streamBulkMutation method.
It lets a user take a DStream, generate Increments from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer.
Original DStream with data to iterate over
The name of the table to increment
Function to convert a value in the DStream to an HBase Increment
The number of increments to batch before sending to HBase
A simple abstraction over the bulkPut method.
It lets a user take a DStream, generate Puts from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer.
Original DStream with data to iterate over
The name of the table to put into
Function to convert a value in the DStream to an HBase Put
Whether autoFlush should be turned on
A simple enrichment of the traditional Spark Streaming DStream mapPartition.
This function differs from the original in that it offers the developer access to an already connected HConnection object.
Note: Do not close the HConnection object. All HConnection management is handled outside this method.
Note: Make sure to partition correctly to avoid memory issues when getting data from HBase.
Original DStream with data to iterate over
Function that is given an iterator over the DStream values and an HConnection object to interact with HBase
Returns a new DStream generated by the user-defined function, just like a normal mapPartition
HBaseContext is a façade for simple and complex HBase operations such as bulk put, get, increment, delete, and scan.
HBaseContext takes responsibility for disseminating the configuration information to the workers and for managing the life cycle of Connections.
serializable Configuration object