Class org.apache.hadoop.hbase.spark.HBaseRelation

case class HBaseRelation(parameters: Map[String, String], userSpecifiedSchema: Option[StructType])(sqlContext: SQLContext) extends BaseRelation with PrunedFilteredScan with InsertableRelation with Logging with Product with Serializable

Implementation of Spark BaseRelation that builds up the scan logic, does the scan pruning, pushes filters down, and performs value conversions.

sqlContext

SparkSQL context
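
    Example (a minimal read sketch, not taken from this page: it assumes the catalog JSON format accepted by HBaseTableCatalog, the "org.apache.hadoop.hbase.spark" data source provided by this module, and a hypothetical "person" table):

      import org.apache.spark.sql.SQLContext
      import org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog

      // Hypothetical mapping: the row key is exposed as "id", plus one column in family "cf".
      val catalog =
        """{
          |  "table":   {"namespace": "default", "name": "person"},
          |  "rowkey":  "key",
          |  "columns": {
          |    "id":   {"cf": "rowkey", "col": "key",  "type": "string"},
          |    "name": {"cf": "cf",     "col": "name", "type": "string"}
          |  }
          |}""".stripMargin

      def readPersons(sqlContext: SQLContext) =
        sqlContext.read
          .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
          .format("org.apache.hadoop.hbase.spark") // resolves to an HBaseRelation under the hood
          .load()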

Annotations
@Private()
Linear Supertypes
Serializable, Serializable, Product, Equals, Logging, InsertableRelation, PrunedFilteredScan, BaseRelation, AnyRef, Any

Instance Constructors

  1. new HBaseRelation(parameters: Map[String, String], userSpecifiedSchema: Option[StructType])(sqlContext: SQLContext)

    sqlContext

    SparkSQL context

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. val batchNum: Int
  6. val blockCacheEnable: Boolean
  7. def buildPushDownPredicatesResource(filters: Array[Filter]): (RowKeyFilter, DynamicLogicExpression, Array[Array[Byte]])
  8. def buildRow(fields: Seq[Field], result: Result): Row
  9. def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row]

    Builds the functionality to populate the resulting RDD[Row]. This is where the following happens: filter push down, Scan or GetList pruning, and executing the scan(s) and/or GetList to generate the result.

    requiredColumns

    The columns requested by the query

    filters

    The filters applied by the query

    returns

    RDD with all the results from HBase needed for SparkSQL to execute the query on

    Definition Classes
    HBaseRelation → PrunedFilteredScan
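
    Example (a sketch of a query that exercises buildScan, reusing the hypothetical readPersons helper and catalog from the class-level example; column names are assumptions): only "name" is requested and the predicate is an equality on the row key, so the projection can be pruned and the filter pushed down to HBase.

      val df = readPersons(sqlContext)
      df.select("name")
        .where(df("id") === "row-0042")
        .show()
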
  10. val bulkGetSize: Int
  11. val cacheSize: Int
  12. val catalog: HBaseTableCatalog
  13. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  14. val clusteringCfColumnsMap: Map[String, Seq[String]]
  15. val configResources: String
  16. def createNamespaceIfNotExist(connection: Admin, namespace: String): Boolean
  17. def createTable(): Unit
  18. val darwinConf: Option[Config]
  19. val encoder: BytesEncoder
  20. val encoderClsName: String
  21. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  22. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  23. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
  24. def getIndexedProjections(requiredColumns: Array[String]): Seq[(Field, Int)]
  25. def hbaseConf: Configuration
  26. val hbaseContext: HBaseContext
  27. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  28. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  29. def insert(data: DataFrame, overwrite: Boolean): Unit
    Definition Classes
    HBaseRelation → InsertableRelation
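
    Example (a write sketch that goes through insert; the HBaseTableCatalog.newTable option is an assumption about this module's table-creation support, and personsDf/catalog are the hypothetical values from the class-level example):

      import org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog

      personsDf.write
        .options(Map(
          HBaseTableCatalog.tableCatalog -> catalog,
          HBaseTableCatalog.newTable     -> "5")) // assumed: create the table with 5 regions if it does not exist
        .format("org.apache.hadoop.hbase.spark")
        .save()
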
  30. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  31. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  32. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  33. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  34. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  35. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  36. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  37. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  38. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  39. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  40. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  41. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  43. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  44. val maxTimestamp: Option[Long]
  45. val maxVersions: Option[Int]
  46. val minTimestamp: Option[Long]
  47. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  48. def needConversion: Boolean
    Definition Classes
    BaseRelation
  49. final def notify(): Unit
    Definition Classes
    AnyRef
  50. final def notifyAll(): Unit
    Definition Classes
    AnyRef
  51. val parameters: Map[String, String]
  52. def parseRowKey(row: Array[Byte], keyFields: Seq[Field]): Map[Field, Any]

    Takes an HBase Row object and parses all of its fields. This is independent of which fields were requested from the key; because we have all the data, it is less complex to parse everything.

    row

    the retrieved row from HBase

    keyFields

    all of the fields in the row key, ordered by their position in the row key
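
    Example (a hand-rolled illustration of composite row-key parsing using HBase's Bytes utility; it shows the idea of walking the key by offset, not the connector's internal Field API):

      import org.apache.hadoop.hbase.util.Bytes

      // Compose a row key: a fixed-width Int prefix followed by a variable-length String.
      val rowKey: Array[Byte] = Bytes.add(Bytes.toBytes(42), Bytes.toBytes("user-a"))

      // Parse it back field by field, advancing the offset the way parseRowKey walks keyFields.
      val id: Int      = Bytes.toInt(rowKey, 0)
      val name: String = Bytes.toString(rowKey, Bytes.SIZEOF_INT, rowKey.length - Bytes.SIZEOF_INT)
      // id == 42, name == "user-a"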

  53. val schema: StructType

    Generates a Spark SQL schema object so Spark SQL knows what is being provided by this BaseRelation.

    returns

    schema generated from the SCHEMA_COLUMNS_MAPPING_KEY value

    Definition Classes
    HBaseRelation → BaseRelation
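
    Example (a sketch, reusing the hypothetical catalog and readPersons helper from the class-level example):

      // The StructType is generated from the catalog's "columns" mapping,
      // so printSchema() lists the fields declared there (here: "id" and "name" as strings).
      readPersons(sqlContext).printSchema()
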
  54. def sizeInBytes: Long
    Definition Classes
    BaseRelation
  55. val sqlContext: SQLContext

    SparkSQL context

    Definition Classes
    HBaseRelation → BaseRelation
  56. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  57. def tableName: String
  58. val timestamp: Option[Long]
  59. def transverseFilterTree(parentRowKeyFilter: RowKeyFilter, valueArray: MutableList[Array[Byte]], filter: Filter): DynamicLogicExpression

    For some codecs, the ordering may be inconsistent between a Java primitive type and its byte-array encoding, so predicates on some Java primitive types may have to be split into multiple predicates. The encoder takes care of this and returns the concrete ranges.

    For example, with the naive codec some Java primitive types have to be split into multiple predicates whose union makes the original predicate behave correctly: "COLUMN < 2" is transformed into "0 <= COLUMN < 2 OR Integer.MIN_VALUE <= COLUMN <= -1".
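
    Example (a sketch of why the split is needed, using HBase's Bytes utility; this illustrates the motivation rather than the method's API): with the naive codec, values are compared as unsigned byte arrays, so negative integers sort after positive ones.

      import org.apache.hadoop.hbase.util.Bytes

      // -1 encodes as 0xFFFFFFFF and therefore sorts AFTER 1 (0x00000001) under unsigned
      // lexicographic comparison, even though -1 < 1 as an Int. Hence "COLUMN < 2" must be
      // rewritten as "0 <= COLUMN < 2 OR Integer.MIN_VALUE <= COLUMN <= -1" before it can be
      // evaluated as byte ranges.
      val cmp = Bytes.compareTo(Bytes.toBytes(-1), Bytes.toBytes(1))
      assert(cmp > 0)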

  60. def unhandledFilters(filters: Array[Filter]): Array[Filter]
    Definition Classes
    BaseRelation
  61. val useHBaseContext: Boolean
  62. val usePushDownColumnFilter: Boolean
  63. val useSchemaAvroManager: Boolean
  64. val userSpecifiedSchema: Option[StructType]
  65. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  66. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  67. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  68. val wrappedConf: SerializableConfiguration
