case class HBaseRelation(parameters: Map[String, String], userSpecifiedSchema: Option[StructType])(sqlContext: SQLContext) extends BaseRelation with PrunedFilteredScan with InsertableRelation with Logging with Product with Serializable
Implementation of Spark's BaseRelation that builds up our scan logic, does the scan pruning, filter push-down, and value conversions
- sqlContext
SparkSQL context
- Annotations
- @Private()
Linear Supertypes
- HBaseRelation
- Serializable
- Serializable
- Product
- Equals
- Logging
- InsertableRelation
- PrunedFilteredScan
- BaseRelation
- AnyRef
- Any
Instance Constructors
- new HBaseRelation(parameters: Map[String, String], userSpecifiedSchema: Option[StructType])(sqlContext: SQLContext)
- sqlContext
SparkSQL context
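The relation is normally instantiated by the data source, but since it is a public case class it can also be built directly. A minimal sketch, assuming an SHC-style JSON catalog passed under a "catalog" parameter key (both the key name and the catalog layout are assumptions for illustration; adapt them to whatever your build of the connector expects):

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical catalog: table/rowkey/column mapping used to derive the schema.
val catalogJson =
  """{
    |  "table": {"namespace": "default", "name": "test_table"},
    |  "rowkey": "key",
    |  "columns": {
    |    "col0": {"cf": "rowkey", "col": "key", "type": "string"},
    |    "col1": {"cf": "cf1", "col": "col1", "type": "int"}
    |  }
    |}""".stripMargin

val sqlContext: SQLContext = ??? // obtain from your SparkSession / SparkContext

val relation = HBaseRelation(
  Map("catalog" -> catalogJson), // parameters
  None                           // userSpecifiedSchema: None falls back to the catalog-derived schema
)(sqlContext)

relation.createTable()              // create the target table if it does not already exist
println(relation.schema.treeString) // the schema SparkSQL will see
```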
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##(): Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- val batchNum: Int
- val blockCacheEnable: Boolean
- def buildPushDownPredicatesResource(filters: Array[Filter]): (RowKeyFilter, DynamicLogicExpression, Array[Array[Byte]])
- def buildRow(fields: Seq[Field], result: Result): Row
- def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row]
Builds the functionality to populate the resulting RDD[Row]. This is where the following is done: filter push-down, Scan or GetList pruning, and executing the scan(s) and/or GetList to generate the result.
- requiredColumns
The columns that are being requested by the requesting query
- filters
The filters that are being applied by the requesting query
- returns
RDD with all the results from HBase needed for SparkSQL to execute the query on
- Definition Classes
- HBaseRelation → PrunedFilteredScan
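Spark SQL calls buildScan after planning, passing only the projected columns and the push-down-capable predicates. A small sketch that exercises it directly, reusing the hypothetical `relation` and column names from the constructor example above:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row
import org.apache.spark.sql.sources.{EqualTo, Filter, GreaterThan}

// Predicates Spark SQL would push down for a query such as
//   SELECT col1 FROM test_table WHERE col0 = 'row1' AND col1 > 5
val filters: Array[Filter] = Array(EqualTo("col0", "row1"), GreaterThan("col1", 5))

// Only the columns the query actually references are requested.
val requiredColumns = Array("col1")

// The relation prunes the scan / bulk gets accordingly and returns the matching rows.
val rows: RDD[Row] = relation.buildScan(requiredColumns, filters)
rows.collect().foreach(println)
```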
- val bulkGetSize: Int
- val cacheSize: Int
- val catalog: HBaseTableCatalog
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
- val clusteringCfColumnsMap: Map[String, Seq[String]]
- val configResources: String
- def createNamespaceIfNotExist(connection: Admin, namespace: String): Boolean
- def createTable(): Unit
- val darwinConf: Option[Config]
- val encoder: BytesEncoder
- val encoderClsName: String
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def getIndexedProjections(requiredColumns: Array[String]): Seq[(Field, Int)]
- def hbaseConf: Configuration
- val hbaseContext: HBaseContext
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def insert(data: DataFrame, overwrite: Boolean): Unit
- Definition Classes
- HBaseRelation → InsertableRelation
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- val maxTimestamp: Option[Long]
- val maxVersions: Option[Int]
- val minTimestamp: Option[Long]
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def needConversion: Boolean
- Definition Classes
- BaseRelation
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- val parameters: Map[String, String]
- def parseRowKey(row: Array[Byte], keyFields: Seq[Field]): Map[Field, Any]
Takes an HBase Row object and parses all of the fields from it. This is independent of which fields were requested from the key; because we have all the data, it is less complex to parse everything.
- row
the retrieved row from hbase.
- keyFields
all of the fields in the row key, ORDERED by their order in the row key.
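The actual implementation walks the ordered keyFields and slices each field out of the raw key bytes, returning a Map[Field, Any]. An illustrative sketch of the same slicing done by hand for a hypothetical composite key (a 4-byte Int followed by an 8-byte Long), using HBase's Bytes utility:

```scala
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical composite row key: Int (4 bytes) ++ Long (8 bytes).
val rowKey: Array[Byte] = Bytes.add(Bytes.toBytes(42), Bytes.toBytes(1234567890L))

val firstField  = Bytes.toInt(rowKey, 0)  // bytes [0, 4)
val secondField = Bytes.toLong(rowKey, 4) // bytes [4, 12)

println(s"first=$firstField, second=$secondField") // first=42, second=1234567890
```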
- val schema: StructType
Generates a Spark SQL schema object so Spark SQL knows what is being provided by this BaseRelation.
- returns
schema generated from the SCHEMA_COLUMNS_MAPPING_KEY value
- Definition Classes
- HBaseRelation → BaseRelation
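The schema is driven entirely by the column mapping, not by the data itself. A short sketch reusing `catalogJson` and `sqlContext` from the constructor example, assuming that a user-specified schema, when provided, takes precedence over the catalog-derived one:

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// What the catalog-derived schema for the hypothetical catalog would look like written out
// explicitly; passing it as userSpecifiedSchema overrides the derived schema.
val explicitSchema = StructType(Seq(
  StructField("col0", StringType),
  StructField("col1", IntegerType)
))

val relationWithSchema = HBaseRelation(Map("catalog" -> catalogJson), Some(explicitSchema))(sqlContext)
println(relationWithSchema.schema.treeString)
```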
- def sizeInBytes: Long
- Definition Classes
- BaseRelation
- val sqlContext: SQLContext
- Definition Classes
- HBaseRelation → BaseRelation
- final def synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
- def tableName: String
- val timestamp: Option[Long]
- def transverseFilterTree(parentRowKeyFilter: RowKeyFilter, valueArray: MutableList[Array[Byte]], filter: Filter): DynamicLogicExpression
For some codecs, the ordering of a Java primitive type may be inconsistent with the ordering of its byte-array encoding, so predicates on some primitive types may have to be split into multiple predicates; the encoder takes care of this and returns the concrete ranges. For example, with the naive codec some primitive-type predicates have to be split into multiple predicates whose union makes the predicate behave correctly: "COLUMN < 2" is transformed into "0 <= COLUMN < 2 OR Integer.MIN_VALUE <= COLUMN <= -1".
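The split is needed because a naive encoding compares the raw two's-complement bytes as unsigned, lexicographic byte strings, which places negative integers after positive ones. A short demonstration of the ordering problem with HBase's Bytes utility:

```scala
import org.apache.hadoop.hbase.util.Bytes

val minusOne = Bytes.toBytes(-1) // 0xFF 0xFF 0xFF 0xFF
val two      = Bytes.toBytes(2)  // 0x00 0x00 0x00 0x02

// In unsigned byte order, -1 sorts AFTER 2, even though -1 < 2 as an Int.
println(Bytes.compareTo(minusOne, two) > 0) // true

// Hence a single byte-range scan cannot express "COLUMN < 2"; it becomes the union of
// the ranges [0, 2) and [Integer.MIN_VALUE, -1], as described above.
```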
- def unhandledFilters(filters: Array[Filter]): Array[Filter]
- Definition Classes
- BaseRelation
- val useHBaseContext: Boolean
- val usePushDownColumnFilter: Boolean
- val useSchemaAvroManager: Boolean
- val userSpecifiedSchema: Option[StructType]
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
- val wrappedConf: SerializableConfiguration