Class

org.apache.spark.sql.execution.columnar.impl

BaseColumnFormatRelation


abstract class BaseColumnFormatRelation extends JDBCAppendableRelation with PartitionedDataSourceScan with RowInsertableRelation with MutableRelation

This class acts as a DataSource provider for the column format tables provided by Snappy. It uses GemFireXD as the actual datastore that physically holds the tables. Column tables store data in a compressed columnar format. An example usage is given below.

    // Assumes sc, snc, tableName and props are already defined, and that Data is a case class with three Int fields.
    val data = Seq(Seq(1, 2, 3), Seq(7, 8, 9), Seq(9, 2, 3), Seq(4, 2, 3), Seq(5, 6, 7))
    val rdd = sc.parallelize(data, data.length).map(s => new Data(s(0), s(1), s(2)))
    val dataDF = snc.createDataFrame(rdd)
    snc.createTable(tableName, "column", dataDF.schema, props)
    dataDF.write.insertInto(tableName)

This provider scans the underlying tables in parallel and is aware of the data partitioning. It does not introduce a shuffle when a simple table query is run. One can insert a single row or multiple rows into this table, as well as do a bulk insert from a Spark DataFrame. A bulk insert example is shown above.
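
To illustrate why a partition-aware scan needs no shuffle for a simple query, here is a toy model in plain Scala. It is only a sketch under stated assumptions: the names PartitionAwareScanSketch, Record, bucketOf, partition, and scan are hypothetical, rows are assumed to be hash-partitioned into buckets on a single key column, and none of this is the provider's actual code.

```scala
// Toy model only: plain Scala collections stand in for table buckets.
// All names here are hypothetical and are not part of the Snappy or Spark APIs.
object PartitionAwareScanSketch {
  final case class Record(id: Int, region: String, amount: Int)

  // Assumption: a row's bucket is derived from its partitioning column (here, id).
  def bucketOf(key: Int, numBuckets: Int): Int =
    java.lang.Math.floorMod(key, numBuckets)

  // Lay the rows out by bucket, as a partitioned table would store them.
  def partition(rows: Seq[Record], numBuckets: Int): Map[Int, Seq[Record]] =
    rows.groupBy(r => bucketOf(r.id, numBuckets))

  // A simple filter is evaluated bucket by bucket. Every surviving row stays
  // in the bucket it was read from, so no data has to be redistributed.
  def scan(buckets: Map[Int, Seq[Record]], keep: Record => Boolean): Map[Int, Seq[Record]] =
    buckets.map { case (bucket, rows) => bucket -> rows.filter(keep) }
}
```

Each bucket can be filtered independently and in parallel, which is the property the description above relies on. Queries that must regroup rows by a different key would not fit this model.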

Inherited
  1. BaseColumnFormatRelation
  2. MutableRelation
  3. RowInsertableRelation
  4. SingleRowInsertableRelation
  5. PartitionedDataSourceScan
  6. JDBCAppendableRelation
  7. Product
  8. Equals
  9. Serializable
  10. Serializable
  11. Logging
  12. IndexableRelation
  13. PlanInsertableRelation
  14. DestroyRelation
  15. InsertableRelation
  16. PrunedUnsafeFilteredScan
  17. BaseRelation
  18. AnyRef
  19. Any

Instance Constructors

  1. new BaseColumnFormatRelation(_table: String, _provider: String, _mode: SaveMode, _userSchema: StructType, schemaExtensions: String, ddlExtensionForShadowTable: String, _origOptions: Map[String, String], _externalStore: ExternalStore, partitioningColumns: Seq[String], _context: SQLContext)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def buildRowBufferRDD(partitionEvaluator: () ⇒ Array[Partition], requiredColumns: Array[String], filters: Array[Filter], useResultSet: Boolean): RDD[Any]

  6. def buildUnsafeScan(requiredColumns: Array[String], filters: Array[Filter]): (RDD[Any], Seq[RDD[InternalRow]])

  7. def buildUnsafeScanForSampledRelation(requiredColumns: Array[String], filters: Array[Filter]): (RDD[Any], RDD[Any], Seq[RDD[InternalRow]])

  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. lazy val clusterMode: ClusterMode

  10. final val connFactory: () ⇒ Connection

    Attributes
    protected
    Definition Classes
    JDBCAppendableRelation
  11. final val connProperties: ConnectionProperties

    Attributes
    protected
    Definition Classes
    JDBCAppendableRelation
  12. val connectionType: ConnectionType.Value

  13. def createExternalTableForColumnBatches(tableName: String, externalStore: ExternalStore): Unit

    Table definition:

        create table columnTable (id varchar(36) not null, partitionId integer, numRows integer not null, data blob)

    For a table with n columns, there will be n+1 region entries: a base entry and one entry for each column. The data column of the base entry contains the stats. The id of the base entry is the uuid, while for column entries it is uuid_colName.

    Definition Classes
    BaseColumnFormatRelation → JDBCAppendableRelation
  14. def createIndex(indexIdent: QualifiedTableName, tableIdent: QualifiedTableName, indexColumns: Map[String, Option[SortDirection]], options: Map[String, String]): Unit

    Create an index on a table.

    indexIdent

    Index Identifier which goes in the catalog

    tableIdent

    Table identifier on which the index is created.

    indexColumns

    Columns on which the index has to be created with the direction of sorting. Direction can be specified as None.

    options

    Options for indexes. For example, a column table index: ("COLOCATE_WITH" -> "CUSTOMER"); a row table index: ("INDEX_TYPE" -> "GLOBAL HASH") or ("INDEX_TYPE" -> "UNIQUE").

    Definition Classes
    JDBCAppendableRelation → IndexableRelation
  15. def createTable(mode: SaveMode): Unit

  16. def createTable(externalStore: ExternalStore, tableStr: String, tableName: String, dropIfExists: Boolean): Unit

    Definition Classes
    JDBCAppendableRelation
  17. val ddlExtensionForShadowTable: String

  18. def destroy(ifExists: Boolean): Unit

    Destroy and clean up this relation. This may include, but is not limited to, dropping the external table that this relation represents.

    Definition Classes
    BaseColumnFormatRelation → JDBCAppendableRelation → DestroyRelation
  19. final def dialect: JdbcDialect

    Attributes
    protected
    Definition Classes
    JDBCAppendableRelation
  20. def dropIndex(indexIdent: QualifiedTableName, tableIdent: QualifiedTableName, ifExists: Boolean): Unit

    Drops an index on this table

    indexIdent

    Index identifier

    tableIdent

    Table identifier

    ifExists

    Drop if exists

    Definition Classes
    JDBCAppendableRelation → IndexableRelation
  21. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. def executeUpdate(sql: String): Int

    Execute a DML SQL statement and return the number of rows affected.

    Definition Classes
    BaseColumnFormatRelation → SingleRowInsertableRelation
  23. val externalStore: ExternalStore

    Definition Classes
    JDBCAppendableRelation
  24. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  25. def flushRowBuffer(): Unit

  26. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  27. def getColumnBatchParams: (Int, Int, String)

    Definition Classes
    JDBCAppendableRelation
  28. def getColumnBatchStatistics(schema: Seq[AttributeReference]): PartitionStatistics

  29. def getDeletePlan(relation: LogicalRelation, child: SparkPlan, keyColumns: Seq[Attribute]): SparkPlan

    Get a Spark plan to delete rows from the relation. The result of SparkPlan execution should be a count of the number of deleted rows.

    Definition Classes
    BaseColumnFormatRelation → MutableRelation
  30. def getInsertPlan(relation: LogicalRelation, child: SparkPlan): SparkPlan

    Get a Spark plan for insert. The result of SparkPlan execution should be a count of the number of inserted rows.

    Definition Classes
    BaseColumnFormatRelation → JDBCAppendableRelation → PlanInsertableRelation
  31. def getKeyColumns: Seq[String]

    Get the "key" columns for the table that need to be projected out by UPDATE and DELETE operations for affecting the selected rows.

    Definition Classes
    BaseColumnFormatRelation → MutableRelation
  32. def getUpdatePlan(relation: LogicalRelation, child: SparkPlan, updateColumns: Seq[Attribute], updateExpressions: Seq[Expression], keyColumns: Seq[Attribute]): SparkPlan

    Get a Spark plan to update rows in the relation. The result of SparkPlan execution should be a count of the number of updated rows.

    Definition Classes
    BaseColumnFormatRelation → MutableRelation
  33. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. def insert(rows: Seq[Row]): Int

    Insert a sequence of rows into the table represented by this relation.

    rows

    the rows to be inserted

    returns

    number of rows inserted

    Definition Classes
    BaseColumnFormatRelation → RowInsertableRelation
  35. def insert(data: DataFrame, overwrite: Boolean): Unit

    Definition Classes
    JDBCAppendableRelation → InsertableRelation
  36. final def isDebugEnabled: Boolean

    Attributes
    protected
    Definition Classes
    Logging
  37. final def isInfoEnabled: Boolean

    Attributes
    protected
    Definition Classes
    Logging
  38. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  39. lazy val isPartitioned: Boolean

  40. final def isTraceEnabled: Boolean

    Attributes
    protected
    Definition Classes
    Logging
  41. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  42. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  43. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  44. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  45. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  46. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  47. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  48. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  49. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  50. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  51. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  52. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  53. val mode: SaveMode

    Definition Classes
    JDBCAppendableRelation
  54. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  55. val needConversion: Boolean

    Definition Classes
    JDBCAppendableRelation → BaseRelation
  56. final def notify(): Unit

    Definition Classes
    AnyRef
  57. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  58. lazy val numBuckets: Int

  59. val origOptions: Map[String, String]

    Definition Classes
    JDBCAppendableRelation
  60. def partitionColumns: Seq[String]

    Get the partitioning columns for the table, if any.

    Definition Classes
    BaseColumnFormatRelation → MutableRelation → PartitionedDataSourceScan
  61. val partitioningColumns: Seq[String]

  62. val provider: String

    Definition Classes
    JDBCAppendableRelation
  63. lazy val region: PartitionedRegion

  64. lazy val relInfo: RelationInfo

  65. val resolvedName: String

    Definition Classes
    JDBCAppendableRelation
  66. lazy val rowInsertStr: String

  67. def scanTable(tableName: String, requiredColumns: Array[String], filters: Array[Filter], _ignore: ⇒ Int): RDD[Any]

  68. val schema: StructType

    Definition Classes
    JDBCAppendableRelation → BaseRelation
  69. val schemaExtensions: String

  70. def sizeInBytes: Long

    Definition Classes
    JDBCAppendableRelation → BaseRelation
  71. val sqlContext: SQLContext

    Definition Classes
    JDBCAppendableRelation → BaseRelation
  72. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  73. val table: String

    Definition Classes
    JDBCAppendableRelation
  74. var tableExists: Boolean

    True if the table already existed when the relation object was created.

    Definition Classes
    JDBCAppendableRelation → DestroyRelation
  75. def toString(): String

    Definition Classes
    BaseColumnFormatRelation → JDBCAppendableRelation → AnyRef → Any
  76. def truncate(): Unit

    Truncate the table represented by this relation.

    Definition Classes
    BaseColumnFormatRelation → JDBCAppendableRelation → DestroyRelation
  77. def unhandledFilters(filters: Array[Filter]): Array[Filter]

    Definition Classes
    BaseRelation
  78. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  79. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  80. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  81. def withKeyColumns(relation: LogicalRelation, keyColumns: Seq[String]): LogicalRelation

    If required, inject the key columns into the original relation.

    Definition Classes
    MutableRelation
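
The entry layout described for createExternalTableForColumnBatches (n+1 entries for a table with n columns, with ids of the form uuid and uuid_colName) can be sketched in plain Scala. This is a minimal illustration, not the class's actual code: ColumnBatchLayoutSketch, BatchEntry, and entriesFor are hypothetical names, and it is an assumption that each per-column entry repeats the batch's partitionId and numRows.

```scala
// Hypothetical sketch of the column-batch entry layout; not the real implementation.
object ColumnBatchLayoutSketch {
  // Mirrors the documented table columns: id, partitionId, numRows, data.
  final case class BatchEntry(id: String, partitionId: Int, numRows: Int, data: Array[Byte])

  // Builds the n+1 entries for one batch: a base entry holding the stats,
  // followed by one entry per column holding that column's bytes.
  def entriesFor(
      uuid: String,
      partitionId: Int,
      numRows: Int,
      stats: Array[Byte],
      columns: Seq[(String, Array[Byte])]): Seq[BatchEntry] = {
    // The base entry is keyed by the bare uuid.
    val base = BatchEntry(uuid, partitionId, numRows, stats)
    // Column entries are keyed by uuid_colName.
    // Assumption: they carry the same partitionId and numRows as the base entry.
    val perColumn = columns.map { case (name, bytes) =>
      BatchEntry(s"${uuid}_$name", partitionId, numRows, bytes)
    }
    base +: perColumn
  }
}
```

For a two-column batch, entriesFor returns three entries: the base entry first, then one entry per column in the order given.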
