Table definition: create table columnTable (id varchar(36) not null, partitionId integer, numRows integer not null, data blob). For a table with n columns, there will be n+1 region entries: a base entry and one entry for each column. The data column of the base entry contains the stats. The id of the base entry is the uuid, while for column entries it is uuid_colName.
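The entry layout described above can be sketched in a few lines. This is only an illustration of the id scheme: `ColumnBatchKeys` and `entryIds` are hypothetical names, not part of the actual implementation.

```scala
// Sketch only: ColumnBatchKeys and entryIds are hypothetical names.
object ColumnBatchKeys {
  /** Ids of the region entries for one batch: the base entry (whose data
    * column holds the stats) followed by one entry per column. */
  def entryIds(uuid: String, columnNames: Seq[String]): Seq[String] =
    uuid +: columnNames.map(name => s"${uuid}_$name")

  def main(args: Array[String]): Unit = {
    // A table with 3 columns yields 3 + 1 = 4 entries.
    val uuid = java.util.UUID.randomUUID().toString
    entryIds(uuid, Seq("col1", "col2", "col3")).foreach(println)
  }
}
```

A random UUID rendered as a string is 36 characters long, which matches the varchar(36) width of the id column for the base entry.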
Create an index on a table.
Index identifier that is stored in the catalog.
Table identifier on which the index is created.
Columns on which the index is to be created, each with its sort direction. The direction can be specified as None.
Options for indexes. For example, a column table index: ("COLOCATE_WITH"->"CUSTOMER"); a row table index: ("INDEX_TYPE"->"GLOBAL HASH") or ("INDEX_TYPE"->"UNIQUE").
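To make the parameter shapes concrete, here is a minimal sketch. `IndexSketch`, `IndexSpec`, `SortDirection` and `describe` are hypothetical names chosen for illustration; the real method signature may use different types. Only the option keys and values are taken from the description above.

```scala
// Sketch only: all type and method names here are hypothetical.
object IndexSketch {
  sealed trait SortDirection
  case object Ascending extends SortDirection
  case object Descending extends SortDirection

  final case class IndexSpec(
      indexIdent: String,                                 // stored in the catalog
      tableIdent: String,                                 // table being indexed
      indexColumns: Seq[(String, Option[SortDirection])], // None = no direction
      options: Map[String, String])

  // Option maps quoted from the parameter description.
  val columnTableOptions: Map[String, String] = Map("COLOCATE_WITH" -> "CUSTOMER")
  val rowTableOptions: Map[String, String] = Map("INDEX_TYPE" -> "GLOBAL HASH")

  /** Render the column list, appending a direction only when one is given. */
  def describe(spec: IndexSpec): String = {
    val cols = spec.indexColumns.map {
      case (name, Some(Ascending))  => s"$name ASC"
      case (name, Some(Descending)) => s"$name DESC"
      case (name, None)             => name
    }
    s"${spec.indexIdent} ON ${spec.tableIdent} (${cols.mkString(", ")})"
  }
}
```

Modelling the direction as an `Option` keeps "no direction specified" distinct from either explicit ordering.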
Destroy and clean up this relation. This may include, but is not limited to, dropping the external table that this relation represents.
Drops an index on this table
Index identifier
Table identifier
Drop if exists
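The three parameters can be illustrated with a small in-memory stand-in for the catalog. This sketch assumes the conventional SQL meaning of "if exists" (a missing index is an error unless the flag is set); the source does not spell this out, and `DropIndexSketch` and its methods are hypothetical names.

```scala
import scala.collection.mutable

// Sketch only: an in-memory stand-in, not the real catalog.
object DropIndexSketch {
  // index identifier -> table identifier
  private val catalog = mutable.Map[String, String]()

  def register(indexIdent: String, tableIdent: String): Unit =
    catalog(indexIdent) = tableIdent

  /** Returns true if an index was removed. Assumed semantics: a missing
    * index raises an error unless ifExists is set, in which case the
    * call is a no-op that returns false. */
  def dropIndex(indexIdent: String, tableIdent: String, ifExists: Boolean): Boolean =
    catalog.get(indexIdent) match {
      case Some(t) if t == tableIdent =>
        catalog.remove(indexIdent)
        true
      case _ if ifExists => false
      case _ =>
        throw new NoSuchElementException(s"Index $indexIdent not found on $tableIdent")
    }
}
```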
Execute a DML SQL and return the number of rows affected.
Get a Spark plan to delete rows from the relation. The result of SparkPlan execution should be a count of the number of deleted rows.
Get a Spark plan for insert. The result of SparkPlan execution should be a count of the number of inserted rows.
Get the "key" columns for the table that need to be projected out by UPDATE and DELETE operations for affecting the selected rows.
Get a Spark plan to update rows in the relation. The result of SparkPlan execution should be a count of the number of updated rows.
Insert a sequence of rows into the table represented by this relation.
the rows to be inserted
number of rows inserted
Get the partitioning columns for the table, if any.
Return true if the table already existed when the relation object was created.
Truncate the table represented by this relation.
If required, inject the key columns into the original relation.
This class acts as a DataSource provider for column format tables provided by Snappy. It uses GemFireXD as the actual datastore that physically holds the tables. Column tables can be used for storing data in a compressed columnar format. An example usage is given below.
val data = Seq(Seq(1, 2, 3), Seq(7, 8, 9), Seq(9, 2, 3), Seq(4, 2, 3), Seq(5, 6, 7))
val rdd = sc.parallelize(data, data.length).map(s => new Data(s(0), s(1), s(2)))
val dataDF = snc.createDataFrame(rdd)
snc.createTable(tableName, "column", dataDF.schema, props)
dataDF.write.insertInto(tableName)
This provider scans the underlying tables in parallel and is aware of the data partitioning. It does not introduce a shuffle when a simple table query is run. One can insert a single row or multiple rows into this table, as well as do a bulk insert from a Spark DataFrame. A bulk insert example is shown above.
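The bulk insert example leaves `Data`, `tableName` and `props` undefined. A minimal sketch of plausible supporting definitions follows. The field names of `Data`, the table name and the empty option map are all assumptions made for illustration, and the Spark and Snappy contexts (`sc`, `snc`) are deliberately left out so that the fragment runs on its own.

```scala
// Assumed shape of the Data class used in the example; the field names
// col1..col3 are illustrative, not taken from the source.
final case class Data(col1: Int, col2: Int, col3: Int)

object ExampleInputs {
  val tableName = "COLUMN_TEST_TABLE"                        // hypothetical name
  val props: Map[String, String] = Map.empty[String, String] // no extra options

  val data = Seq(Seq(1, 2, 3), Seq(7, 8, 9), Seq(9, 2, 3), Seq(4, 2, 3), Seq(5, 6, 7))

  // Same per-row conversion as in the example, applied to a local Seq
  // instead of an RDD.
  val rows: Seq[Data] = data.map(s => Data(s(0), s(1), s(2)))
}
```

Using a case class gives each row named, typed fields, which is what lets `createDataFrame` derive a schema from the RDD in the original example.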