Rule to collapse the partial and final aggregates if the grouping keys match, or are a superset of, the child distribution. Also introduces an exchange when inserting into a partitioned table if the number of partitions does not match.
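For illustration, a hedged sketch of when this rule fires; the table t, its partitioning column k, and the non-partitioning column v are hypothetical:

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SnappyContext

    def aggregationDemo(sc: SparkContext): Unit = {
      val snc = SnappyContext.getOrCreate(sc)
      // grouping key matches the partitioning column: the partial and
      // final aggregates collapse into a single phase with no exchange
      snc.sql("SELECT k, count(*) FROM t GROUP BY k").collect()
      // grouping on a non-partitioning column: a partial aggregate,
      // an exchange and a final aggregate are still required
      snc.sql("SELECT v, count(*) FROM t GROUP BY v").collect()
    }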
RDD that delegates calls to the base RDD. However, the dependencies and preferred locations of this RDD can be altered.
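A minimal sketch of such a delegating RDD using only the public Spark RDD API; the class name and constructor here are hypothetical, not SnappyData's actual implementation:

    import scala.reflect.ClassTag

    import org.apache.spark.{OneToOneDependency, Partition, TaskContext}
    import org.apache.spark.rdd.RDD

    // delegates computation to `base` while allowing the preferred
    // locations (and, similarly, the dependencies) to be overridden
    class DelegatingRDD[T: ClassTag](
        base: RDD[T],
        preferredLocs: Map[Partition, Seq[String]] = Map.empty)
      extends RDD[T](base.context, Seq(new OneToOneDependency(base))) {

      override def compute(split: Partition, context: TaskContext): Iterator[T] =
        base.iterator(split, context)

      override protected def getPartitions: Array[Partition] = base.partitions

      override protected def getPreferredLocations(split: Partition): Seq[String] =
        preferredLocs.getOrElse(split, base.preferredLocations(split))
    }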
Rule to insert a helper plan to collect information for other entities like parameterized literals.
The local mode, which hosts the data, executor, driver (and optionally even the jobserver) all on the same node.
Encapsulates the result of a partition, holding its data and the number of rows.
Note: this uses an optimized external serializer for PooledKryoSerializer, so any changes to this class need to be reflected in the serializer.
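As a hedged illustration of the external-serializer pattern referred to above (class and field names are hypothetical, and the actual registration with PooledKryoSerializer may differ):

    import com.esotericsoftware.kryo.{Kryo, Serializer}
    import com.esotericsoftware.kryo.io.{Input, Output}

    // hypothetical holder for a partition's data and row count
    case class PartitionResult(data: Array[Byte], numRows: Int)

    // external Kryo serializer: every field of PartitionResult must be
    // written/read here explicitly, which is why any change to the class
    // has to be reflected in the serializer
    class PartitionResultSerializer extends Serializer[PartitionResult] {

      override def write(kryo: Kryo, output: Output,
          result: PartitionResult): Unit = {
        output.writeInt(result.data.length)
        output.writeBytes(result.data)
        output.writeInt(result.numRows)
      }

      override def read(kryo: Kryo, input: Input,
          c: Class[PartitionResult]): PartitionResult = {
        val len = input.readInt()
        PartitionResult(input.readBytes(len), input.readInt())
      }
    }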
Used to plan the aggregate operator for expressions using the optimized SnappyData aggregation operators.
Adapted from Spark's Aggregation strategy.
Base parsing facilities for all SnappyData SQL parsers.
Main entry point for SnappyData extensions to Spark. A SnappyContext extends Spark's org.apache.spark.sql.SQLContext to work with Row and Column tables. Any DataFrame can be managed as a SnappyData table, and any table can be accessed as a DataFrame. This integrates the SQLContext functionality with the Snappy store.
When running in the embedded mode (i.e. a Spark executor collocated with the Snappy data store), applications typically submit jobs to the Snappy-JobServer (provide link) and do not explicitly create a SnappyContext. A single shared context managed by SnappyData makes it possible to reuse executors across client connections or applications.
SnappyContext uses a persistent Hive metastore for its catalog, which enables table metadata to be recreated on driver restart.

Users should obtain a reference to a SnappyContext instance as below:

    val snc: SnappyContext = SnappyContext.getOrCreate(sparkContext)
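Continuing the example, a hedged sketch of the DataFrame/table interop described above (the table name and schema are hypothetical; the "column" data source format follows SnappyData's documented usage):

    // create a column table (hypothetical name and schema)
    snc.sql("CREATE TABLE customers (id INT, name STRING) USING column")

    // any DataFrame can be managed as a SnappyData table
    val df = snc.createDataFrame(Seq((1, "alice"), (2, "bob"))).toDF("id", "name")
    df.write.insertInto("customers")

    // and any table can be accessed as a DataFrame
    snc.table("customers").show()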
TODO: provide links to the above descriptions and to a document describing the job server API.

See also:
https://github.com/SnappyDataInc/snappydata#interacting-with-snappydata
https://github.com/SnappyDataInc/snappydata#step-1---start-the-snappydata-cluster
The regular snappy cluster, where each node is both a Spark executor as well as a GemFireXD data store. There is a "lead node", which is the Spark driver that also hosts a job-server and a GemFireXD accessor.
This is for the two-cluster mode: one is the normal snappy cluster, while this one is a separate local/Spark/YARN/Mesos cluster that fetches data from the snappy cluster on demand; the snappy cluster just acts as an external datastore.
Manages a time epoch and how to index into it.
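A hedged sketch of the idea, where timestamps are mapped into fixed-width intervals relative to an epoch start (names and granularity are assumptions, not the actual implementation):

    // maps timestamps into fixed-width buckets relative to an epoch start
    class TimeEpoch(epochStart: Long, intervalMillis: Long) {
      // index of the interval containing the given timestamp
      def indexOf(timestamp: Long): Long =
        (timestamp - epochStart) / intervalMillis
    }

    // with a 1-second granularity, a timestamp 5.5s past the epoch
    // start falls into bucket 5:
    // new TimeEpoch(0L, 1000L).indexOf(5500L) == 5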
Plans scalar subqueries like Spark's PlanSubqueries, but uses a customized ScalarSubquery that inserts a tokenized literal instead of embedding the literal value in the generated code, allowing generated-code re-use and improving performance substantially.
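For example (hypothetical table), the single value produced by the subquery below is planted as a tokenized literal rather than a compile-time constant, so the generated code can be reused even when the subquery's result changes between executions:

    import org.apache.spark.sql.SnappyContext

    // the scalar subquery result becomes a tokenized literal in the
    // generated code instead of an embedded constant
    def aboveAverage(snc: SnappyContext) =
      snc.sql("SELECT * FROM orders WHERE amount > (SELECT avg(amount) FROM orders)")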
Rule to replace Spark's SortExec plans with an optimized SnappySortExec (within sort-merge joins for now).
Implicit conversions used by Snappy.