ra3
ra3 provides an embedded query language and its corresponding query engine.
ra3 is built on a distributed task execution library named tasks. Consequently, almost all interactions with ra3 need a handle to a configured tasks runtime environment, represented by a value of type tasks.TaskSystemComponents. You can configure and start the tasks environment with the tasks.withTaskSystem method.
ra3 can query data on disk (or in object storage) organized into its own intermediate chunked columnar storage. A table in ra3 is represented by a value of type ra3.Table. CSV data can be imported with the ra3.importCsv method and exported back to CSV with the ra3.Table.exportToCsv method. The intermediate data organization is not meant for use outside of ra3, nor for long-term storage.
Each query in ra3 is persisted to secondary storage and checkpointed.
The entry points to the query language are the various methods in the ra3 package or in the ra3.Table class which provide typed references to columns or references to tables, e.g.:
- ra3.let and ra3.let0
- ra3.Table.in and ra3.Table.in0
The query language builds an expression tree of type ra3.tablelang.TableExpr, which is evaluated with ra3.tablelang.TableExpr.evaluate into an IO[Table]. ra3.tablelang.TableExpr is a serializable description of the query. The expression tree of the query may be printed in human-readable form with ra3.tablelang.TableExpr.render.
The following table level operators are available in ra3:
- simple query, i.e. element-wise filter and projection. Corresponds to SQL queries of SELECT and WHERE.
- count query, i.e. element-wise filter and count. Corresponds to SQL queries of SELECT count(*) and WHERE (but no GROUP BY).
- equi-join, i.e. join with a join condition restricted to be equality among two columns.
- group by then reduce. Corresponds to SELECT aggregations, WHERE, GROUP BY clauses of SQL.
- group by then count. Corresponds to SELECT count(*), WHERE, GROUP BY clauses of SQL.
- approximate selection of the top K elements by a single column.
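The semantics of the first few operators can be illustrated with ordinary in-memory Scala collections. This is only a sketch of what each operator computes. It does not use the ra3 API, and the Sale row type, the sample data and all names in it are hypothetical.

```scala
object OperatorSemanticsSketch {
  // Hypothetical row type, used only for this illustration.
  final case class Sale(customer: String, amount: Double)

  val sales: Vector[Sale] = Vector(
    Sale("alice", 10.0),
    Sale("bob", 5.0),
    Sale("alice", 7.5),
    Sale("carol", 1.0)
  )

  // Simple query: SELECT customer FROM sales WHERE amount > 4
  val simple: Vector[String] = sales.filter(_.amount > 4).map(_.customer)

  // Count query: SELECT count(*) FROM sales WHERE amount > 4
  val count: Int = sales.count(_.amount > 4)

  // Group by then reduce: SELECT customer, sum(amount) ... GROUP BY customer
  val totals: Map[String, Double] =
    sales.groupMapReduce(_.customer)(_.amount)(_ + _)

  // Group by then count: SELECT customer, count(*) ... GROUP BY customer
  val counts: Map[String, Int] =
    sales.groupMapReduce(_.customer)(_ => 1)(_ + _)

  // Equi-join: rows are matched only on equality of the customer key.
  val regions: Map[String, String] = Map("alice" -> "EU", "bob" -> "US")
  val joined: Vector[(String, Double, String)] =
    for {
      s <- sales
      r <- regions.get(s.customer).toVector
    } yield (s.customer, s.amount, r)
}
```

Unlike this sketch, ra3 evaluates such operators over segmented columns on secondary storage rather than over a collection held in memory.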
Common features of SQL which are not available:
- arbitrary join condition (e.g. join by inequalities)
- complete products of tables (Cartesian products)
- full table sort
- sub-query in filter (WHERE in (select id FROM ..))
Partitioning. ra3 does not maintain random access indexes. Instead, it repartitions (shards / bucketizes) the columns needed by a given group by or join operator so that all rows with the same key land in the same partition, and each partition can complete the operation on its own.
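The idea behind key-based repartitioning can be sketched in plain Scala as follows. This is a minimal illustration, not ra3 code: the use of MurmurHash3 from the Scala standard library, the modulo bucketing, and the names partitionOf and repartition are all assumptions made for the example.

```scala
import scala.util.hashing.MurmurHash3

object PartitioningSketch {
  // Map a key to a partition index in [0, numPartitions).
  // floorMod keeps the result non-negative for negative hash values.
  def partitionOf(key: String, numPartitions: Int): Int =
    java.lang.Math.floorMod(MurmurHash3.stringHash(key), numPartitions)

  // Bucket rows by the partition of their key. Rows with equal keys
  // always hash to the same partition, so a group by or an equi-join
  // on that key can then be completed partition by partition.
  def repartition[V](
      rows: Seq[(String, V)],
      numPartitions: Int
  ): Map[Int, Seq[(String, V)]] =
    rows.groupBy { case (key, _) => partitionOf(key, numPartitions) }
}
```

For a join, both tables would be bucketed with the same function and the same number of partitions, so that matching keys meet in partitions with the same index.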
Language imports. You may choose to import everything from the ra3 package. It does not contain any implicits.
Attributes
Members list
Packages
Type members
Experimental classlikes
Attributes
- Experimental
- true
- Supertypes
-
trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Attributes
- Companion
- trait
- Experimental
- true
- Supertypes
-
trait Sum, trait Mirror, class Object, trait Matchable, class Any
- Self type
-
CSVColumnDefinition.type
Attributes
- Companion
- trait
- Experimental
- true
- Supertypes
-
trait Sum, trait Mirror, class Object, trait Matchable, class Any
- Self type
-
CharacterDecoder.type
Attributes
- Companion
- trait
- Experimental
- true
- Supertypes
-
trait Sum, trait Mirror, class Object, trait Matchable, class Any
- Self type
-
CompressionFormat.type
Attributes
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
- Self type
-
Fnv1.type
Enum for predefined formats parsing string to Instant
Attributes
- Companion
- object
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
- Known subtypes
-
object ISO, class LocalDateTimeAtUTC
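What parsing a string into an Instant under two such formats might look like can be sketched with java.time. This is a hedged illustration only: the Format hierarchy below is hypothetical and merely mirrors the two known subtypes by name, and whether the library's LocalDateTimeAtUTC takes a formatter argument is an assumption.

```scala
import java.time.{Instant, LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter

object InstantParsingSketch {
  // Hypothetical stand-ins for the predefined formats.
  sealed trait Format
  case object Iso extends Format
  final case class LocalAtUtc(formatter: DateTimeFormatter) extends Format

  def parse(s: String, format: Format): Instant = format match {
    // ISO-8601 instant, e.g. 2020-01-02T03:04:05Z
    case Iso => Instant.parse(s)
    // A local date-time without zone information, interpreted as UTC.
    case LocalAtUtc(f) =>
      LocalDateTime.parse(s, f).toInstant(ZoneOffset.UTC)
  }
}
```

With the pattern yyyy-MM-dd HH:mm:ss, the string 2020-01-02 03:04:05 parses to the same Instant as the ISO string 2020-01-02T03:04:05Z.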
Attributes
- Companion
- trait
- Experimental
- true
- Supertypes
-
trait Sum, trait Mirror, class Object, trait Matchable, class Any
- Self type
-
InstantFormat.type
Attributes
- Companion
- object
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
- Known subtypes
-
class LocalDateTimeAtUTC
Attributes
- Companion
- trait
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
- Self type
-
InstantParser.type
Attributes
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
- Self type
-
Murmur3.type
Scala hack to represent generic types which are not Nothing
This is a type class with two ambiguous instances predefined for Nothing
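The general shape of this trick can be sketched as follows. This is a generic rendition of the well-known ambiguous-implicits pattern, not the library's actual source. The names evidence, ambiguous1, ambiguous2 and describe are hypothetical.

```scala
object NotNothingSketch {
  sealed trait NotNothing[T]

  object NotNothing {
    // One instance is available for every type T ...
    implicit def evidence[T]: NotNothing[T] = new NotNothing[T] {}

    // ... but for Nothing two further instances also apply. They make the
    // implicit search for NotNothing[Nothing] ambiguous, so it is expected
    // to be rejected at compile time. The bodies are never evaluated.
    implicit def ambiguous1: NotNothing[Nothing] = sys.error("unreachable")
    implicit def ambiguous2: NotNothing[Nothing] = sys.error("unreachable")
  }

  // A method that requires its type parameter to be something other than Nothing.
  def describe[T](value: T)(implicit ev: NotNothing[T]): String =
    value.toString
}
```

A call such as NotNothingSketch.describe(42) resolves the single generic instance for Int, whereas requesting evidence for Nothing is intended to fail to compile.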
Attributes
- Companion
- object
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
Attributes
- Companion
- class
- Experimental
- true
- Supertypes
-
class Object, trait Matchable, class Any
- Self type
-
NotNothing.type
Reference to a set of aligned columns (i.e. a table) persisted onto secondary storage.
Each table must have a unique identifier, initially given by the importCsv method.
Tables have String column names.
Tables consist of columns. Columns are stored as segments. Segments are the unit of IO operations, i.e. ra3 never reads less than a segment into memory. The in-memory (buffered) counterpart of a segment is a Buffer. The maximum number of elements in a segment is therefore bounded by what fits into a single Java array, which is slightly below 2^31.
All columns of the same table have the same segmentation, i.e. they have the same number of segments, corresponding segments have the same size, and those segments are aligned.
Segments store segment-level statistics, and some operations complete without buffering the segment.
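The segmentation described above can be pictured with a small in-memory model. This sketch is not ra3 code: segments are modelled as plain Vectors, the helper names are hypothetical, and the min/max statistic is only an assumed example of a segment-level statistic.

```scala
object SegmentationSketch {
  // Chunk a column into segments of at most maxSegmentLength elements.
  def segment[A](column: Vector[A], maxSegmentLength: Int): Vector[Vector[A]] = {
    require(maxSegmentLength > 0, "maxSegmentLength must be positive")
    column.grouped(maxSegmentLength).toVector
  }

  // Two columns are aligned if their segments have pairwise equal lengths.
  def segmentLengths[A](segments: Vector[Vector[A]]): Vector[Int] =
    segments.map(_.length)

  // Assumed example of a per-segment statistic.
  final case class MinMax(min: Int, max: Int)

  def stats(segment: Vector[Int]): Option[MinMax] =
    if (segment.isEmpty) None else Some(MinMax(segment.min, segment.max))

  // An equality filter can skip a segment, without reading its contents,
  // when the value lies outside the recorded range.
  def mayContain(s: MinMax, value: Int): Boolean =
    s.min <= value && value <= s.max
}
```

Chunking two columns of equal length with the same maxSegmentLength yields the same segment lengths for both, which is the alignment property stated above.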
Attributes
- Companion
- object
- Experimental
- true
- Supertypes
-
trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Types
Value members
Experimental methods
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Concatenate the list of rows of multiple tables ('grows downwards')
Attributes
- Experimental
- true
Count query consisting of elementwise (row-wise) filter and counting those rows which pass the filter
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Import CSV data into ra3
Value parameters
- columns
-
Description of columns: at a minimum the 0-based column index in the CSV file and the type of the column
- maxSegmentLength
-
Each column will be chunked to this length
- name
-
Name of the table to create, must be unique
Attributes
- Experimental
- true
Import CSV data into ra3
Value parameters
- columns
-
Description of columns: at a minimum the 0-based column index in the CSV file and the type of the column
- maxSegmentLength
-
Each column will be chunked to this length
- name
-
Name of the table to create, must be unique
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Partial reduction
Reduces each segment independently. Returns a single row per segment.
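The relation between a partial and a full reduction can be sketched with plain collections. This is only an illustration under assumptions: segments are modelled as in-memory Vectors, the reduction is a sum, and the names partialSums and fullSum are hypothetical rather than ra3 API.

```scala
object PartialReduceSketch {
  // A column split into three segments.
  val segments: Vector[Vector[Long]] =
    Vector(Vector(1L, 2L, 3L), Vector(4L, 5L), Vector(6L))

  // Partial reduction: each segment is reduced on its own,
  // yielding one result per segment.
  def partialSums(segs: Vector[Vector[Long]]): Vector[Long] =
    segs.map(_.sum)

  // Full reduction: all rows are reduced together into a single result.
  def fullSum(segs: Vector[Vector[Long]]): Long =
    segs.flatten.sum
}
```

Because addition is associative, summing the per-segment results gives the same value as the full reduction, which is what makes a sum distributable. A reduction such as a median does not decompose this way.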
Attributes
- Experimental
- true
Simple query consisting of elementwise (row-wise) projection and filter
Attributes
- Experimental
- true
Full table reduction
Equivalent to a group by into a single group, then reducing that single group. Returns a single row.
This reads all rows of the needed columns into memory. Consider partialReduce instead if the reduction is distributable.
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Simple query consisting of elementwise (row-wise) projection and filter
Attributes
- Experimental
- true
Elementwise or group wise projection
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Experimental fields
The value which encodes a missing string. It is the string of length 1, with content of the \u0001 character.
Attributes
- Experimental
- true
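A sentinel of this kind could be handled as in the sketch below. The constant reproduces the value described above. The object and the helpers isMissing and toOption are hypothetical and not part of ra3.

```scala
object MissingStringSketch {
  // A string of length 1 whose only character is \u0001.
  val MissingString: String = "\u0001"

  def isMissing(s: String): Boolean = s == MissingString

  // Convert the sentinel encoding into an Option.
  // Note that the empty string is a regular, non-missing value.
  def toOption(s: String): Option[String] =
    if (isMissing(s)) None else Some(s)
}
```

Using a reserved one-character string keeps the column a plain sequence of strings, at the cost of not being able to store that exact string as data.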
Implicits
Experimental implicits
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Attributes
- Experimental
- true
Attributes
- Experimental
- true