org.sparklinedata.druid.metadata

StarSchema

case class StarSchema(info: StarSchemaInfo, factTable: StarTable, tableMap: Map[String, StarTable], attrMap: Map[String, StarTable]) extends Product with Serializable

Represents a StarSchema. The Star Schemas we support have the following constraints:

The first two restrictions are an issue only in the most involved star schema models; for example, we show below how TPC-H can be modeled. The third restriction is an implementation issue: when performing QueryPlan rewrites we don't have access to the table an Attribute belongs to. For now we work around this by forcing column names to be unique across the Star Schema.

Tpch Model:

FactTable = LineItem
StarRelations: [
  LineItem - n:1 - Order => [[li_orderkey],[o_orderkey]]
  LineItem - n:1 - PartSupp => [[li_partkey, li_suppkey],[ps_partkey, ps_suppkey]]
  Order - n:1 - Customer => [[o_custkey], [c_custkey]]
  PartSupp - n:1 - Part => [[ps_partkey], [p_partkey]]
  PartSupp - n:1 - Supplier => [[ps_suppkey], [s_suppkey]]
  Customer - n:1 - CustNation => [[c_nationkey], [cn_nationkey]]
  CustNation - n:1 - CustRegion => [[cn_regionkey], [cr_regionkey]]
  Supplier - n:1 - SuppNation => [[s_nationkey], [sn_nationkey]]
  SuppNation - n:1 - SuppRegion => [[sn_regionkey], [sr_regionkey]]
]

Because of our restrictions we have had to model the Nation table as separate CustNation and SuppNation tables. The same separation has to be done for CustRegion and SuppRegion. Having to set up separate entities for Supplier and Customer Nation is not atypical when writing SQL directly; these would be views on the same Nation dimension table. Currently we are more restrictive than this: we require the two views to be tables in the Metastore (this is because during Plan rewrite we lose the Table association in AttributeReferences). Note that this doesn't require the data to be copied; both tables can point to the same underlying data in the storage layer.

We have to rename the columns in the two Nation (and Region) tables. This is so that we can infer the Attribute-to-Table associations (in the Star Schema) in a Query Plan.
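The uniqueness requirement can be illustrated with a minimal sketch. It is not the library's implementation: `Table`, `buildAttrMap`, and the `cn_name`/`sn_name` columns are hypothetical names introduced here, and the map holds table names rather than StarTable nodes. It only shows why renamed columns make a column-name-to-table lookup possible, while the shared TPC-H Nation column names do not.

```scala
object AttrMapSketch {
  // Hypothetical, simplified stand-in for a table in the star schema:
  // just a table name and its column names.
  final case class Table(name: String, columns: Seq[String])

  // Build a columnName -> tableName map. Returns Left with the offending
  // column names if any column name is owned by more than one table.
  def buildAttrMap(tables: Seq[Table]): Either[String, Map[String, String]] = {
    val pairs = for (t <- tables; c <- t.columns) yield c -> t.name
    val duplicates = pairs.groupBy(_._1).collect {
      case (col, owners) if owners.map(_._2).distinct.size > 1 => col
    }.toSeq.sorted
    if (duplicates.nonEmpty) Left("column names not unique: " + duplicates.mkString(", "))
    else Right(pairs.toMap)
  }

  // Renamed columns (cn_*, sn_*): every column maps to exactly one table.
  val renamed = Seq(
    Table("CustNation", Seq("cn_nationkey", "cn_name", "cn_regionkey")),
    Table("SuppNation", Seq("sn_nationkey", "sn_name", "sn_regionkey")))

  // Both tables keep the original Nation column names: ambiguous.
  val clashing = Seq(
    Table("CustNation", Seq("n_nationkey", "n_name", "n_regionkey")),
    Table("SuppNation", Seq("n_nationkey", "n_name", "n_regionkey")))
}
```

With the renamed columns, `buildAttrMap(renamed)` yields a map in which `cn_nationkey` resolves to `CustNation`; with the clashing definition it yields a `Left` naming the ambiguous columns.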

info

the StarSchemaInfo used to build this StarSchema Graph.

factTable

the node that represents the Fact Table

tableMap

maps a tableName to the StarTable node in the StarSchema Graph.

attrMap

provides a mapping from a columnName to its table.

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new StarSchema(info: StarSchemaInfo, factTable: StarTable, tableMap: Map[String, StarTable], attrMap: Map[String, StarTable])

    info

    the StarSchemaInfo used to build this StarSchema Graph.

    factTable

    the node that represents the Fact Table

    tableMap

    maps a tableName to the StarTable node in the StarSchema Graph.

    attrMap

    provides a mapping from a columnName to its table.

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. val attrMap: Map[String, StarTable]

    provides a mapping from a columnName to its table.

  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  10. val factTable: StarTable

    the node that represents the Fact Table

  11. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  13. def getUniqueTable(joinKeys: Seq[Expression]): Option[String]

    The seq of expressions representing one side of a join must all be AttributeReferences and must be from the same table. If this condition is met, the table's name is returned.

    joinKeys

    the expressions representing one side of a join.

    returns

    the table's name, if the condition is met.
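The check described above can be sketched as follows. This is an illustration under stated assumptions, not the actual method: `Expr`, `Attr`, and `Literal` are hypothetical stand-ins for Catalyst's Expression and AttributeReference, the attribute map is passed in explicitly and holds table names, and returning `None` for an empty key list is a choice made here.

```scala
object UniqueTableSketch {
  // Hypothetical stand-ins for Catalyst expressions.
  sealed trait Expr
  final case class Attr(name: String) extends Expr   // plays the role of AttributeReference
  final case class Literal(value: Any) extends Expr  // any non-attribute expression

  // Returns the table name only if every key is an attribute known to
  // attrMap and all keys resolve to the same table.
  def getUniqueTable(attrMap: Map[String, String], joinKeys: Seq[Expr]): Option[String] = {
    val tables: Seq[Option[String]] = joinKeys.map {
      case Attr(name) => attrMap.get(name)
      case _          => None
    }
    if (tables.nonEmpty && tables.forall(_.isDefined) && tables.flatten.distinct.size == 1)
      tables.head
    else None
  }

  // Column-to-table associations taken from the TPC-H model above.
  val attrMap = Map(
    "ps_partkey" -> "PartSupp",
    "ps_suppkey" -> "PartSupp",
    "p_partkey"  -> "Part")
}
```

Under these assumptions, the keys `ps_partkey, ps_suppkey` resolve to `PartSupp`, while a mix of `ps_partkey` and `p_partkey`, or any non-attribute expression, yields `None`.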

  14. val info: StarSchemaInfo

    the StarSchemaInfo used to build this StarSchema Graph.

  15. def isAttributeReference(e: Expression): Boolean

  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. def isStarJoin(leftJoinKeys: Seq[Expression], rightJoinKeys: Seq[Expression]): Option[(String, String)]

    Does the join predicate represented by the left and right join keys match a join in the StarSchema? For example, a join like

    lineitem li join part p on li.l_partkey = p.p_partkey

    is represented as

    Seq(AttributeReference("l_partkey")), Seq(AttributeReference("p_partkey"))

    The following constraints must be met for the joining condition to be a join from this StarSchema:

    • Every joining expression must be an AttributeReference.
    • Each set of join keys (leftJoinKeys, rightJoinKeys) must reference exactly one table.
    • The two tables must be related in the StarSchema.
    • The matching Attributes in the input (leftJoinKeys, rightJoinKeys) must exactly match the joining key defined in the StarSchema for the two tables involved.

    leftJoinKeys
    rightJoinKeys
    returns
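The four constraints listed for isStarJoin can be sketched as below. This is a simplified model rather than the real implementation: `Expr`, `Attr`, `Other`, and `Relation` are hypothetical types, only three relations from the TPC-H model are included, and two behaviours are assumptions made for illustration: the returned pair holds the two table names, and a join is accepted in either orientation.

```scala
object StarJoinSketch {
  // Hypothetical stand-ins for Catalyst expressions.
  sealed trait Expr
  final case class Attr(name: String) extends Expr   // plays the role of AttributeReference
  final case class Other(desc: String) extends Expr  // any other expression

  // One edge of the star schema: `left` joins `right` on the paired key columns.
  final case class Relation(left: String, right: String, keys: Seq[(String, String)])

  val relations = Seq(
    Relation("LineItem", "Order", Seq("li_orderkey" -> "o_orderkey")),
    Relation("LineItem", "PartSupp",
      Seq("li_partkey" -> "ps_partkey", "li_suppkey" -> "ps_suppkey")),
    Relation("PartSupp", "Part", Seq("ps_partkey" -> "p_partkey")))

  // columnName -> tableName, derived from the join keys only.
  val attrMap: Map[String, String] = relations.flatMap { r =>
    r.keys.flatMap { case (l, rt) => Seq(l -> r.left, rt -> r.right) }
  }.toMap

  // Constraints 1 and 2: all keys are attributes and belong to one table.
  private def uniqueTable(keys: Seq[Expr]): Option[String] = {
    val tables = keys.map {
      case Attr(n) => attrMap.get(n)
      case _       => None
    }
    if (tables.nonEmpty && tables.forall(_.isDefined) && tables.flatten.distinct.size == 1)
      tables.head
    else None
  }

  // Constraints 3 and 4: the two tables are related, and the key pairs
  // exactly match the relation's join key (in either orientation).
  def isStarJoin(leftKeys: Seq[Expr], rightKeys: Seq[Expr]): Option[(String, String)] =
    for {
      lt <- uniqueTable(leftKeys)
      rt <- uniqueTable(rightKeys)
      if leftKeys.size == rightKeys.size
      pairs = leftKeys.zip(rightKeys).collect { case (Attr(l), Attr(r)) => l -> r }.toSet
      if relations.exists { rel =>
        (rel.left == lt && rel.right == rt && rel.keys.toSet == pairs) ||
        (rel.left == rt && rel.right == lt && rel.keys.map(_.swap).toSet == pairs)
      }
    } yield (lt, rt)
}
```

In this sketch a join on only `li_partkey = ps_partkey` is rejected, because the LineItem to PartSupp relation is defined on the composite key and the match must be exact.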

  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. def prettyString: String

  22. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  23. val tableMap: Map[String, StarTable]

    maps a tableName to the StarTable node in the StarSchema Graph.

  24. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
