case class Table(columns: Vector[Column], colNames: Vector[String], uniqueId: String, partitions: Option[PartitionData]) extends RelationalAlgebra with Product with Serializable

Linear Supertypes
Serializable, Product, Equals, RelationalAlgebra, AnyRef, Any

Instance Constructors

  1. new Table(columns: Vector[Column], colNames: Vector[String], uniqueId: String, partitions: Option[PartitionData])

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def addColOfSameSegmentation(c: Column, colName: String)(implicit tsc: TaskSystemComponents): IO[Table]

    Concatenates the list of columns.

    Definition Classes
    RelationalAlgebra
  5. def addColumnFromSeq(tag: ColumnTag, columnName: String)(elems: Seq[Elem])(implicit tsc: TaskSystemComponents): IO[Table]
  6. def apply(i: Int): Column
  7. def apply(s: String): Column
  8. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  9. def bufferSegment(idx: Int)(implicit tsc: TaskSystemComponents): IO[BufferedTable]
  10. def bufferStream(implicit tsc: TaskSystemComponents): Stream[IO, BufferedTable]
  11. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
  12. val colNames: Vector[String]
  13. val columns: Vector[Column]
  14. def concatenate(others: Table*)(implicit tsc: TaskSystemComponents): IO[Table]

    This is almost a no-op: it concatenates the lists of segments.

    Definition Classes
    RelationalAlgebra
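
    To illustrate why concatenation can be almost free, here is a minimal sketch in plain Scala. It is not the ra3 API: `SegmentRef`, `Column` and `concatenate` below are hypothetical stand-ins, and the sketch assumes that a column is just a list of references to stored segments and that all tables have the same number of columns.

```scala
// Sketch only: hypothetical stand-ins, not ra3 types.
object ConcatSketch {
  // A reference to a stored segment; the data itself is never touched.
  final case class SegmentRef(id: String, numRows: Int)
  type Column = Vector[SegmentRef]

  // Each table is a Vector of columns. Concatenation appends the segment
  // lists of the columns at the same position; no segment is read or copied.
  def concatenate(tables: Vector[Vector[Column]]): Vector[Column] =
    tables.transpose.map(_.flatten)
}
```

    Under this assumption the cost depends on the number of segments, not on the number of rows.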
  15. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. def equijoin(other: Table, joinColumnSelf: Int, joinColumnOther: Int, how: String, partitionBase: Int, partitionLimit: Int, maxSegmentsToBufferAtOnce: Int)(query: (TableReference, TableReference) => Expr { type T <: ra3.lang.ReturnValue })(implicit tsc: TaskSystemComponents): IO[Table]

    • Partition both tables by the join column
    • For each pair of matching partitions of the two input tables:
    • Buffer the partition completely (all segments, all columns)
    • Join the buffered tables in memory (use saddle's Index?)
    • Concatenate the joined partitions
    Definition Classes
    RelationalAlgebra
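
    The steps listed for equijoin can be sketched with plain Scala collections. This is an illustration under assumptions, not the ra3 implementation: rows are `Vector[String]`, partitioning uses the hash code of the join value, only an inner join is shown, and `partitionBy`, `joinPartition` and `innerJoin` are hypothetical names.

```scala
// Sketch only: plain collections stand in for ra3 tables and segments.
object EquiJoinSketch {
  type Row = Vector[String]

  // Step 1: partition rows by a hash of the join column.
  def partitionBy(rows: Seq[Row], joinCol: Int, numPartitions: Int): Map[Int, Seq[Row]] =
    rows.groupBy(r => math.floorMod(r(joinCol).hashCode, numPartitions))

  // Steps 2 to 4: with both partitions fully in memory, build a lookup
  // on the right side and probe it with every row of the left side.
  def joinPartition(left: Seq[Row], right: Seq[Row], leftCol: Int, rightCol: Int): Seq[Row] = {
    val index: Map[String, Seq[Row]] = right.groupBy(_(rightCol))
    for {
      l <- left
      r <- index.getOrElse(l(leftCol), Seq.empty)
    } yield l ++ r
  }

  // Step 5: concatenate the joined partitions.
  def innerJoin(left: Seq[Row], right: Seq[Row], leftCol: Int, rightCol: Int, numPartitions: Int): Seq[Row] = {
    val lp = partitionBy(left, leftCol, numPartitions)
    val rp = partitionBy(right, rightCol, numPartitions)
    (0 until numPartitions).flatMap { p =>
      joinPartition(lp.getOrElse(p, Seq.empty), rp.getOrElse(p, Seq.empty), leftCol, rightCol)
    }
  }
}
```

    Equal join values always hash to the same partition number, so only partitions with the same number need to be joined with each other.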
  17. def equijoinMultiple(joinColumnSelf: Int, others: Seq[(Table, Int, String, Int)], partitionBase: Int, partitionLimit: Int)(query: (Seq[TableReference]) => Expr { type T <: ra3.lang.ReturnValue })(implicit tsc: TaskSystemComponents): IO[Table]
    Definition Classes
    RelationalAlgebra
  18. def exportToCsv(columnSeparator: Char = ',', quoteChar: Char = '"', recordSeparator: String = "\r\n", compression: Option[CompressionFormat] = Some(ExportCsv.Gzip))(implicit tsc: TaskSystemComponents): IO[List[SharedFile]]
    Definition Classes
    RelationalAlgebra
  19. def filterColumnNames(nameSuffix: String)(p: (String) => Boolean): Table
    Definition Classes
    RelationalAlgebra
  20. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  21. def groupBy(cols: Seq[Int], partitionBase: Int, partitionLimit: Int, maxSegmentsToBufferAtOnce: Int)(implicit tsc: TaskSystemComponents): IO[GroupedTable]

    Group by that returns the group locations.

    Returns a triple for each input segment: group map, number of groups, group sizes.

    Definition Classes
    RelationalAlgebra
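
    As a rough illustration of the triple described for groupBy, the sketch below computes a group map, the number of groups and the group sizes for one in-memory segment of keys. It assumes that the group map assigns a dense group id to every row in order of first appearance. `groupSegment` is a hypothetical helper, not part of ra3.

```scala
// Sketch only: one in-memory segment of keys, not an ra3 segment.
object GroupBySketch {
  // Returns (groupMap, numberOfGroups, groupSizes).
  def groupSegment[K](keys: Vector[K]): (Vector[Int], Int, Vector[Int]) = {
    // Assign ids 0, 1, 2, ... to keys in order of first appearance.
    val ids = scala.collection.mutable.LinkedHashMap.empty[K, Int]
    val groupMap = keys.map(k => ids.getOrElseUpdate(k, ids.size))
    // Count how many rows fall into each group.
    val sizes = Array.fill(ids.size)(0)
    groupMap.foreach(g => sizes(g) += 1)
    (groupMap, ids.size, sizes.toVector)
  }
}
```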
  22. def groupBySegments(cols: Seq[Int])(implicit tsc: TaskSystemComponents): IO[GroupedTable]

    Group by without partitioning

    Useful to reduce the segments without partitioning

    Definition Classes
    RelationalAlgebra
  23. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  24. def mapColIndex(f: (String) => String): Table
  25. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  26. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  27. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  28. def numCols: Int
  29. def numRows: Long
  30. def partition(columnIdx: Seq[Int], partitionBase: Int, numPartitionsIsImportant: Boolean, partitionLimit: Int, maxSegmentsToBufferAtOnce: Int)(implicit tsc: TaskSystemComponents): IO[Vector[PartitionedTable]]
    Definition Classes
    RelationalAlgebra
  31. val partitions: Option[PartitionData]
  32. def pivot(columnGroupRows: Int, columnGroupColumns: Int, valueColumn: Int): IO[Table]

    Pivot is two nested group-bys followed by aggregation and rearranging the results into a new table.

    • Get all distinct elements of columnGroupColumns. Use group by for this. This is the new list of columns.
    • Partition by columnGroupRows
    • Buffer all three columns of a partition, and pivot it in memory. Use the list of columns, place nulls if needed.
    • Concatenate
    Definition Classes
    RelationalAlgebra
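
    The pivot steps can be sketched on an in-memory list of (row key, column key, value) triples. This is an illustration only: the aggregation is assumed to be a sum, missing cells are represented with `None` instead of nulls, the new columns are sorted for a stable order, and `pivot` here is a hypothetical function with a different signature from the method documented in this entry.

```scala
// Sketch only: in-memory triples stand in for the three buffered columns.
object PivotSketch {
  def pivot(rows: Seq[(String, String, Int)]): (Vector[String], Map[String, Vector[Option[Int]]]) = {
    // Step 1: the distinct column keys become the new list of columns.
    val newColumns = rows.map(_._2).distinct.sorted.toVector
    // Steps 2 and 3: group by row key, aggregate per column key,
    // then lay the aggregates out along the new column list.
    val table = rows.groupBy(_._1).map { case (rowKey, group) =>
      val sums = group.groupBy(_._2).map { case (c, g) => c -> g.map(_._3).sum }
      rowKey -> newColumns.map(sums.get)
    }
    (newColumns, table)
  }
}
```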
  33. def prePartition(columnIdx: Seq[Int], partitionBase: Int, partitionLimit: Int, maxSegmentsToBufferAtOnce: Int)(implicit tsc: TaskSystemComponents): IO[Table]
    Definition Classes
    RelationalAlgebra
  34. def productElementNames: Iterator[String]
    Definition Classes
    Product
  35. def query(query: (TableReference) => Query)(implicit tsc: TaskSystemComponents): IO[Table]
    Definition Classes
    RelationalAlgebra
  36. def reduceTable(query: (TableReference) => Expr { type T <: ra3.lang.ReturnValue })(implicit tsc: TaskSystemComponents): IO[Table]
    Definition Classes
    RelationalAlgebra
  37. def rfilter(predicate: Column)(implicit tsc: TaskSystemComponents): IO[Table]

    Variant which takes BufferedTable => BufferInt

    • Align the predicate segments with the table segmentation
    • For each aligned predicate segment, buffer it
    • For each column
      • For each segment in the column
        • Buffer the column segment
        • Apply the buffered predicate segment to the buffered column segment
        • Write the applied buffer to a local segment
    • Resegment

    Definition Classes
    RelationalAlgebra
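
    A minimal sketch of the row filter described for rfilter, using nested vectors as segmented columns. It assumes that the predicate is already aligned with the table's segmentation and that a positive predicate value means "keep the row"; both are assumptions of this sketch. The final resegmentation step is omitted, and the names are hypothetical.

```scala
// Sketch only: nested vectors stand in for segmented ra3 columns.
object RFilterSketch {
  type Segment = Vector[Int]
  type Column = Vector[Segment]

  // Assumes predicate segment i has the same length as data segment i
  // and that a value greater than zero means "keep".
  def rfilter(columns: Vector[Column], predicate: Column): Vector[Column] =
    columns.map { column =>
      column.zip(predicate).map { case (segment, mask) =>
        segment.zip(mask).collect { case (value, keep) if keep > 0 => value }
      }
    }
}
```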
  38. def rfilterInEquality(columnIdx: Int, cutoff: Segment, lessThan: Boolean)(implicit tsc: TaskSystemComponents): IO[Table]
    Definition Classes
    RelationalAlgebra
  39. def segmentation: AbstractSeq[Int] with StrictOptimizedSeqOps[Int, [_]AbstractSeq[_] with StrictOptimizedSeqOps[_, [_]AbstractSeq[_] with StrictOptimizedSeqOps[_, [_]AbstractSeq[_] with DefaultSerializable, AbstractSeq[_] with DefaultSerializable] with DefaultSerializable, AbstractSeq[_] with StrictOptimizedSeqOps[_, [_]AbstractSeq[_] with DefaultSerializable, AbstractSeq[_] with DefaultSerializable] with DefaultSerializable] with DefaultSerializable { def iterableFactory: scala.collection.SeqFactory[[A]scala.collection.immutable.AbstractSeq[A] with scala.collection.generic.DefaultSerializable] }, AbstractSeq[Int] with StrictOptimizedSeqOps[Int, [_]AbstractSeq[_] with StrictOptimizedSeqOps[_, [_]AbstractSeq[_] with DefaultSerializable, AbstractSeq[_] with DefaultSerializable] with DefaultSerializable, AbstractSeq[Int] with StrictOptimizedSeqOps[Int, [_]AbstractSeq[_] with DefaultSerializable, AbstractSeq[Int] with DefaultSerializable] with DefaultSerializable] with DefaultSerializable { def iterableFactory: scala.collection.SeqFactory[[A]scala.collection.immutable.AbstractSeq[A] with scala.collection.generic.DefaultSerializable] }] with DefaultSerializable { ... /* 3 definitions in type refinement */ }
  40. def selectColumns(columnIndexes: Int*)(implicit tsc: TaskSystemComponents): IO[Table]

    This is almost a no-op: it selects columns.

    Definition Classes
    RelationalAlgebra
  41. def showSample(nrows: Int = 100, ncols: Int = 10)(implicit tsc: TaskSystemComponents): IO[String]
  42. def sort(sortColumn: Int, ascending: Boolean): IO[Table]

    Sorting

    We sort with a parallel distributed sort.

    We sort on only one column.

    • We need an estimate of the CDF (see doc of other method)
    • From the approximate CDF we select n values which partition the data evenly into n+1 partitions
    • We write those partitions (all columns)
    • Sort the partitions (all columns)
    • Rearrange the sorted partitions in the correct order
    Definition Classes
    RelationalAlgebra
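
    The sorting strategy described in this entry resembles a sample sort. The sketch below shows the idea on in-memory integer segments: it estimates the CDF from an evenly spaced sample, picks n cut points, routes values into n+1 partitions, sorts each partition, and emits them in order. The sampling scheme and all names are assumptions of this sketch, and it handles a single column only.

```scala
// Sketch only: a single-column, in-memory sample sort.
object SampleSortSketch {
  def sampleSort(segments: Vector[Vector[Int]], n: Int, samplesPerSegment: Int): Vector[Vector[Int]] = {
    require(n >= 0 && samplesPerSegment > 0)
    // Step 1: approximate the CDF with an evenly spaced sample of each segment.
    val sample = segments.flatMap { s =>
      val step = math.max(1, s.size / samplesPerSegment)
      s.indices.by(step).map(s)
    }.sorted
    // Step 2: pick n cut points at evenly spaced quantiles of the sample.
    val cuts =
      if (sample.isEmpty) Vector.empty[Int]
      else (1 to n).map(i => sample(i * sample.size / (n + 1))).toVector
    // Step 3: a value goes to the partition numbered by how many cut points
    // are less than or equal to it, so partition numbers grow with the value.
    val partitions = segments.flatten.groupBy(v => cuts.count(_ <= v))
    // Step 4: sort each partition and emit the partitions in order.
    (0 to n).map(p => partitions.getOrElse(p, Vector.empty[Int]).sorted).toVector
  }
}
```

    Because every value in partition p is smaller than every value in partition p+1, concatenating the sorted partitions yields a fully sorted result.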
  43. def stringify(segmentIdx: Int = 0, nrows: Int = 10, ncols: Int = 10)(implicit tsc: TaskSystemComponents): IO[String]
  44. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  45. def take(indexes: Int32Column)(implicit tsc: TaskSystemComponents): IO[Table]

    • For each aligned index segment, buffer it
    • For each column
      • For each segment in the column
        • Buffer the column segment
        • Apply the buffered index segment to the buffered column segment
        • Write the applied buffer to a segment and upload it

    indexes: for each segment

    Definition Classes
    RelationalAlgebra
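
    A minimal sketch of taking rows by index, segment by segment. It assumes that index segment i contains positions local to data segment i, which is one reading of the "for each segment" note in this entry. Nested vectors stand in for segmented columns, and the names are hypothetical.

```scala
// Sketch only: nested vectors stand in for segmented ra3 columns.
object TakeSketch {
  type Segment = Vector[Int]
  type Column = Vector[Segment]

  // Assumes index segment i holds positions local to data segment i.
  def take(columns: Vector[Column], indexes: Column): Vector[Column] =
    columns.map { column =>
      column.zip(indexes).map { case (segment, idx) => idx.map(segment) }
    }
}
```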
  46. def toString(): String
    Definition Classes
    Table → AnyRef → Any
  47. def topK(sortColumn: Int, ascending: Boolean, k: Int, cdfCoverage: Double, cdfNumberOfSamplesPerSegment: Int)(implicit tsc: TaskSystemComponents): IO[Table]

    Top K selection

    • We need an estimate of the CDF
    • From the approximate CDF we select the value V below which K elements fall
    • Scan all segments and find the index set which picks those elements below V. TakeIndex on all columns
    • Rearrange into table
    Definition Classes
    RelationalAlgebra
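
    The top K steps can be sketched on in-memory integer segments. The sketch estimates a cutoff V from a sample, keeps only the values at or below V, and then selects the K smallest exactly. The fallback to a full scan when the estimate yields fewer than K candidates is an assumption of this sketch, as are all names; it covers the ascending case on a single column only.

```scala
// Sketch only: ascending top K on a single in-memory column.
object TopKSketch {
  def topK(segments: Vector[Vector[Int]], k: Int, samplesPerSegment: Int): Vector[Int] = {
    require(samplesPerSegment > 0)
    val total = segments.map(_.size).sum
    if (total == 0 || k <= 0) Vector.empty
    else {
      // Step 1: approximate the CDF with an evenly spaced sample of each segment.
      val sample = segments.flatMap { s =>
        val step = math.max(1, s.size / samplesPerSegment)
        s.indices.by(step).map(s)
      }.sorted
      // Step 2: choose V as the sample value at the quantile k / total.
      val rank = math.min(sample.size - 1, math.ceil(sample.size * k.toDouble / total).toInt)
      val cutoff = sample(rank)
      // Step 3: scan all segments and keep the values at or below V.
      val candidates = segments.flatten.filter(_ <= cutoff)
      // If the estimate was too low, fall back to all values (sketch-only choice).
      val pool = if (candidates.size >= k) candidates else segments.flatten
      // Step 4: exact selection among the remaining candidates.
      pool.sorted.take(k)
    }
  }
}
```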
  48. val uniqueId: String
  49. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  50. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  51. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
