package io.eels.datastream


Type Members

  1. trait Aggregation extends AnyRef

  2. trait DataStream extends Logging


    A DataStream is similar to a table of data. It has fields (like columns) and rows of data. Each row has an entry for each field (which may be null, depending on the field definition).

    It is a lazily evaluated data structure. Each operation on a stream will create a new derived stream, but those operations will only occur when a final action is performed.
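
    The lazy behaviour described above can be illustrated with a minimal sketch in plain Scala. `SketchStream` and `LazySketch` are hypothetical names invented for this example and are not part of the io.eels API; the sketch only assumes that a derived stream can be modelled as a wrapped row-producing function.

    ```scala
    // Hypothetical stand-in for illustration only; this is NOT the io.eels DataStream API.
    final case class SketchStream(fields: Vector[String], rows: () => Iterator[Vector[Any]]) {

      // Deriving a stream only wraps the row function; no rows are read here.
      def filter(p: Vector[Any] => Boolean): SketchStream =
        SketchStream(fields, () => this.rows().filter(p))

      // An action: this is the point at which rows are actually produced.
      def collect(): Vector[Vector[Any]] = rows().toVector
    }

    object LazySketch {
      // Counts how many times the (pretend) source has been read.
      var loads = 0

      val source = SketchStream(
        Vector("name", "age"),
        () => {
          loads += 1
          Iterator(Vector[Any]("alice", 31), Vector[Any]("bob", 17), Vector[Any]("carol", null))
        }
      )

      // A derived stream; a null age does not match the Int pattern and is dropped.
      val adults = source.filter { row =>
        row(1) match {
          case age: Int => age >= 18
          case _        => false
        }
      }

      def main(args: Array[String]): Unit = {
        println(loads)            // 0: defining the filter did not read the source
        println(adults.collect()) // Vector(Vector(alice, 31))
        println(loads)            // 1: the action triggered a single read
      }
    }
    ```

    In this sketch `filter` returns immediately, and the source is only read once `collect()` is called.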

    You can create a DataStream from an IO source, such as a Parquet file or a Hive table, or you can create a fully evaluated one from an in-memory structure. In the former case, the data is only loaded on demand when an action is performed.

    A DataStream is split into one or more flows. Each flow can operate independently of the others. For example, if you filter a stream, each flow is filtered separately, which allows the work to be parallelized. If you write out a stream, each flow (partition) can be written to its own file, again allowing parallelization.
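
    The idea of filtering each flow independently can be sketched as follows. This is a toy model in plain Scala, not the io.eels implementation: `FlowSketch`, `filterFlows`, `hasEvenId` and the `Row`/`Flow` aliases are hypothetical names, and the sketch assumes a flow can be represented as an in-memory chunk of rows.

    ```scala
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._

    // Hypothetical model for illustration only; NOT the io.eels API.
    object FlowSketch {
      type Row  = Vector[Any]
      type Flow = Vector[Row]

      // Filters every flow in its own Future, so flows can be processed in parallel.
      // The result keeps one output flow per input flow, in the same order.
      def filterFlows(flows: Seq[Flow])(p: Row => Boolean): Seq[Flow] = {
        val futures = flows.map(flow => Future(flow.filter(p)))
        Await.result(Future.sequence(futures), 10.seconds)
      }

      // Two flows of (name, id) rows.
      val flows: Seq[Flow] = Seq(
        Vector(Vector[Any]("a", 1), Vector[Any]("b", 2)),
        Vector(Vector[Any]("c", 3), Vector[Any]("d", 4))
      )

      // Example predicate: keep rows whose second entry is an even Int.
      val hasEvenId: Row => Boolean = row =>
        row(1) match {
          case id: Int => id % 2 == 0
          case _       => false
        }

      def main(args: Array[String]): Unit =
        println(filterFlows(flows)(hasEvenId))
    }
    ```

    Because each flow is filtered on its own, no coordination between flows is needed until the results are gathered.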

  3. class DataStreamPublisher extends DataStream


    An implementation of DataStream for which items are emitted by calling publish. When no more items are to be published, call close() so that downstream subscribers can complete.

    Subscribers to this publisher block as usual, so they should normally be run on a separate thread.
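
    The publish/close pattern and the reason for a separate subscriber thread can be sketched with a blocking queue. `SketchPublisher` and `PublisherSketch` are hypothetical names for this example only; the real DataStreamPublisher signatures may differ, and this sketch assumes a single subscriber.

    ```scala
    import java.util.concurrent.LinkedBlockingQueue
    import scala.collection.mutable.ArrayBuffer

    // Hypothetical stand-in for illustration only; NOT the io.eels DataStreamPublisher API.
    final class SketchPublisher[T] {
      // None is used as an end-of-stream marker.
      private val queue = new LinkedBlockingQueue[Option[T]]()

      def publish(item: T): Unit = queue.put(Some(item))

      // Signals that no more items will be published.
      def close(): Unit = queue.put(None)

      // Blocks the calling thread until close() has been called (single subscriber only).
      def subscribe(onNext: T => Unit): Unit = {
        var next = queue.take()
        while (next.isDefined) {
          onNext(next.get)
          next = queue.take()
        }
      }
    }

    object PublisherSketch {
      def demo(): Vector[Int] = {
        val publisher = new SketchPublisher[Int]
        val received  = ArrayBuffer.empty[Int]

        // The subscriber blocks, so it runs on its own thread.
        val consumer = new Thread(new Runnable {
          def run(): Unit = publisher.subscribe { i => received += i; () }
        })
        consumer.start()

        (1 to 3).foreach(publisher.publish)
        publisher.close()  // lets the subscriber complete
        consumer.join()    // wait for the subscriber before reading its results
        received.toVector
      }

      def main(args: Array[String]): Unit = println(demo()) // Vector(1, 2, 3)
    }
    ```

    Without the call to `close()`, the subscriber thread in this sketch would wait forever on the queue.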

  4. class DataStreamSource extends DataStream with Using with Logging

  5. abstract class DefaultAggregation extends Aggregation

  6. class DelegateSubscriber[T] extends Subscriber[T]

  7. class ExistsSubscriber extends Subscriber[Seq[Row]] with Logging

  8. class FindSubscriber extends Subscriber[Seq[Row]] with Logging

  9. trait GroupedDataStream extends AnyRef

  10. case class IteratorAction(ds: DataStream) extends Product with Serializable

  11. trait Publisher[T] extends AnyRef

  12. case class SinkAction(ds: DataStream, sink: Sink, parallelism: Int) extends Logging with Product with Serializable

  13. trait Subscriber[T] extends AnyRef

  14. trait Subscription extends AnyRef


Value Members

  1. object Aggregation

  2. object DataStream

  3. object GroupedDataStream

  4. object Publisher extends Logging

  5. object Subscription

