package sources

Type Members

  1. class ConsoleWrite extends StreamingWrite with Logging

    Common methods used to create writes for the console sink.

  2. class ContinuousMemoryStream[A] extends MemoryStreamBase[A] with ContinuousStream

    The overall strategy here is:

    • ContinuousMemoryStream maintains a list of records for each partition. addData() will distribute records evenly-ish across partitions.
    • RecordEndpoint is set up as an endpoint for executor-side ContinuousMemoryStreamInputPartitionReader instances to poll. It returns the record at the specified offset within the list, or null if that offset doesn't yet have a record.

    (See the usage sketch after this list.)

  3. case class ContinuousMemoryStreamInputPartition(driverEndpointName: String, partition: Int, startOffset: Int) extends InputPartition with Product with Serializable

    An input partition for continuous memory stream.

  4. case class ContinuousMemoryStreamOffset(partitionNums: Map[Int, Int]) extends connector.read.streaming.Offset with Product with Serializable
  5. class ContinuousMemoryStreamPartitionReader extends ContinuousPartitionReader[InternalRow]

    An input partition reader for continuous memory stream. Polls the driver endpoint for new records.

  6. class ForeachBatchSink[T] extends Sink
  7. class ForeachDataWriter[T] extends DataWriter[InternalRow]

    A DataWriter which writes data in this partition to a ForeachWriter.

    Type parameter T: the type expected by the writer.

  8. case class ForeachWriterFactory[T](writer: ForeachWriter[T], rowConverter: (InternalRow) ⇒ T) extends StreamingDataWriterFactory with Product with Serializable
  9. case class ForeachWriterTable[T](writer: ForeachWriter[T], converter: Either[ExpressionEncoder[T], (InternalRow) ⇒ T]) extends Table with SupportsWrite with Product with Serializable

    A write-only table for forwarding data into the specified ForeachWriter.

    Type parameter T: the expected type of the sink.

    writer: the ForeachWriter that processes all data.

    converter: an object that converts internal rows to the target type T; it can be either an ExpressionEncoder or a direct converter function.

    (See the usage sketch after this list.)

  10. class MemoryDataWriter extends DataWriter[InternalRow] with Logging
  11. case class MemoryPlan(sink: MemorySink, output: Seq[Attribute]) extends LeafNode with Product with Serializable

    Used to query the data that has been written into a MemorySink.

  12. class MemorySink extends Table with SupportsWrite with Logging

    A sink that stores the results in memory. This org.apache.spark.sql.execution.streaming.Sink is primarily intended for use in unit tests and does not provide durability. (See the example after this list.)

  13. class MemoryStreamingWrite extends StreamingWrite
  14. case class MemoryWriterCommitMessage(partition: Int, data: Seq[Row]) extends WriterCommitMessage with Product with Serializable
  15. case class MemoryWriterFactory(schema: StructType) extends DataWriterFactory with StreamingDataWriterFactory with Product with Serializable
  16. class MicroBatchWrite extends BatchWrite

    A BatchWrite used to hook V2 stream writers into a microbatch plan. It implements the non-streaming interface, forwarding the epoch ID determined at construction to a wrapped streaming write support. (See the sketch after this list.)

  17. class MicroBatchWriterFactory extends DataWriterFactory
  18. case class PackedRowCommitMessage(rows: Array[InternalRow]) extends WriterCommitMessage with Product with Serializable

    Commit message for a PackedRowDataWriter, containing all the rows written in the most recent interval.

  19. class PackedRowDataWriter extends DataWriter[InternalRow] with Logging

    A simple DataWriter that just sends all the rows it's received as a commit message.

  20. trait PythonForeachBatchFunction extends AnyRef

    Interface that is meant to be extended by Python classes via Py4J. Py4J allows Python classes to implement Java interfaces so that the JVM can call back into Python objects. In this case, this allows the user-defined Python foreachBatch function to be called from the JVM while the query is active. (See the foreachBatch example after this list.)

  21. case class RateStreamMicroBatchInputPartition(partitionId: Int, numPartitions: Int, rangeStart: Long, rangeEnd: Long, localStartTimeMs: Long, relativeMsPerValue: Double) extends InputPartition with Product with Serializable
  22. class RateStreamMicroBatchPartitionReader extends PartitionReader[InternalRow]
  23. class RateStreamMicroBatchStream extends MicroBatchStream with Logging
  24. class RateStreamProvider extends SimpleTableProvider with DataSourceRegister

    A source that generates incrementing long values with timestamps. Each generated row has two columns: a timestamp column for the generated time and an auto-increment long column starting with 0L.

    This source supports the following options:

    • rowsPerSecond (e.g. 100, default: 1): How many rows should be generated per second.
    • rampUpTime (e.g. 5s, default: 0s): How long to ramp up before the generation speed reaches rowsPerSecond. Granularities finer than seconds are truncated to integer seconds.
    • numPartitions (e.g. 10, default: Spark's default parallelism): The partition number for the generated rows. The source will try its best to reach rowsPerSecond, but the query may be resource constrained; numPartitions can be tweaked to help reach the desired speed.

    (See the read example after this list.)
  25. class RateStreamTable extends Table with SupportsRead
  26. case class TextSocketInputPartition(slice: ListBuffer[(UTF8String, Long)]) extends InputPartition with Product with Serializable
  27. class TextSocketMicroBatchStream extends MicroBatchStream with Logging

    A MicroBatchReadSupport that reads text lines through a TCP socket, designed only for tutorials and debugging. This MicroBatchReadSupport will *not* work in production applications due to multiple reasons, including no support for fault recovery. (See the example after this list.)

  28. class TextSocketSourceProvider extends SimpleTableProvider with DataSourceRegister with Logging
  29. class TextSocketTable extends Table with SupportsRead
  30. case class WriteToMicroBatchDataSource(write: StreamingWrite, query: LogicalPlan) extends LogicalPlan with Product with Serializable

    The logical plan for writing data to a micro-batch stream.

    Note that this logical plan does not have a corresponding physical plan, as it will be converted to WriteToDataSourceV2 with MicroBatchWrite before execution.
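
Example: ContinuousMemoryStream. A minimal test-style sketch of the strategy described above. It assumes the companion object's apply takes an implicit Encoder and SQLContext, as in Spark's own test suites:

    import org.apache.spark.sql.{SparkSession, SQLContext}
    import org.apache.spark.sql.streaming.Trigger
    import org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryStream

    val spark = SparkSession.builder().master("local[2]").getOrCreate()
    implicit val sqlContext: SQLContext = spark.sqlContext
    import spark.implicits._

    // Records live in per-partition lists on the driver; addData()
    // spreads new records "evenly-ish" across those lists.
    val input = ContinuousMemoryStream[Int]
    input.addData(1, 2, 3, 4)

    // Executor-side readers poll the driver's RecordEndpoint for the
    // record at their current offset (null until one arrives).
    val query = input.toDF()
      .writeStream
      .format("console")
      .trigger(Trigger.Continuous("1 second"))
      .start()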
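
Example: the foreach sink. ForeachWriterTable, ForeachWriterFactory, and ForeachDataWriter are the internal plumbing behind the public foreach API; a minimal sketch, assuming streamingDs is an existing streaming Dataset[String]:

    import org.apache.spark.sql.ForeachWriter

    val query = streamingDs.writeStream
      .foreach(new ForeachWriter[String] {
        def open(partitionId: Long, epochId: Long): Boolean = true  // e.g. open a connection
        def process(value: String): Unit = println(value)           // handle one record
        def close(errorOrNull: Throwable): Unit = ()                // release resources
      })
      .start()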
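
Example: the memory sink. MemorySink backs format("memory"); results accumulate in driver memory under the query name, so it suits tests only. Assumes streamingDf and spark come from the surrounding test:

    val query = streamingDf.writeStream
      .format("memory")
      .queryName("numbers")
      .outputMode("append")
      .start()

    query.processAllAvailable()
    spark.table("numbers").show()  // inspect what the sink has collected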
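
Example: the epoch-pinning idea behind MicroBatchWrite. A hedged sketch with invented names, not Spark's actual implementation: pin the epoch ID at construction and forward each BatchWrite call to the wrapped StreamingWrite:

    import org.apache.spark.sql.connector.write.{BatchWrite, DataWriterFactory, PhysicalWriteInfo, WriterCommitMessage}
    import org.apache.spark.sql.connector.write.streaming.StreamingWrite

    class EpochPinnedBatchWrite(epochId: Long, streaming: StreamingWrite) extends BatchWrite {
      override def createBatchWriterFactory(info: PhysicalWriteInfo): DataWriterFactory = {
        val streamingFactory = streaming.createStreamingWriterFactory(info)
        // The streaming factory also needs an epoch; supply the pinned one.
        new DataWriterFactory {
          override def createWriter(partitionId: Int, taskId: Long) =
            streamingFactory.createWriter(partitionId, taskId, epochId)
        }
      }
      override def commit(messages: Array[WriterCommitMessage]): Unit =
        streaming.commit(epochId, messages)
      override def abort(messages: Array[WriterCommitMessage]): Unit =
        streaming.abort(epochId, messages)
    }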
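
Example: foreachBatch. ForeachBatchSink backs the public foreachBatch API; PythonForeachBatchFunction plays the equivalent role for Python callers via Py4J. A minimal Scala sketch, assuming streamingDf is an existing streaming DataFrame:

    import org.apache.spark.sql.DataFrame

    // Each micro-batch arrives as an ordinary DataFrame plus its batch id,
    // so any batch-style writer can be reused inside the function.
    val query = streamingDf.writeStream
      .foreachBatch { (batch: DataFrame, batchId: Long) =>
        batch.write.mode("append").parquet(s"/tmp/out/batch-$batchId")
      }
      .start()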
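
Example: the rate source. A minimal read using the options above; spark is an assumed SparkSession:

    // Produces timestamp and value columns at ~100 rows per second
    // after a 5 second ramp-up, spread across 4 partitions.
    val rates = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "100")
      .option("rampUpTime", "5s")
      .option("numPartitions", "4")
      .load()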
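
Example: the socket source. For tutorials and debugging only, as noted above; pair it with something like "nc -lk 9999" on the command line:

    // Each line received on localhost:9999 becomes a row with a single
    // value column. No fault recovery; not for production use.
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()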

Value Members

  1. object ContinuousMemoryStream
  2. object ContinuousMemoryStreamReaderFactory extends ContinuousPartitionReaderFactory
  3. object ForeachWriterCommitMessage extends WriterCommitMessage with Product with Serializable

    An empty WriterCommitMessage. ForeachWriter implementations have no global coordination.

  4. object ForeachWriterTable extends Serializable
  5. object PackedRowWriterFactory extends StreamingDataWriterFactory with Product with Serializable

    A simple org.apache.spark.sql.connector.write.DataWriterFactory whose tasks just pack rows into the commit message for delivery to an org.apache.spark.sql.connector.write.BatchWrite on the driver.

    Note that, because it sends all rows to the driver, this factory is generally unsuitable for production-quality sinks. It's intended for use in tests. (See the sketch after this list.)

  6. object PythonForeachBatchHelper
  7. object RateStreamMicroBatchReaderFactory extends PartitionReaderFactory
  8. object RateStreamProvider
  9. object TextSocketReader
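
Example: the packed-row pattern. A hedged sketch of the writer side, with invented names, against the DataSource V2 writer API; each task buffers its rows and ships them all back in its commit message:

    import scala.collection.mutable.ArrayBuffer
    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.connector.write.{DataWriter, WriterCommitMessage}

    case class PackedRows(rows: Array[InternalRow]) extends WriterCommitMessage

    class PackedRowWriter extends DataWriter[InternalRow] {
      private val buffer = ArrayBuffer.empty[InternalRow]
      // Copy each row: Spark may reuse InternalRow instances between calls.
      override def write(row: InternalRow): Unit = buffer += row.copy()
      // Everything this task wrote rides back to the driver in one message.
      override def commit(): WriterCommitMessage = PackedRows(buffer.toArray)
      override def abort(): Unit = buffer.clear()
      override def close(): Unit = ()
    }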
