package sources
Type Members
- class ConsoleWrite extends StreamingWrite with Logging
Common methods used to create writes for the console sink
- class ContinuousMemoryStream[A] extends MemoryStreamBase[A] with ContinuousStream
The overall strategy here is:
* ContinuousMemoryStream maintains a list of records for each partition. addData() will distribute records evenly-ish across partitions.
* RecordEndpoint is set up as an endpoint for executor-side ContinuousMemoryStreamPartitionReader instances to poll. It returns the record at the specified offset within the list, or null if that offset doesn't yet have a record.
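The strategy above can be illustrated with a small standalone model. This is a sketch only, not Spark's implementation: the names `PartitionedBuffer` and `getRecord` are hypothetical, round-robin is just one assumed way to spread records "evenly-ish", there is no RPC endpoint, and `Option` stands in for the null return described above.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the per-partition record lists described above.
// Not Spark code: no RPC endpoint, no offset objects, no executor-side readers.
final class PartitionedBuffer[A](numPartitions: Int) {
  require(numPartitions > 0, "numPartitions must be positive")

  // One growable list of records per partition.
  private val records: Vector[ArrayBuffer[A]] =
    Vector.fill(numPartitions)(ArrayBuffer.empty[A])

  // Distribute records round-robin (an assumed distribution policy).
  def addData(data: Seq[A]): Unit = synchronized {
    data.zipWithIndex.foreach { case (item, i) =>
      records(i % numPartitions) += item
    }
  }

  // Return the record at `offset` within `partition`, or None if that
  // offset does not have a record yet.
  def getRecord(partition: Int, offset: Int): Option[A] = synchronized {
    val buf = records(partition)
    if (offset >= 0 && offset < buf.length) Some(buf(offset)) else None
  }
}

object PartitionedBufferDemo {
  def main(args: Array[String]): Unit = {
    val stream = new PartitionedBuffer[Int](2)
    stream.addData(Seq(10, 20, 30))
    println(stream.getRecord(0, 0)) // Some(10)
    println(stream.getRecord(0, 1)) // Some(30)
    println(stream.getRecord(1, 1)) // None: partition 1 holds one record so far
  }
}
```

A polling reader would call `getRecord` repeatedly with its current offset and advance only when a record is returned.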
- case class ContinuousMemoryStreamInputPartition(driverEndpointName: String, partition: Int, startOffset: Int) extends InputPartition with Product with Serializable
An input partition for continuous memory stream.
- case class ContinuousMemoryStreamOffset(partitionNums: Map[Int, Int]) extends connector.read.streaming.Offset with Product with Serializable
- class ContinuousMemoryStreamPartitionReader extends ContinuousPartitionReader[InternalRow]
An input partition reader for continuous memory stream. Polls the driver endpoint for new records.
- class ForeachBatchSink[T] extends Sink
- class ForeachDataWriter[T] extends DataWriter[InternalRow]
A DataWriter which writes data in this partition to a ForeachWriter.
- T
The type expected by the writer.
- case class ForeachWriterFactory[T](writer: ForeachWriter[T], rowConverter: (InternalRow) ⇒ T) extends StreamingDataWriterFactory with Product with Serializable
- case class ForeachWriterTable[T](writer: ForeachWriter[T], converter: Either[ExpressionEncoder[T], (InternalRow) ⇒ T]) extends Table with SupportsWrite with Product with Serializable
A write-only table for forwarding data into the specified ForeachWriter.
- T
The expected type of the sink.
- writer
The ForeachWriter to process all data.
- converter
An object to convert internal rows to the target type T. It can be either an ExpressionEncoder or a direct converter function.
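As a rough illustration of how an `Either`-typed converter can be collapsed into a single row function, here is a standalone sketch. `RawRow` and `RowDecoder` are hypothetical stand-ins for InternalRow and ExpressionEncoder so that the code runs without Spark; it does not show how ForeachWriterTable itself resolves the converter.

```scala
object ConverterSketch {
  // Hypothetical stand-ins, not Spark types.
  type RawRow = Seq[Any]                             // plays the role of InternalRow
  trait RowDecoder[T] { def decode(row: RawRow): T } // plays the role of ExpressionEncoder[T]

  // Collapse either form of converter into one function from row to T.
  def toRowFunction[T](converter: Either[RowDecoder[T], RawRow => T]): RawRow => T =
    converter match {
      case Left(decoder) => (row: RawRow) => decoder.decode(row)
      case Right(fn)     => fn
    }

  def main(args: Array[String]): Unit = {
    val viaDecoder = toRowFunction[String](Left(new RowDecoder[String] {
      def decode(row: RawRow): String = row.mkString(",")
    }))
    val viaFunction = toRowFunction[Int](Right((row: RawRow) => row.length))

    println(viaDecoder(Seq(1, "a")))  // 1,a
    println(viaFunction(Seq(1, "a"))) // 2
  }
}
```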
- class MemoryDataWriter extends DataWriter[InternalRow] with Logging
- case class MemoryPlan(sink: MemorySink, output: Seq[Attribute]) extends LeafNode with Product with Serializable
Used to query the data that has been written into a MemorySink.
- class MemorySink extends Table with SupportsWrite with Logging
A sink that stores the results in memory. This org.apache.spark.sql.execution.streaming.Sink is primarily intended for use in unit tests and does not provide durability.
- class MemoryStreamingWrite extends StreamingWrite
- case class MemoryWriterCommitMessage(partition: Int, data: Seq[Row]) extends WriterCommitMessage with Product with Serializable
- case class MemoryWriterFactory(schema: StructType) extends DataWriterFactory with StreamingDataWriterFactory with Product with Serializable
- class MicroBatchWrite extends BatchWrite
A BatchWrite used to hook V2 stream writers into a microbatch plan. It implements the non-streaming interface, forwarding the epoch ID determined at construction to a wrapped streaming write support.
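The forwarding idea described above is an adapter that binds the epoch ID once and passes it along on every call. A minimal sketch follows, assuming simplified interfaces: `SimpleBatchWrite`, `SimpleStreamingWrite`, `EpochBoundWrite`, and `RecordingWrite` are hypothetical names, and commit messages are reduced to strings.

```scala
import scala.collection.mutable.ArrayBuffer

object MicroBatchAdapterSketch {
  // Hypothetical, simplified stand-ins for the batch and streaming write interfaces.
  trait SimpleBatchWrite {
    def commit(messages: Seq[String]): Unit
    def abort(messages: Seq[String]): Unit
  }
  trait SimpleStreamingWrite {
    def commit(epochId: Long, messages: Seq[String]): Unit
    def abort(epochId: Long, messages: Seq[String]): Unit
  }

  // The adapter fixes the epoch ID at construction and forwards every call
  // to the wrapped streaming write.
  final class EpochBoundWrite(epochId: Long, underlying: SimpleStreamingWrite)
      extends SimpleBatchWrite {
    override def commit(messages: Seq[String]): Unit = underlying.commit(epochId, messages)
    override def abort(messages: Seq[String]): Unit = underlying.abort(epochId, messages)
  }

  // Records each call so the forwarding is observable.
  final class RecordingWrite extends SimpleStreamingWrite {
    val log: ArrayBuffer[String] = ArrayBuffer.empty[String]
    def commit(epochId: Long, messages: Seq[String]): Unit = {
      val joined = messages.mkString("|")
      log += s"commit epoch=$epochId messages=$joined"
    }
    def abort(epochId: Long, messages: Seq[String]): Unit = {
      val joined = messages.mkString("|")
      log += s"abort epoch=$epochId messages=$joined"
    }
  }

  def main(args: Array[String]): Unit = {
    val sink = new RecordingWrite
    new EpochBoundWrite(epochId = 7L, underlying = sink).commit(Seq("p0", "p1"))
    println(sink.log.mkString("\n")) // commit epoch=7 messages=p0|p1
  }
}
```

Binding the epoch at construction means a new adapter instance is needed per micro-batch, which keeps the batch-side interface free of any epoch parameter.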
- class MicroBatchWriterFactory extends DataWriterFactory
- case class PackedRowCommitMessage(rows: Array[InternalRow]) extends WriterCommitMessage with Product with Serializable
Commit message for a PackedRowDataWriter, containing all the rows written in the most recent interval.
- class PackedRowDataWriter extends DataWriter[InternalRow] with Logging
A simple DataWriter that just sends all the rows it's received as a commit message.
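The packing behaviour can be sketched without Spark as a writer that buffers rows and hands them all back on commit. `PackingWriter` and `PackedMessage` are hypothetical names, and clearing the buffer after each commit is an assumption drawn from the phrase "most recent interval", not a statement about the actual class.

```scala
import scala.collection.mutable.ArrayBuffer

object PackedWriterSketch {
  // Hypothetical commit message carrying every row written since the last commit.
  final case class PackedMessage[A](rows: Vector[A])

  final class PackingWriter[A] {
    private val buffer = ArrayBuffer.empty[A]

    // Buffer the row locally; nothing is sent until commit.
    def write(row: A): Unit = buffer += row

    // Pack all buffered rows into the message, then start a fresh interval
    // (assumed behaviour).
    def commit(): PackedMessage[A] = {
      val message = PackedMessage(buffer.toVector)
      buffer.clear()
      message
    }

    // Drop anything written since the last commit.
    def abort(): Unit = buffer.clear()
  }

  def main(args: Array[String]): Unit = {
    val writer = new PackingWriter[Int]
    writer.write(1)
    writer.write(2)
    println(writer.commit()) // PackedMessage(Vector(1, 2))
    writer.write(3)
    println(writer.commit()) // PackedMessage(Vector(3))
  }
}
```

Because every row travels inside the commit message, memory and network cost grow with the data volume, which is why this pattern suits tests rather than production sinks.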
- trait PythonForeachBatchFunction extends AnyRef
Interface that is meant to be extended by Python classes via Py4J. Py4J allows Python classes to implement Java interfaces so that the JVM can call back Python objects. In this case, this allows the user-defined Python foreachBatch function to be called from the JVM when the query is active.
- case class RateStreamMicroBatchInputPartition(partitionId: Int, numPartitions: Int, rangeStart: Long, rangeEnd: Long, localStartTimeMs: Long, relativeMsPerValue: Double) extends InputPartition with Product with Serializable
- class RateStreamMicroBatchPartitionReader extends PartitionReader[InternalRow]
- class RateStreamMicroBatchStream extends MicroBatchStream with Logging
- class RateStreamProvider extends SimpleTableProvider with DataSourceRegister
A source that generates incrementing long values with timestamps. Each generated row has two columns: a timestamp column for the generated time and an auto-incrementing long column starting at 0L.
This source supports the following options:
- rowsPerSecond (e.g. 100, default: 1): How many rows should be generated per second.
- rampUpTime (e.g. 5s, default: 0s): How long to ramp up before the generating speed becomes rowsPerSecond. Granularities finer than seconds are truncated to integer seconds.
- numPartitions (e.g. 10, default: Spark's default parallelism): The number of partitions for the generated rows. The source will try its best to reach rowsPerSecond, but the query may be resource constrained, and numPartitions can be tweaked to help reach the desired speed.
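A minimal usage sketch for these options, assuming Spark SQL is on the classpath and the source is selected with the short name `rate`. The local master, the app name, the console sink, and the ten-second run time are arbitrary choices made for illustration.

```scala
import org.apache.spark.sql.SparkSession

object RateSourceExample {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("rate-source-example")
      .getOrCreate()

    // The three options described above, using the example values from the text.
    val rates = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "100")
      .option("rampUpTime", "5s")
      .option("numPartitions", "10")
      .load()

    // Shows the two generated columns: a timestamp and a long value.
    rates.printSchema()

    // Print generated rows to the console for a short while, then shut down.
    val query = rates.writeStream.format("console").start()
    query.awaitTermination(10000L)
    query.stop()
    spark.stop()
  }
}
```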
- class RateStreamTable extends Table with SupportsRead
- case class TextSocketInputPartition(slice: ListBuffer[(UTF8String, Long)]) extends InputPartition with Product with Serializable
- class TextSocketMicroBatchStream extends MicroBatchStream with Logging
A MicroBatchReadSupport that reads text lines through a TCP socket, designed only for tutorials and debugging. This MicroBatchReadSupport will *not* work in production applications due to multiple reasons, including no support for fault recovery.
- class TextSocketSourceProvider extends SimpleTableProvider with DataSourceRegister with Logging
- class TextSocketTable extends Table with SupportsRead
- case class WriteToMicroBatchDataSource(write: StreamingWrite, query: LogicalPlan) extends LogicalPlan with Product with Serializable
The logical plan for writing data to a micro-batch stream.
Note that this logical plan does not have a corresponding physical plan, as it will be converted to WriteToDataSourceV2 with MicroBatchWrite before execution.
Value Members
- object ContinuousMemoryStream
- object ContinuousMemoryStreamReaderFactory extends ContinuousPartitionReaderFactory
- object ForeachWriterCommitMessage extends WriterCommitMessage with Product with Serializable
An empty WriterCommitMessage. ForeachWriter implementations have no global coordination.
- object ForeachWriterTable extends Serializable
- object PackedRowWriterFactory extends StreamingDataWriterFactory with Product with Serializable
A simple org.apache.spark.sql.connector.write.DataWriterFactory whose tasks just pack rows into the commit message for delivery to an org.apache.spark.sql.connector.write.BatchWrite on the driver.
Note that, because it sends all rows to the driver, this factory will generally be unsuitable for production-quality sinks. It's intended for use in tests.
- object PythonForeachBatchHelper
- object RateStreamMicroBatchReaderFactory extends PartitionReaderFactory
- object RateStreamProvider
- object TextSocketReader