
trait Source extends SparkDataStream

A source of continually arriving data for a streaming query. A Source must have a monotonically increasing notion of progress that can be represented as an Offset. Spark will regularly query each Source to see if any more data is available.

Note that Source extends SparkDataStream to make the v1 streaming source API compatible with data source v2.
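
To make this contract concrete, here is a minimal sketch of the polling loop that a driver such as StreamExecution conceptually runs against a Source. The MicroBatchDriverSketch object and runOneBatch helper are illustrative assumptions, not Spark API:

  import org.apache.spark.sql.DataFrame
  import org.apache.spark.sql.execution.streaming.{Offset, Source}

  object MicroBatchDriverSketch {
    // Illustrative driver step: poll the source, read one batch, commit it.
    // The real StreamExecution adds offset logging, recovery, and scheduling.
    def runOneBatch(source: Source, lastCommitted: Option[Offset]): Option[Offset] = {
      source.getOffset match {
        case Some(available) if !lastCommitted.contains(available) =>
          // Everything in the range (lastCommitted, available] is new data.
          val batch: DataFrame = source.getBatch(lastCommitted, available)
          // ... run the streaming query's logic over `batch` ...
          source.commit(available) // the source may now discard this data
          Some(available)
        case _ =>
          lastCommitted // nothing new; poll again later
      }
    }
  }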

Linear Supertypes
SparkDataStream, AnyRef, Any

Abstract Value Members

  1. abstract def getBatch(start: Option[Offset], end: Offset): DataFrame

    Returns the data that is between the offsets (start, end]. When start is None, the batch should begin with the first record. This method must always return the same data for a particular start and end pair, even after the Source has been restarted on a different node. (A minimal implementation sketch follows this list.)

    Higher layers will always call this method with a value of start greater than or equal to the last value passed to commit, and a value of end less than or equal to the last value returned by getOffset.

    It is possible for the Offset type to be a SerializedOffset when it was obtained from the log. Moreover, StreamExecution only compares the Offset JSON representations to determine whether two objects are equal. This can have ramifications when upgrading Offset JSON formats: two equivalent Offset objects could differ between versions, and StreamExecution may call this method with two such equivalent Offsets. In that case, the Source should return an empty DataFrame.

  2. abstract def getOffset: Option[Offset]

    Returns the maximum available offset for this source, or None if this source has never received any data.

  3. abstract def schema: StructType

    Returns the schema of the data from this source.

  4. abstract def stop(): Unit

    Stops this source and frees any resources it has allocated.

    Definition Classes
    SparkDataStream
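
As a hedged illustration of how these abstract members fit together, the following sketches a minimal in-memory source. TickSource, its ticks buffer, and addTick are invented for the example; LongOffset and SerializedOffset are the helper classes from org.apache.spark.sql.execution.streaming, and the DataFrame construction is simplified relative to what a production source must return to the engine:

  import org.apache.spark.sql.{DataFrame, SQLContext}
  import org.apache.spark.sql.execution.streaming.{LongOffset, Offset, SerializedOffset, Source}
  import org.apache.spark.sql.types.{LongType, StructField, StructType}

  // Illustrative in-memory source: each call to addTick appends one row.
  class TickSource(sqlContext: SQLContext) extends Source {
    private var ticks = Vector.empty[Long] // all rows seen so far

    def addTick(value: Long): Unit = synchronized { ticks :+= value }

    override def schema: StructType =
      StructType(StructField("value", LongType) :: Nil)

    // None until data arrives, then the index of the newest row.
    override def getOffset: Option[Offset] = synchronized {
      if (ticks.isEmpty) None else Some(LongOffset(ticks.size - 1L))
    }

    override def getBatch(start: Option[Offset], end: Offset): DataFrame = synchronized {
      // Offsets recovered from the log arrive as SerializedOffset holding
      // the raw JSON string, so decode both representations.
      def toIndex(o: Offset): Long = o match {
        case l: LongOffset       => l.offset
        case s: SerializedOffset => s.json.toLong
      }
      val from = start.map(toIndex(_) + 1).getOrElse(0L) // (start, end]
      val to = toIndex(end)
      import sqlContext.implicits._
      sqlContext.sparkContext
        .parallelize(ticks.slice(from.toInt, to.toInt + 1))
        .toDF("value")
    }

    override def stop(): Unit = () // no external resources to release
  }

Note that when start and end decode to the same index, the slice is empty and getBatch returns an empty DataFrame, matching the equal-offsets requirement described above.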

Concrete Value Members

  1. def commit(end: connector.read.streaming.Offset): Unit
    Definition Classes
    Source → SparkDataStream
  2. def commit(end: Offset): Unit

    Informs the source that Spark has completed processing all data for offsets less than or equal to end and will only request offsets greater than end in the future.

  3. def deserializeOffset(json: String): connector.read.streaming.Offset
    Definition Classes
    Source → SparkDataStream
  4. def initialOffset(): connector.read.streaming.Offset
    Definition Classes
    Source → SparkDataStream
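
These members bridge the v1 Source to the data source v2 SparkDataStream interface. Since StreamExecution keys equality and recovery on an offset's JSON string, a custom Offset should emit deterministic JSON. A hedged sketch, where FilePositionOffset is invented for illustration:

  import org.apache.spark.sql.execution.streaming.Offset

  // Illustrative custom offset: progress is a position within a file.
  case class FilePositionOffset(pos: Long) extends Offset {
    // StreamExecution compares offsets by this string, so it must be
    // deterministic and stable across versions of the source.
    override def json: String = s"""{"pos":$pos}"""
  }

In recent Spark versions the trait's default deserializeOffset does not reconstruct the concrete offset class; JSON read back from the offset log is handed to getBatch wrapped in a SerializedOffset, which is why getBatch must be prepared to decode that form, as described above.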