abstract class StreamExecution extends StreamingQuery with Logging
Manages the execution of a streaming Spark SQL query that is occurring in a separate thread. Unlike a standard query, a streaming query executes repeatedly each time new data arrives at any Source present in the query plan. Whenever new data arrives, a QueryExecution is created and the results are committed transactionally to the given Sink.
Linear Supertypes
- StreamExecution
- Logging
- StreamingQuery
- StreamingQuery
- AnyRef
- Any
Instance Constructors
- new StreamExecution(sparkSession: classic.SparkSession, name: String, resolvedCheckpointRoot: String, analyzedPlan: LogicalPlan, sink: Table, trigger: Trigger, triggerClock: Clock, outputMode: OutputMode, deleteCheckpointOnStop: Boolean)
- deleteCheckpointOnStop
whether to delete the checkpoint if the query is stopped without errors. Checkpoint deletion can be forced with the appropriate Spark configuration.
Type Members
- implicit class LogStringContext extends AnyRef
- Definition Classes
- Logging
Abstract Value Members
- abstract def getLatestExecutionContext(): StreamExecutionContext
Get the latest execution context.
- abstract def logicalPlan: LogicalPlan
The base logical plan which will be used across batch runs. Once the value is set, it should not be modified.
- abstract def runActivatedStream(sparkSessionForStream: classic.SparkSession): Unit
Run the activated stream until stopped.
- Attributes
- protected
- abstract def sources: Seq[SparkDataStream]
The list of stream instances which will be used across batch runs. Once the value is set, it should not be modified.
- Attributes
- protected
- abstract def stop(): Unit
- Definition Classes
- StreamingQuery
- Annotations
- @throws(classOf[java.util.concurrent.TimeoutException])
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- val analyzedPlan: LogicalPlan
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def availableOffsets: StreamProgress
Get the end offsets (formerly known as the "available" offsets) of the latest batch that has been planned.
- def awaitInitialization(timeoutMs: Long): Unit
Await until all fields of the query have been initialized.
- val awaitProgressLock: ReentrantLock
A lock used to wait/notify when batches complete. Use a fair lock to avoid thread starvation.
- Attributes
- protected
- val awaitProgressLockCondition: Condition
- Attributes
- protected
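The wait/notify pattern behind `awaitProgressLock` and `awaitProgressLockCondition` can be sketched with the standard `java.util.concurrent` primitives. This is a hypothetical, simplified illustration (the object and method names are invented), not StreamExecution's actual code: a fair `ReentrantLock` paired with a `Condition` lets a waiting thread block until the scheduler thread signals that a batch has completed.

```scala
import java.util.concurrent.locks.ReentrantLock

object AwaitProgressDemo {
  // Fair lock, as the doc notes, to avoid thread starvation among waiters.
  private val awaitProgressLock = new ReentrantLock(true)
  private val awaitProgressLockCondition = awaitProgressLock.newCondition()
  private var batchDone = false // guarded by awaitProgressLock

  // Called by the "scheduler" thread when a batch completes.
  def signalBatchComplete(): Unit = {
    awaitProgressLock.lock()
    try {
      batchDone = true
      awaitProgressLockCondition.signalAll() // wake every waiting thread
    } finally awaitProgressLock.unlock()
  }

  // Called by a thread that wants to block until a batch completes.
  def awaitBatch(): Unit = {
    awaitProgressLock.lock()
    try {
      while (!batchDone) awaitProgressLockCondition.await()
    } finally awaitProgressLock.unlock()
  }

  def main(args: Array[String]): Unit = {
    val waiter = new Thread(() => awaitBatch())
    waiter.start()
    signalBatchComplete()
    waiter.join(5000)
    assert(!waiter.isAlive)
    println("waiter released")
  }
}
```

The `while` loop around `await()` guards against spurious wakeups, which is the standard idiom for any `Condition`-based wait.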
- def awaitTermination(timeoutMs: Long): Boolean
- Definition Classes
- StreamExecution → StreamingQuery
- def awaitTermination(): Unit
- Definition Classes
- StreamExecution → StreamingQuery
- def checkpointFile(name: String): String
Returns the path of a file with the given name in the checkpoint directory.
- Attributes
- protected
- val checkpointMetadata: StreamingQueryCheckpointMetadata
Manages the metadata from this checkpoint location.
- Attributes
- protected
- def cleanup(): Unit
Any cleanup that needs to happen when the query is stopped or exits.
- Attributes
- protected
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- lazy val commitLog: CommitLog
- var committedOffsets: StreamProgress
Tracks how much data we have processed and committed to the sink or state store from each input source. Only the scheduler thread should modify this field, and only in atomic steps. Other threads should make a shallow copy if they are going to access this field more than once, since the field's value may change at any time.
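The "shallow copy before repeated reads" advice above can be illustrated with a toy sketch (the names and the `Map[String, Long]` offset representation are invented for this example, not Spark's actual types). When the scheduler thread swaps an immutable value into a `@volatile` field atomically, a reader that captures the reference once sees a consistent batch, while reading the field twice may straddle a commit:

```scala
object OffsetsSnapshotDemo {
  // Stand-in for committedOffsets: the scheduler swaps in a new immutable map.
  @volatile private var committedOffsets: Map[String, Long] = Map("src" -> 0L)

  // Scheduler thread: replace the whole map in one atomic reference write.
  def commit(batch: Long): Unit =
    committedOffsets = committedOffsets.updated("src", batch)

  // Reader thread: read the field exactly once, then use only the snapshot.
  def snapshot(): Map[String, Long] = committedOffsets

  def main(args: Array[String]): Unit = {
    commit(1L)
    val view = snapshot()
    commit(2L) // the scheduler moves on; our snapshot is unaffected
    assert(view("src") == 1L)            // snapshot still shows batch 1
    assert(committedOffsets("src") == 2L) // a fresh read shows batch 2
    println("snapshot stayed consistent")
  }
}
```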
- def createWrite(table: SupportsWrite, options: Map[String, String], inputPlan: LogicalPlan): Write
- Attributes
- protected
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def exception: Option[StreamingQueryException]
Returns the StreamingQueryException if the query was terminated by an exception.
- Definition Classes
- StreamExecution → StreamingQuery
- def explain(): Unit
- Definition Classes
- StreamExecution → StreamingQuery
- def explain(extended: Boolean): Unit
- Definition Classes
- StreamExecution → StreamingQuery
- def explainInternal(extended: Boolean): String
Exposed for tests.
- def getBatchDescriptionString: String
- Attributes
- protected
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def getStartOffsetsOfLatestBatch: StreamProgress
Get the start offsets of the latest batch that has been planned
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- val id: UUID
- Definition Classes
- StreamExecution → StreamingQuery
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def interruptAndAwaitExecutionThreadTermination(): Unit
Interrupts the query execution thread and awaits its termination until it exceeds the timeout. The timeout can be set via "spark.sql.streaming.stopTimeout".
- Attributes
- protected
- Annotations
- @throws(classOf[java.util.concurrent.TimeoutException])
- Exceptions thrown
TimeoutException: if the thread cannot be stopped within the timeout
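The interrupt-then-await pattern described here can be sketched with plain threads. This is a hedged, minimal illustration (the object name, timeout value, and worker body are invented, not Spark's actual implementation): interrupt the thread, join with a timeout, and raise `TimeoutException` if it is still alive afterwards.

```scala
import java.util.concurrent.TimeoutException

object StopThreadDemo {
  // Interrupt t and wait up to timeoutMs for it to die; fail loudly otherwise.
  def interruptAndAwait(t: Thread, timeoutMs: Long): Unit = {
    t.interrupt()
    t.join(timeoutMs)
    if (t.isAlive) {
      throw new TimeoutException(s"thread not stopped within $timeoutMs ms")
    }
  }

  def main(args: Array[String]): Unit = {
    // A cooperative worker: sleeping threads exit promptly on interrupt.
    val worker = new Thread(() =>
      try Thread.sleep(60000)
      catch { case _: InterruptedException => () })
    worker.start()
    interruptAndAwait(worker, 5000)
    assert(!worker.isAlive)
    println("worker stopped")
  }
}
```

Note the pattern only works if the target thread responds to interruption; as the `queryExecutionThread` entry below explains, blocking calls that swallow interrupts (the KAFKA-1894 case) need extra care.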
- def isActive: Boolean
Whether the query is currently active or not.
- Definition Classes
- StreamExecution → StreamingQuery
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def lastExecution: IncrementalExecution
- def lastProgress: StreamingQueryProgress
- Definition Classes
- StreamExecution → StreamingQuery
- def latestOffsets: StreamProgress
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logBasedOnLevel(level: Level)(f: => MessageWithContext): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- var loggingThreadContext: Instance
- Attributes
- protected
- val minLogEntriesToMaintain: Int
- Attributes
- protected
- val name: String
- Definition Classes
- StreamExecution → StreamingQuery
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- var noNewData: Boolean
A flag to indicate that a batch has completed with no new data available.
- Attributes
- protected
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- lazy val offsetLog: OffsetSeqLog
- val outputMode: OutputMode
- val pollingDelayMs: Long
- Attributes
- protected
- def postEvent(event: Event): Unit
- Attributes
- protected
- val prettyIdString: String
Pretty identifier string for printing in logs. If name is set, the format is "queryName [id = xyz, runId = abc]"; otherwise "[id = xyz, runId = abc]".
- Attributes
- protected
- def processAllAvailable(): Unit
- Definition Classes
- StreamExecution → StreamingQuery
- val progressReporter: ProgressReporter
- Attributes
- protected
- def purge(threshold: Long): Unit
- Attributes
- protected
- def purgeStatefulMetadata(plan: SparkPlan): Unit
- Attributes
- protected
- val queryExecutionThread: QueryExecutionThread
The thread that runs the micro-batches of this stream. Note that this thread must be an org.apache.spark.util.UninterruptibleThread to work around KAFKA-1894: interrupting a running KafkaConsumer may cause an endless loop.
- def recentProgress: Array[StreamingQueryProgress]
- Definition Classes
- StreamExecution → StreamingQuery
- val resolvedCheckpointRoot: String
- val runId: UUID
- Definition Classes
- StreamExecution → StreamingQuery
- val sink: Table
- val sparkSession: classic.SparkSession
- Definition Classes
- StreamExecution → StreamingQuery → StreamingQuery
- val sparkSessionForStream: classic.SparkSession
Isolated Spark session to run the batches with.
- Attributes
- protected
- def start(): Unit
Starts the execution. This returns only after the thread has started and QueryStartedEvent has been posted to all the listeners.
- val state: AtomicReference[State]
Defines the internal state of execution.
- Attributes
- protected
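An `AtomicReference`-held lifecycle state can be sketched as follows. The `State` values here (INITIALIZING/ACTIVE/TERMINATED) and the transition methods are hypothetical stand-ins, not StreamExecution's actual state machine; the point is that `compareAndSet` makes each transition atomic, so two threads cannot both win the same transition.

```scala
import java.util.concurrent.atomic.AtomicReference

object StateDemo {
  // Invented lifecycle states for illustration only.
  sealed trait State
  case object INITIALIZING extends State
  case object ACTIVE extends State
  case object TERMINATED extends State

  private val state = new AtomicReference[State](INITIALIZING)

  // Succeeds only for the one caller that observes INITIALIZING.
  def activate(): Boolean = state.compareAndSet(INITIALIZING, ACTIVE)

  // Termination is unconditional: any state may move to TERMINATED.
  def terminate(): Unit = state.set(TERMINATED)

  def main(args: Array[String]): Unit = {
    assert(activate())  // INITIALIZING -> ACTIVE succeeds
    assert(!activate()) // a second attempt fails: already ACTIVE
    terminate()
    assert(state.get() == TERMINATED)
    println("lifecycle ok")
  }
}
```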
- def status: StreamingQueryStatus
- Definition Classes
- StreamExecution → StreamingQuery
- def stopSources(): Unit
Stops all streaming sources safely.
- Attributes
- protected
- var streamDeathCause: StreamingQueryException
- Attributes
- protected
- lazy val streamMetrics: MetricsReporter
Used to report metrics to Codahale (Dropwizard) Metrics. This uses id for easier tracking across restarts.
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- StreamExecution → AnyRef → Any
- val trigger: Trigger
- val triggerClock: Clock
- var uniqueSources: Map[SparkDataStream, ReadLimit]
A list of unique sources in the query plan. This will be set when generating the logical plan.
- Attributes
- protected
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- val watermarkMsMap: Map[Int, Long]
A map of current watermarks, keyed by the position of the watermark operator in the physical plan.
This state is 'soft state', which does not affect the correctness and semantics of watermarks and is not persisted across query restarts. The fault-tolerant watermark state is in offsetSeqMetadata.
- Attributes
- protected
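When a plan contains several watermark operators, their per-operator values from a map like this must be combined into one query-wide watermark; Spark's default multiple-watermark policy takes the minimum (a "max" policy is configurable via spark.sql.streaming.multipleWatermarkPolicy). A toy sketch of the min policy, using invented names:

```scala
object WatermarkDemo {
  // Combine per-operator watermarks (operator position -> watermark ms)
  // into a single global watermark using the default "min" policy.
  def globalWatermark(watermarkMsMap: Map[Int, Long]): Option[Long] =
    if (watermarkMsMap.isEmpty) None else Some(watermarkMsMap.values.min)

  def main(args: Array[String]): Unit = {
    val byOperator = Map(0 -> 5000L, 1 -> 3000L)
    assert(globalWatermark(byOperator).contains(3000L)) // slowest operator wins
    assert(globalWatermark(Map.empty).isEmpty)          // no watermark operators
    println("min-policy watermark computed")
  }
}
```

Taking the minimum is the conservative choice: no row is dropped as late unless every watermark operator agrees it is late.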
- def withLogContext(context: Map[String, String])(body: => Unit): Unit
- Attributes
- protected
- Definition Classes
- Logging
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)