abstract class StreamExecution extends StreamingQuery with Logging

Manages the execution of a streaming Spark SQL query, which runs in a separate thread. Unlike a standard query, a streaming query executes repeatedly each time new data arrives at any Source present in the query plan. Whenever new data arrives, a QueryExecution is created and the results are committed transactionally to the given Sink.
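A StreamExecution is not constructed directly by user code; it is created when a streaming query is started through the public DataStreamWriter API (in recent Spark versions, as a subclass such as MicroBatchExecution). A minimal sketch, with an illustrative checkpoint path:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()

  // start() spins up the execution on a separate thread and returns a
  // StreamingQuery handle backed by a StreamExecution.
  val query = spark.readStream
    .format("rate")                                        // built-in test source
    .load()
    .writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/demo-checkpoint")  // illustrative path
    .start()

  query.awaitTermination()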

Linear Supertypes
Logging, StreamingQuery, StreamingQuery, AnyRef, Any

Instance Constructors

  1. new StreamExecution(sparkSession: classic.SparkSession, name: String, resolvedCheckpointRoot: String, analyzedPlan: LogicalPlan, sink: Table, trigger: Trigger, triggerClock: Clock, outputMode: OutputMode, deleteCheckpointOnStop: Boolean)

    deleteCheckpointOnStop

    whether to delete the checkpoint if the query is stopped without errors. Checkpoint deletion can be forced with the appropriate Spark configuration.
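    A sketch of the configuration alluded to above; the key name matches recent Spark releases but should be verified against your version (spark is an active SparkSession):

      // Force deletion of a temporary checkpoint location even when the query fails.
      spark.conf.set("spark.sql.streaming.forceDeleteTempCheckpointLocation", "true")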

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    Logging

Abstract Value Members

  1. abstract def getLatestExecutionContext(): StreamExecutionContext

    Gets the latest execution context.

  2. abstract def logicalPlan: LogicalPlan

    The base logical plan which will be used across batch runs. Once the value is set, it should not be modified.

  3. abstract def runActivatedStream(sparkSessionForStream: classic.SparkSession): Unit

    Run the activated stream until stopped.

    Attributes
    protected
  4. abstract def sources: Seq[SparkDataStream]

    The list of stream instances which will be used across batch runs. Once the value is set, it should not be modified.

    Attributes
    protected
  5. abstract def stop(): Unit
    Definition Classes
    StreamingQuery
    Annotations
    @throws(classOf[java.util.concurrent.TimeoutException])

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val analyzedPlan: LogicalPlan
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def availableOffsets: StreamProgress

    Gets the end offsets, formerly known as the "available" offsets, of the latest batch that has been planned.
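    These offsets surface through the public progress API; a usage sketch, where query is a running StreamingQuery:

      // lastProgress may be null before the first trigger completes
      Option(query.lastProgress).foreach { p =>
        p.sources.foreach { s =>
          // startOffset/endOffset are JSON strings describing per-source positions
          println(s"${s.description}: start=${s.startOffset} end=${s.endOffset}")
        }
      }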

  7. def awaitInitialization(timeoutMs: Long): Unit

    Await until all fields of the query have been initialized.

  8. val awaitProgressLock: ReentrantLock

    A lock used to wait/notify when batches complete. Use a fair lock to avoid thread starvation.

    Attributes
    protected
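    A sketch of how such a fair lock and its condition are constructed (the constructor argument is the point; field names mirror the ones above, construction details are an assumption):

      import java.util.concurrent.locks.ReentrantLock

      // true => fair mode: waiting threads acquire the lock in FIFO order,
      // so no waiter is starved while batches signal completion.
      val awaitProgressLock = new ReentrantLock(true)
      val awaitProgressLockCondition = awaitProgressLock.newCondition()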
  9. val awaitProgressLockCondition: Condition
    Attributes
    protected
  10. def awaitTermination(timeoutMs: Long): Boolean
    Definition Classes
    StreamExecution → StreamingQuery
  11. def awaitTermination(): Unit
    Definition Classes
    StreamExecution → StreamingQuery
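    Usage sketch, where query is a running StreamingQuery: the no-argument form blocks until stop() is called or the query fails, while the timed form reports whether the query terminated in time:

      val terminated: Boolean = query.awaitTermination(5000L)  // wait up to 5s
      if (!terminated) println(s"query ${query.id} is still running")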
  12. def checkpointFile(name: String): String

    Returns the path of a file with name in the checkpoint directory.

    Attributes
    protected
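    A plausible implementation sketch (an assumption, not the actual source): resolve name against the checkpoint root with Hadoop's Path, so the scheme of resolvedCheckpointRoot is preserved:

      import org.apache.hadoop.fs.Path

      val resolvedCheckpointRoot = "hdfs:///tmp/checkpoint"  // illustrative value

      def checkpointFile(name: String): String =
        new Path(new Path(resolvedCheckpointRoot), name).toString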
  13. def cleanup(): Unit

    Any cleanup that needs to happen when the query is stopped or exits.

    Attributes
    protected
  14. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
  15. val commitLog: CommitLog

    A log that records the batch ids that have completed. This is used to check whether a batch was fully processed and its output committed to the sink, in which case it need not be processed again. It is used (for instance) during restart, to help identify which batch to run next.

  16. var committedOffsets: StreamProgress

    Tracks how much data we have processed and committed to the sink or state store from each input source. Only the scheduler thread should modify this field, and only in atomic steps. Other threads should make a shallow copy if they are going to access this field more than once, since the field's value may change at any time.

  17. def createWrite(table: SupportsWrite, options: Map[String, String], inputPlan: LogicalPlan): Write
    Attributes
    protected
  18. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  19. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  20. def exception: Option[StreamingQueryException]

    Returns the StreamingQueryException if the query was terminated by an exception.

    Definition Classes
    StreamExecution → StreamingQuery
  21. def explain(): Unit
    Definition Classes
    StreamExecution → StreamingQuery
  22. def explain(extended: Boolean): Unit
    Definition Classes
    StreamExecution → StreamingQuery
  23. def explainInternal(extended: Boolean): String

    Exposed for tests.

  24. def getBatchDescriptionString: String
    Attributes
    protected
  25. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  26. def getStartOffsetsOfLatestBatch: StreamProgress

    Gets the start offsets of the latest batch that has been planned.

  27. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  28. val id: UUID
    Definition Classes
    StreamExecution → StreamingQuery
  29. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  30. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def interruptAndAwaitExecutionThreadTermination(): Unit

    Interrupts the query execution thread and awaits its termination until it exceeds the timeout. The timeout can be set via "spark.sql.streaming.stopTimeout".

    Attributes
    protected
    Annotations
    @throws(classOf[java.util.concurrent.TimeoutException])
    Exceptions thrown

    TimeoutException If the thread cannot be stopped within the timeout
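    The timeout configuration can be raised before starting queries; a sketch (spark is an active SparkSession; the value follows Spark's usual duration-string format):

      spark.conf.set("spark.sql.streaming.stopTimeout", "60s")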

  32. def isActive: Boolean

    Whether the query is currently active or not.

    Definition Classes
    StreamExecution → StreamingQuery
  33. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  34. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  35. def lastExecution: IncrementalExecution
  36. def lastProgress: StreamingQueryProgress
    Definition Classes
    StreamExecution → StreamingQuery
  37. def latestOffsets: StreamProgress
  38. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  39. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  41. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  43. def logError(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  44. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  45. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  46. def logError(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  47. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  48. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  50. def logInfo(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  51. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  52. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  53. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  55. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  56. def logWarning(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  57. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  58. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  59. def logWarning(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  60. var loggingThreadContext: Instance
    Attributes
    protected
  61. val minLogEntriesToMaintain: Int
    Attributes
    protected
  62. val name: String
    Definition Classes
    StreamExecution → StreamingQuery
  63. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  64. var noNewData: Boolean

    A flag to indicate that a batch has completed with no new data available.

    Attributes
    protected
  65. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  66. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  67. val offsetLog: OffsetSeqLog

    A write-ahead log that records the offsets that are present in each batch. In order to ensure that a given batch will always consist of the same data, we write to this log *before* any processing is done. Thus, the Nth record in this log indicates data that is currently being processed and the (N-1)th entry indicates which offsets have been durably committed to the sink.
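    A self-contained toy model of this write-ahead protocol, with in-memory maps standing in for OffsetSeqLog and CommitLog (all names here are illustrative):

      import scala.collection.mutable

      val offsetLog = mutable.SortedMap[Long, String]()  // batchId -> planned end offsets
      val commitLog = mutable.SortedMap[Long, Unit]()    // batchId -> committed marker

      def runBatch(batchId: Long, endOffsets: String): Unit = {
        offsetLog(batchId) = endOffsets  // 1. record intent *before* processing
        // 2. ... process data up to endOffsets and write it to the sink ...
        commitLog(batchId) = ()          // 3. mark the batch as durably committed
      }

      // On restart, the earliest batch with a planned offset but no commit is
      // re-run; if every planned batch committed, plan the next batch id.
      def nextBatchToRun: Long =
        offsetLog.keys.find(id => !commitLog.contains(id))
          .getOrElse(offsetLog.keys.lastOption.map(_ + 1).getOrElse(0L))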

  68. val outputMode: OutputMode
  69. val pollingDelayMs: Long
    Attributes
    protected
  70. def postEvent(event: Event): Unit
    Attributes
    protected
  71. val prettyIdString: String

    Pretty identifier string for printing in logs. The format is "queryName [id = xyz, runId = abc]" if name is set, otherwise "[id = xyz, runId = abc]".

    Attributes
    protected
  72. def processAllAvailable(): Unit
    Definition Classes
    StreamExecution → StreamingQuery
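    Test-oriented usage sketch with the memory sink (df is an assumed streaming DataFrame):

      val q = df.writeStream.format("memory").queryName("t").start()
      q.processAllAvailable()  // blocks until all currently available input is processed
      spark.sql("SELECT count(*) FROM t").show()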
  73. val progressReporter: ProgressReporter
    Attributes
    protected
  74. def purge(threshold: Long): Unit
    Attributes
    protected
  75. def purgeStatefulMetadata(plan: SparkPlan): Unit
    Attributes
    protected
  76. val queryExecutionThread: QueryExecutionThread

    The thread that runs the micro-batches of this stream. Note that this thread must be an org.apache.spark.util.UninterruptibleThread to work around KAFKA-1894: interrupting a running KafkaConsumer may cause an endless loop.

  77. def recentProgress: Array[StreamingQueryProgress]
    Definition Classes
    StreamExecution → StreamingQuery
  78. val resolvedCheckpointRoot: String
  79. val runId: UUID
    Definition Classes
    StreamExecution → StreamingQuery
  80. val sink: Table
  81. val sparkSession: classic.SparkSession

    Definition Classes
    StreamExecution → StreamingQuery → StreamingQuery
  82. val sparkSessionForStream: classic.SparkSession

    Isolated spark session to run the batches with.

    Attributes
    protected
  83. def start(): Unit

    Starts the execution. This returns only after the thread has started and QueryStartedEvent has been posted to all the listeners.
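    Because QueryStartedEvent is posted before start() returns, a listener registered beforehand observes it; a sketch using the public listener API:

      import org.apache.spark.sql.streaming.StreamingQueryListener
      import org.apache.spark.sql.streaming.StreamingQueryListener._

      spark.streams.addListener(new StreamingQueryListener {
        override def onQueryStarted(e: QueryStartedEvent): Unit =
          println(s"started: id=${e.id} runId=${e.runId}")
        override def onQueryProgress(e: QueryProgressEvent): Unit = ()
        override def onQueryTerminated(e: QueryTerminatedEvent): Unit = ()
      })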

  84. val state: AtomicReference[State]

    Defines the internal state of execution.

    Attributes
    protected
  85. def status: StreamingQueryStatus
    Definition Classes
    StreamExecution → StreamingQuery
  86. def stopSources(): Unit

    Stops all streaming sources safely.

    Attributes
    protected
  87. var streamDeathCause: StreamingQueryException
    Attributes
    protected
  88. val streamMetadata: StreamMetadata

    Metadata associated with the whole query.

    Attributes
    protected
  89. lazy val streamMetrics: MetricsReporter

    Used to report metrics to coda-hale. This uses id for easier tracking across restarts.
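    These metrics are reported only when streaming metrics are enabled; a sketch of the flag (name per recent Spark releases; spark is an active SparkSession):

      spark.conf.set("spark.sql.streaming.metricsEnabled", "true")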

  90. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  91. def toString(): String
    Definition Classes
    StreamExecution → AnyRef → Any
  92. val trigger: Trigger
  93. val triggerClock: Clock
  94. var uniqueSources: Map[SparkDataStream, ReadLimit]

    A list of unique sources in the query plan. This will be set when generating the logical plan.

    Attributes
    protected
  95. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  96. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  97. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  98. val watermarkMsMap: Map[Int, Long]

    A map of current watermarks, keyed by the position of the watermark operator in the physical plan.

    This state is 'soft state', which does not affect the correctness and semantics of watermarks and is not persisted across query restarts. The fault-tolerant watermark state is in offsetSeqMetadata.

    Attributes
    protected
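    User code introduces these watermark operators via withWatermark; a sketch where events is an assumed streaming DataFrame with an eventTime column:

      import org.apache.spark.sql.functions.{col, window}

      val counts = events
        .withWatermark("eventTime", "10 minutes")        // tolerate 10 min of lateness
        .groupBy(window(col("eventTime"), "5 minutes"))  // 5-minute tumbling windows
        .count()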
  99. def withLogContext(context: Map[String, String])(body: => Unit): Unit
    Attributes
    protected
    Definition Classes
    Logging

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
