org.apache.spark.sql.execution.streaming

MetadataLogFileIndex

class MetadataLogFileIndex extends PartitioningAwareFileIndex

A FileIndex that generates the list of files to process by reading them from the metadata log files generated by the FileStreamSink.
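
As a rough illustration of the idea above, the following plain-Scala sketch models a metadata log as a sequence of committed batches and derives the file list from it. It is a simplified model and not Spark's implementation: `MetadataLogSketch`, `LogBatch`, and `visibleFiles` are hypothetical names, and details of the real log such as compaction are left out.

```scala
// Simplified model only; every name in this object is hypothetical.
object MetadataLogSketch {
  // One committed batch in the sink's metadata log and the files it recorded.
  final case class LogBatch(batchId: Long, files: Seq[String])

  // The index lists only files recorded in the log, in batch order.
  def visibleFiles(log: Seq[LogBatch]): Seq[String] =
    log.sortBy(_.batchId).flatMap(_.files).distinct

  def main(args: Array[String]): Unit = {
    val log = Seq(
      LogBatch(1L, Seq("part-0001.parquet")),
      LogBatch(0L, Seq("part-0000.parquet"))
    )
    // Assume part-0002 was written to the directory but never recorded in the log.
    val onDisk = Set("part-0000.parquet", "part-0001.parquet", "part-0002.parquet")
    val visible = visibleFiles(log)
    println("visible: " + visible.mkString(", "))
    println("ignored: " + onDisk.diff(visible.toSet).mkString(", "))
  }
}
```

In this model the file list comes from the log rather than from a directory listing, so a file that exists in the output directory but was never recorded in the log is not returned.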

Linear Supertypes
PartitioningAwareFileIndex, Logging, FileIndex, AnyRef, Any

Instance Constructors

  1. new MetadataLogFileIndex(sparkSession: SparkSession, path: Path, parameters: Map[String, String], userSpecifiedSchema: Option[StructType])

    userSpecifiedSchema

    an optional user-specified schema that will be used to provide types for the discovered partitions

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    Logging

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def allFiles(): Seq[FileStatus]
    Definition Classes
    PartitioningAwareFileIndex
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  9. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  10. val hadoopConf: Configuration
    Attributes
    protected
    Definition Classes
    PartitioningAwareFileIndex
  11. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  12. def inferPartitioning(): PartitionSpec
    Attributes
    protected
    Definition Classes
    PartitioningAwareFileIndex
  13. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  14. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  15. def inputFiles: Array[String]

    Returns the list of files that will be read when scanning this relation.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  16. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  17. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  18. val leafDirToChildrenFiles: Map[Path, Array[FileStatus]]
    Attributes
    protected
    Definition Classes
    MetadataLogFileIndex → PartitioningAwareFileIndex
  19. val leafFiles: LinkedHashMap[Path, FileStatus]
    Attributes
    protected
    Definition Classes
    MetadataLogFileIndex → PartitioningAwareFileIndex
  20. def listFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[PartitionDirectory]

    Returns all valid files grouped into partitions when the data is partitioned. If the data is unpartitioned, this will return a single partition with no partition values.

    partitionFilters

    The filters used to prune which partitions are returned. These filters must only refer to partition columns and this method will only return files where these predicates are guaranteed to evaluate to true. Thus, these filters will not need to be evaluated again on the returned data.

    dataFilters

    Filters that can be applied on non-partitioned columns. The implementation does not need to guarantee these filters are applied, i.e. the execution engine will ensure these filters are still applied on the returned files.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  21. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  22. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  23. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  24. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  25. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  26. def logError(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  27. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  29. def logError(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  30. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  33. def logInfo(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  34. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  35. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  36. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  37. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  38. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  39. def logWarning(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  41. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. def logWarning(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  43. def matchPathPattern(file: FileStatus): Boolean
    Attributes
    protected
    Definition Classes
    PartitioningAwareFileIndex
  44. def metadataOpsTimeNs: Option[Long]

    Returns an optional metadata operation time, in nanoseconds, for listing files.

    File listing happens during query optimization (to obtain accurate statistics), but its cost should be reported as a metric of physical execution. Some implementations therefore record the file listing time, and physical execution calls this method to update the metrics.

    Definition Classes
    FileIndex
  45. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  46. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  47. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  48. def partitionSchema: StructType

    Schema of the partitioning columns, or the empty schema if the table is not partitioned.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  49. def partitionSpec(): PartitionSpec

    Returns the specification of the partitions inferred from the data.

    Definition Classes
    MetadataLogFileIndex → PartitioningAwareFileIndex
  50. lazy val recursiveFileLookup: Boolean
    Attributes
    protected
    Definition Classes
    PartitioningAwareFileIndex
  51. def refresh(): Unit

    Refreshes any cached file listings.

    Definition Classes
    MetadataLogFileIndex → FileIndex
  52. def rootPaths: Seq[Path]

    Returns the list of root input paths from which the catalog will get files. There may be a single root path from which partitions are discovered, or individual partitions may be specified by each path.

    Definition Classes
    MetadataLogFileIndex → FileIndex
  53. def sizeInBytes: Long

    Sum of table file sizes, in bytes

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  54. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  55. def toString(): String
    Definition Classes
    FileIndex → AnyRef → Any
  56. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  57. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  58. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  59. def withLogContext(context: HashMap[String, String])(body: => Unit): Unit
    Attributes
    protected
    Definition Classes
    Logging
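
The `listFiles` contract documented above can be modeled in plain Scala: partition filters prune whole partitions, and unpartitioned data comes back as a single partition with no partition values. This is a minimal sketch under stated assumptions, not Spark's code. `ListFilesSketch`, `DataFile`, and `PartitionDir` are hypothetical stand-ins, partition values are simplified to string maps, and the partition filter is a plain function rather than a `Seq[Expression]`. Data filters are omitted because, as the description notes, the index does not have to apply them.

```scala
// Simplified model only; every name in this object is hypothetical.
object ListFilesSketch {
  // A data file together with the partition values encoded in its path.
  final case class DataFile(path: String, partition: Map[String, String])
  // Stand-in for a partition directory: partition values plus the files under it.
  final case class PartitionDir(values: Map[String, String], files: Seq[String])

  // Groups files by partition values and keeps only partitions accepted by the filter.
  def listFiles(
      files: Seq[DataFile],
      partitionFilter: Map[String, String] => Boolean): Seq[PartitionDir] =
    files
      .groupBy(_.partition)
      .toSeq
      .collect { case (values, group) if partitionFilter(values) =>
        PartitionDir(values, group.map(_.path))
      }
      // Sort for a deterministic result order in this sketch.
      .sortBy(_.values.toSeq.sorted.mkString(","))

  def main(args: Array[String]): Unit = {
    val partitioned = Seq(
      DataFile("date=2024-01-01/a.parquet", Map("date" -> "2024-01-01")),
      DataFile("date=2024-01-02/b.parquet", Map("date" -> "2024-01-02"))
    )
    // Only the partition matching the filter is returned.
    println(listFiles(partitioned, _.get("date").contains("2024-01-02")))

    // Unpartitioned files share the empty partition value map, so they form one group.
    val flat = Seq(DataFile("x.parquet", Map.empty), DataFile("y.parquet", Map.empty))
    println(listFiles(flat, _ => true))
  }
}
```

Because pruning happens on whole partitions, every file returned already satisfies the partition predicate, which is why the description says these filters do not need to be evaluated again on the returned data.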

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
