Class

org.apache.spark.sql.execution.datasources

InMemoryFileIndex

class InMemoryFileIndex extends PartitioningAwareFileIndex

A FileIndex that generates the list of files to process by recursively listing all the files present in paths.
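
As a rough illustration of what recursive listing means here, the sketch below walks a local directory tree with `java.io.File`. It is not Spark's implementation, which goes through the Hadoop `FileSystem` API; the names `RecursiveListingSketch` and `listRecursively` are hypothetical and exist only for this example.

```scala
import java.io.File

object RecursiveListingSketch {
  // Collect every regular file under `root`, descending into subdirectories.
  def listRecursively(root: File): Seq[File] = {
    // File.listFiles returns null when `root` is not a readable directory.
    val children = Option(root.listFiles).map(_.toSeq).getOrElse(Seq.empty)
    val (dirs, others) = children.partition(_.isDirectory)
    others.filter(_.isFile) ++ dirs.flatMap(listRecursively)
  }

  def main(args: Array[String]): Unit = {
    // Defaults to the current directory when no argument is given.
    val root = new File(args.headOption.getOrElse("."))
    listRecursively(root).foreach(f => println(f.getPath))
  }
}
```

Only leaf files are returned; directories themselves never appear in the result, which mirrors the idea of an index over the files to process.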

Linear Supertypes
PartitioningAwareFileIndex, Logging, FileIndex, AnyRef, Any

Instance Constructors

  1. new InMemoryFileIndex(sparkSession: SparkSession, rootPaths: Seq[Path], parameters: Map[String, String], partitionSchema: Option[StructType], fileStatusCache: FileStatusCache = NoopCache)

    rootPaths

    the list of root table paths to scan

    parameters

    a set of options to control discovery

    partitionSchema

    an optional partition schema that will be used to provide types for the discovered partitions
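
A minimal sketch of calling this constructor, assuming a Spark build whose signature matches the one listed above and Spark plus Hadoop on the classpath. `InMemoryFileIndex` lives in an internal `execution` package, so this is illustrative rather than a recommended public entry point. The object name, the helper `buildIndex`, and the path `/tmp/example_table` are hypothetical.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.InMemoryFileIndex

object InMemoryFileIndexSketch {
  // Build an index over a single root directory with no user-supplied partition schema.
  def buildIndex(spark: SparkSession, root: String): InMemoryFileIndex =
    new InMemoryFileIndex(
      spark,
      Seq(new Path(root)),        // rootPaths
      Map.empty[String, String],  // parameters: no extra discovery options
      None)                       // partitionSchema: none provided

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("in-memory-file-index-sketch")
      .getOrCreate()
    try {
      // Hypothetical default location; pass a real directory as the first argument.
      val index = buildIndex(spark, args.headOption.getOrElse("/tmp/example_table"))
      index.inputFiles.foreach(println)
      println(s"total size in bytes: ${index.sizeInBytes}")
    } finally {
      spark.stop()
    }
  }
}
```

The `fileStatusCache` argument is left at its default here; `inputFiles` and `sizeInBytes` are the members documented under Value Members.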

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def allFiles(): Seq[FileStatus]

    Definition Classes
    PartitioningAwareFileIndex
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. def equals(other: Any): Boolean

    Definition Classes
    InMemoryFileIndex → AnyRef → Any
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. val hadoopConf: Configuration

    Attributes
    protected
    Definition Classes
    PartitioningAwareFileIndex
  12. def hashCode(): Int

    Definition Classes
    InMemoryFileIndex → AnyRef → Any
  13. def inferPartitioning(): PartitionSpec

    Attributes
    protected
    Definition Classes
    PartitioningAwareFileIndex
  14. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  15. def inputFiles: Array[String]

    Returns the list of files that will be read when scanning this relation.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  18. def leafDirToChildrenFiles: Map[Path, Array[FileStatus]]

    Attributes
    protected
    Definition Classes
    InMemoryFileIndex → PartitioningAwareFileIndex
  19. def leafFiles: LinkedHashMap[Path, FileStatus]

    Attributes
    protected
    Definition Classes
    InMemoryFileIndex → PartitioningAwareFileIndex
  20. def listFiles(filters: Seq[Expression]): Seq[PartitionDirectory]

    Returns all valid files grouped into partitions when the data is partitioned. If the data is unpartitioned, this will return a single partition with no partition values.

    filters

    The filters used to prune which partitions are returned. These filters must only refer to partition columns and this method will only return files where these predicates are guaranteed to evaluate to true. Thus, these filters will not need to be evaluated again on the returned data.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  21. def listLeafFiles(paths: Seq[Path]): LinkedHashSet[FileStatus]

    Lists the leaf files of the given paths. This method submits a Spark job to list files in parallel whenever a path contains more files than the parallel partition discovery threshold.

    This is publicly visible for testing.

    Definition Classes
    PartitioningAwareFileIndex
  22. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  23. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  24. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  25. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  26. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  27. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  28. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  29. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  30. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  31. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  32. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  33. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  35. final def notify(): Unit

    Definition Classes
    AnyRef
  36. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  37. def partitionSchema: StructType

    Schema of the partitioning columns, or the empty schema if the table is not partitioned.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  38. def partitionSpec(): PartitionSpec

    Returns the specification of the partitions inferred from the data.

    Definition Classes
    InMemoryFileIndex → PartitioningAwareFileIndex
  39. def refresh(): Unit

    Refresh any cached file listings.

    Definition Classes
    InMemoryFileIndex → FileIndex
  40. val rootPaths: Seq[Path]

    The list of root table paths to scan.

    Definition Classes
    InMemoryFileIndex → FileIndex
  41. def sizeInBytes: Long

    Sum of table file sizes, in bytes.

    Definition Classes
    PartitioningAwareFileIndex → FileIndex
  42. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  43. def toString(): String

    Definition Classes
    AnyRef → Any
  44. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  46. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
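
The contract documented for `listFiles` (filters refer only to partition columns, every returned partition satisfies all of them, and unpartitioned data yields a single partition with no partition values) can be modelled in plain Scala. This is a simplified sketch, not Spark code: `PartitionDir` and `PartitionFilter` are hypothetical stand-ins for `PartitionDirectory` and `Expression`, and partition values are modelled as strings.

```scala
object PartitionPruningSketch {
  // Hypothetical stand-in for PartitionDirectory: partition values plus the files in that partition.
  final case class PartitionDir(values: Map[String, String], files: Seq[String])

  // Hypothetical stand-in for a filter expression that looks only at partition values.
  type PartitionFilter = Map[String, String] => Boolean

  // Keep only the partitions for which every filter evaluates to true.
  def listFiles(partitions: Seq[PartitionDir], filters: Seq[PartitionFilter]): Seq[PartitionDir] =
    partitions.filter(p => filters.forall(f => f(p.values)))

  def main(args: Array[String]): Unit = {
    val partitioned = Seq(
      PartitionDir(Map("year" -> "2016"), Seq("year=2016/part-0")),
      PartitionDir(Map("year" -> "2017"), Seq("year=2017/part-0", "year=2017/part-1")))
    val only2017: PartitionFilter = values => values.get("year").contains("2017")
    listFiles(partitioned, Seq(only2017)).foreach(println)

    // Unpartitioned data: one partition with empty partition values holding every file.
    val unpartitioned = Seq(PartitionDir(Map.empty, Seq("part-0", "part-1")))
    listFiles(unpartitioned, Nil).foreach(println)
  }
}
```

Because pruning happens on partition values alone, a caller does not need to re-apply the same filters to rows read from the returned files.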
