Class ScanBuilderImpl
- All Implemented Interfaces:
ScanBuilder
Constructor Summary
Constructors
- ScanBuilderImpl(Path dataPath, Protocol protocol, Metadata metadata, StructType snapshotSchema, LogReplay logReplay, SnapshotReport snapshotReport)
-
Method Summary
Methods
- build()
- withFilter(Predicate predicate): Apply the given filter expression to prune any files that cannot possibly contain data satisfying the filter.
- withReadSchema(StructType readSchema): Apply the given readSchema.
-
Constructor Details
-
ScanBuilderImpl
public ScanBuilderImpl(Path dataPath, Protocol protocol, Metadata metadata, StructType snapshotSchema, LogReplay logReplay, SnapshotReport snapshotReport)
-
-
Method Details
-
withFilter
Description copied from interface: ScanBuilder
Apply the given filter expression to prune any files that cannot possibly contain data satisfying the filter.
Kernel uses the scan file partition values (for partitioned tables) and file-level column statistics (min, max, null count, etc.) in the Delta metadata for filtering. Sometimes this metadata is not enough to say deterministically that a scan file doesn't contain data satisfying the filter.
E.g. the given filter is `a = 2`. In file A, column `a` has a min value of -40 and a max value of 200. In file B, column `a` has a min value of 78 and a max value of 323. File B can be ruled out as it cannot possibly have rows where `a = 2`, but file A cannot be ruled out as it may contain rows where `a = 2`.
As filtering is best effort, the Scan object may return scan files (through Scan.getScanFiles(Engine)) that do not satisfy the filter. It is the responsibility of the caller to apply the remaining filter returned by Scan.getRemainingFilter() to the data read from the scan files (returned by Scan.getScanFiles(Engine)) to completely filter out the data that doesn't satisfy the filter.
- Specified by:
withFilter in interface ScanBuilder
- Parameters:
predicate - a Predicate to prune the metadata or data.
- Returns:
A ScanBuilder with filter applied.
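The file A / file B example above can be worked through in a few lines. The following is a minimal sketch, not Kernel's implementation: the class `MinMaxPruningSketch`, the `FileStats` type, and all method names are hypothetical, and it only models an equality filter on one integer column using min/max statistics (partition values and null counts are ignored). It also shows why the caller still has to apply the remaining filter to the rows read from a file that could not be ruled out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these types are Delta Kernel classes. */
class MinMaxPruningSketch {

    /** Min/max statistics of integer column `a` for one data file (assumed shape). */
    static final class FileStats {
        final String name;
        final int min;
        final int max;

        FileStats(String name, int min, int max) {
            this.name = name;
            this.min = min;
            this.max = max;
        }
    }

    /** True when the statistics cannot rule out a row with a == value. */
    static boolean mightContain(FileStats stats, int value) {
        return stats.min <= value && value <= stats.max;
    }

    /** Best-effort pruning: keeps the names of files that may contain a == value. */
    static List<String> candidateFiles(List<FileStats> files, int value) {
        List<String> kept = new ArrayList<>();
        for (FileStats f : files) {
            if (mightContain(f, value)) {
                kept.add(f.name);
            }
        }
        return kept;
    }

    /** The caller's job: apply the remaining filter a == value to the rows actually read. */
    static List<Integer> applyRemainingFilter(List<Integer> rows, int value) {
        List<Integer> matching = new ArrayList<>();
        for (int row : rows) {
            if (row == value) {
                matching.add(row);
            }
        }
        return matching;
    }

    public static void main(String[] args) {
        List<FileStats> files = Arrays.asList(
                new FileStats("A", -40, 200),
                new FileStats("B", 78, 323));

        // File B is ruled out (2 < 78); file A survives because -40 <= 2 <= 200.
        System.out.println(candidateFiles(files, 2));

        // File A may still hold rows that do not match, so filter them after reading.
        System.out.println(applyRemainingFilter(Arrays.asList(-40, 2, 17, 200), 2));
    }
}
```

A file that survives pruning is only a candidate: its statistics bound the values of `a` but say nothing about individual rows, which is why the second step is needed.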
-
withReadSchema
Description copied from interface: ScanBuilder
Apply the given readSchema. If the builder already has a projection applied, calling this again replaces the existing projection.
- Specified by:
withReadSchema in interface ScanBuilder
- Parameters:
readSchema - Subset of columns to read from the Delta table.
- Returns:
A ScanBuilder with projection pruning.
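The "replaces the existing projection" behavior described above can be illustrated with a toy builder. This is a sketch under stated assumptions, not the Kernel class: `ProjectionBuilderSketch` is a hypothetical name, the schema is modeled as a plain list of column names rather than a StructType, and the subset check is an assumption added for illustration.

```java
import java.util.List;

/** Hypothetical stand-in for a scan builder; not a Delta Kernel class. */
class ProjectionBuilderSketch {
    private final List<String> tableColumns;
    private List<String> readColumns;

    ProjectionBuilderSketch(List<String> tableColumns) {
        this.tableColumns = tableColumns;
        // With no projection applied, every table column is read.
        this.readColumns = tableColumns;
    }

    /** Sets the projection; a later call overwrites an earlier one instead of merging. */
    ProjectionBuilderSketch withReadSchema(List<String> columns) {
        // Assumption for this sketch: the projection must be a subset of the table columns.
        if (!tableColumns.containsAll(columns)) {
            throw new IllegalArgumentException("unknown column in " + columns);
        }
        this.readColumns = columns;
        return this;
    }

    /** Returns the columns the scan would read. */
    List<String> build() {
        return readColumns;
    }
}
```

In this sketch, chaining `withReadSchema` with `[id, name]` and then with `[ts]` leaves only `ts` in the projection, mirroring the replacement semantics stated in the description.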
-
build
- Specified by:
build in interface ScanBuilder
- Returns:
The built Scan instance.
-