Class ScanImpl

Object
io.delta.kernel.internal.ScanImpl
All Implemented Interfaces:
Scan

public class ScanImpl extends Object implements Scan
Implementation of Scan
  • Constructor Details

  • Method Details

    • getScanFiles

      public CloseableIterator<FilteredColumnarBatch> getScanFiles(Engine engine)
      Get an iterator of the data files in this version of the scan that survived predicate pruning.
      Specified by:
      getScanFiles in interface Scan
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      Returns:
      Data in ColumnarBatch format. Each row corresponds to one surviving file.
    • getScanFiles

      public CloseableIterator<FilteredColumnarBatch> getScanFiles(Engine engine, boolean includeStats)
      Get an iterator of the data files in this version of the scan that survived predicate pruning.

      When includeStats=true, the JSON file statistics are always read from the log and included in the returned columnar batches, which have the schema InternalScanFileUtils.SCAN_FILE_SCHEMA_WITH_STATS. When includeStats=false, the JSON file statistics may or may not be present in the returned columnar batches, so callers should not rely on them.

      Parameters:
      engine - the Engine instance to use
      includeStats - whether to read and include the JSON statistics
      Returns:
      the surviving scan files as FilteredColumnarBatch instances
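
Both getScanFiles overloads return a CloseableIterator, so the caller is responsible for closing it once the scan files have been consumed. The sketch below illustrates that pattern with try-with-resources. It does not use the Delta Kernel classes: CloseableIter and ListIter are hypothetical stand-ins defined here so the example compiles on its own, and the file names are made up.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ScanFilesSketch {
    /** Hypothetical stand-in for Kernel's CloseableIterator: an Iterator that must be closed. */
    interface CloseableIter<T> extends Iterator<T>, Closeable {
        @Override
        void close(); // narrowed so this sketch has no checked exception to handle
    }

    /** Hypothetical in-memory iterator that records whether close() was called. */
    static final class ListIter<T> implements CloseableIter<T> {
        private final Iterator<T> delegate;
        boolean closed = false;

        ListIter(List<T> items) {
            this.delegate = items.iterator();
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public T next() {
            return delegate.next();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Drains the iterator into a list and closes it, even if iteration throws. */
    static <T> List<T> drain(CloseableIter<T> iter) {
        List<T> out = new ArrayList<>();
        try (CloseableIter<T> it = iter) {
            while (it.hasNext()) {
                out.add(it.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up file names standing in for the rows of the returned batches.
        ListIter<String> files = new ListIter<>(List.of("part-0000.parquet", "part-0001.parquet"));
        List<String> paths = drain(files);
        System.out.println(paths + " closed=" + files.closed);
    }
}
```

With the real API the same shape would apply to the iterator returned by getScanFiles(engine), with each element being a FilteredColumnarBatch rather than a string.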
    • getScanState

      public Row getScanState(Engine engine)
      Description copied from interface: Scan
      Get the scan state associated with the current scan. This state is common across all files in the scan to be read.
      Specified by:
      getScanState in interface Scan
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      Returns:
      Scan state in Row format.
    • getRemainingFilter

      public Optional<Predicate> getRemainingFilter()
      Description copied from interface: Scan
      Get the remaining filter that is not guaranteed to be satisfied for the data Delta Kernel returns. This filter is used by Delta Kernel to do data skipping when possible.
      Specified by:
      getRemainingFilter in interface Scan
      Returns:
      the remaining filter as a Predicate.
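
Because the remaining filter is not guaranteed to hold for the data Delta Kernel returns, a connector that needs exact results would typically re-apply it to the rows it reads. The sketch below shows only the Optional handling, under stated assumptions: java.util.function.Predicate stands in for Kernel's Predicate expression type (which is a different class, evaluated through the engine), the rows are plain integers, and applyRemaining is a hypothetical helper, not part of the API.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class RemainingFilterSketch {
    /**
     * Applies the remaining filter, if present, to rows that were already read.
     * An empty Optional is treated here as "nothing left for the caller to enforce".
     */
    static <T> List<T> applyRemaining(List<T> rows, Optional<Predicate<T>> remaining) {
        return remaining
                .map(p -> rows.stream().filter(p).collect(Collectors.toList()))
                .orElse(rows);
    }

    public static void main(String[] args) {
        // Made-up data: year values read from files that survived pruning.
        List<Integer> years = List.of(2019, 2021, 2023);

        // Stand-in for a filter that pruning could not fully guarantee.
        Predicate<Integer> recent = y -> y >= 2021;

        System.out.println(applyRemaining(years, Optional.of(recent)));
        System.out.println(applyRemaining(years, Optional.<Predicate<Integer>>empty()));
    }
}
```

Keeping the empty case as a pass-through avoids a needless filtering pass when no filter remains.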