Class FileSystemInputFile

java.lang.Object
com.jerolba.carpet.io.FileSystemInputFile
All Implemented Interfaces:
org.apache.parquet.io.InputFile

public class FileSystemInputFile extends Object implements org.apache.parquet.io.InputFile
An InputFile implementation specific to reading Parquet files from the file system. By comparison, HadoopInputFile, the default implementation provided by Apache Parquet, is more generic and can read Parquet files from any data source, not just the file system.
Some code is inspired by:
https://github.com/benwatson528/intellij-avro-parquet-plugin/blob/master/src/main/java/uk/co/hadoopathome/intellij/viewer/fileformat/LocalInputFile.java
https://github.com/tideworks/arvo2parquet/blob/master/src/main/java/com/tideworks/data_load/io/InputFile.java
  • Constructor Details

    • FileSystemInputFile

      public FileSystemInputFile(File file)
      Constructs a FileSystemInputFile with the specified file.
      Parameters:
      file - the file to read from
  • Method Details

    • getLength

      public long getLength() throws IOException
      Returns the length of the file, in bytes.
      Specified by:
      getLength in interface org.apache.parquet.io.InputFile
      Returns:
      the length of the file, in bytes
      Throws:
      IOException - if an error occurs while getting the length of the file
    • newStream

      public org.apache.parquet.io.SeekableInputStream newStream() throws IOException
      Creates a new stream for reading from the file.
      Specified by:
      newStream in interface org.apache.parquet.io.InputFile
      Returns:
      a new SeekableInputStream for reading from the file
      Throws:
      IOException - if an error occurs while creating the stream
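
      The two methods above exist because a Parquet reader needs random access: a Parquet file (without footer encryption) ends with a 4-byte little-endian footer length followed by the 4-byte magic "PAR1", so a reader typically asks for the file length and then seeks near the end before reading anything else. The sketch below illustrates that access pattern using only the JDK. It is not the code of FileSystemInputFile; the class ParquetTailProbe and its method readFooterLength are hypothetical names introduced for illustration, and RandomAccessFile stands in for the length and seek operations that an InputFile and its SeekableInputStream would provide.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Carpet or Apache Parquet.
class ParquetTailProbe {

    // Trailer size: 4-byte footer length + 4-byte "PAR1" magic.
    static final int TAIL_LENGTH = 8;

    // Returns the footer length stored in the last 8 bytes of the file,
    // or -1 if the file is too short or does not end with the "PAR1" magic.
    static int readFooterLength(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long length = raf.length();      // the role played by getLength()
            if (length < TAIL_LENGTH) {
                return -1;
            }
            raf.seek(length - TAIL_LENGTH);  // the role played by seeking on the stream
            byte[] tail = new byte[TAIL_LENGTH];
            raf.readFully(tail);

            // Last 4 bytes must be the ASCII magic "PAR1".
            String magic = new String(tail, 4, 4, StandardCharsets.US_ASCII);
            if (!"PAR1".equals(magic)) {
                return -1;
            }
            // First 4 bytes of the trailer hold the footer length, little-endian.
            return ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
        }
    }
}
```

      Calling ParquetTailProbe.readFooterLength(new File("data.parquet")) on a well-formed file would return the size of the serialized footer metadata; the file name here is a placeholder. In real code, the length and seek calls would go through getLength() and the stream returned by newStream(), usually indirectly via a Parquet reader rather than by hand.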