Creates a source of rows that will read from the given Hadoop path.
where to load the data from
the schema as present in the metastore, used to match up with the raw data in dialects where the schema is not stored with the data. For example, with a CSV format in Hive, the metastoreSchema is required in order to know what each column represents. We can't use the projection schema for this because the projection schema might list its columns in a different order.
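A minimal sketch of why the metastore column order matters for delimited data. The function `read_projected` and the column names are hypothetical and are not part of this API; the sketch only assumes that values in a delimited file are positional and that the projection may reorder or omit columns.

```python
import csv
import io


def read_projected(raw_text, metastore_columns, projection):
    """Name each positional value using the metastore column order,
    then emit only the projected columns, in projection order."""
    rows = []
    for values in csv.reader(io.StringIO(raw_text)):
        # The file carries no header, so names come from the metastore order.
        record = dict(zip(metastore_columns, values))
        rows.append([record[name] for name in projection])
    return rows


# Hypothetical table: the file stores id,name,city in that order.
metastore_columns = ["id", "name", "city"]
# The caller asks for a subset, in a different order.
projection = ["city", "id"]

rows = read_projected("1,alice,london\n2,bob,paris\n", metastore_columns, projection)
```

Zipping the raw values against the projection instead would label the first value of each line as `city`, which is the mismatch the metastoreSchema parameter guards against.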
the schema required to read. This might not be the full schema present in the data, but it is required here because some file formats can read data more efficiently if they know they can omit some fields (eg Parquet).
used by some implementations to filter data at a file read level (eg Parquet).
The dataSchema represents the schema that was written for the data files. This won't necessarily be the same as the hive metastore schema, because partition values are not written to the data files. We must include this here because some hive formats don't store schema information with the data, eg delimited files.
The readerSchema is the schema required by the caller, which may be the same as the written data, or it may be a subset if a projection pushdown is being used.
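The relationship between the three schemas can be sketched as follows. This is an illustration only: `data_schema` and `is_valid_reader_schema` are hypothetical helpers, schemas are reduced to lists of column names, and the table layout is assumed for the example.

```python
def data_schema(metastore_columns, partition_keys):
    """Columns physically written to the data files: the metastore
    columns minus the partition keys, in metastore order."""
    return [c for c in metastore_columns if c not in partition_keys]


def is_valid_reader_schema(reader_columns, data_columns):
    """A reader schema is either the full data schema or a subset of it."""
    return set(reader_columns) <= set(data_columns)


# Hypothetical table partitioned by "country".
metastore_columns = ["id", "name", "country"]
partition_keys = ["country"]

written = data_schema(metastore_columns, partition_keys)
full_read_ok = is_valid_reader_schema(["id", "name"], written)
projected_read_ok = is_valid_reader_schema(["name"], written)
partition_read_ok = is_valid_reader_schema(["country"], written)
```

Here `written` excludes `country` because its value lives in the partition path rather than in the files, so a reader schema that asks the files for `country` cannot be satisfied from the file contents alone.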