java.lang.Object
- org.apache.flink.orc.AbstractOrcFileInputFormat<T,BatchT,SplitT>

Type Parameters:

T - The type of records produced by this reader format.

All Implemented Interfaces:

Serializable, org.apache.flink.api.java.typeutils.ResultTypeQueryable<T>, org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>

Direct Known Subclasses:

OrcColumnarRowInputFormat
```
public abstract class AbstractOrcFileInputFormat<T,BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
extends Object
implements org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
```
The base for ORC readers for the FileSource. Implements the reader initialization, vectorized reading, and pooling of column vector objects.
Subclasses implement the conversion to the specific result record(s) they return by extending AbstractOrcFileInputFormat.OrcReaderBatch.
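The pooling idea described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `BatchPoolSketch`, `Recycler`, `ReusableBatch`, and `readAll` are hypothetical names written for this example, not Flink classes, and the model is single-threaded with a pool of exactly one batch. It shows a reader taking a preallocated batch from a pool, filling it, handing the rows out, and returning the batch through a recycler so that no new column array is allocated per batch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified stand-in for the batch pooling idea; none of these types are Flink classes.
final class BatchPoolSketch {

    /** Hypothetical callback that hands a batch back to its pool. */
    interface Recycler<B> {
        void recycle(B batch);
    }

    /** Hypothetical reusable batch: one preallocated column of long values. */
    static final class ReusableBatch {
        final long[] column;
        int size;
        private final Recycler<ReusableBatch> recycler;

        ReusableBatch(int capacity, Recycler<ReusableBatch> recycler) {
            this.column = new long[capacity];
            this.recycler = recycler;
        }

        /** Called by the consumer once all rows of the batch have been read. */
        void release() {
            size = 0;
            recycler.recycle(this);
        }
    }

    /** Emits the values 0..total-1 in batches, reusing a single pooled batch object. */
    static List<Long> readAll(int total, int batchSize) {
        Deque<ReusableBatch> pool = new ArrayDeque<>();
        pool.add(new ReusableBatch(batchSize, pool::add)); // the only allocation

        List<Long> out = new ArrayList<>();
        long next = 0;
        while (next < total) {
            ReusableBatch batch = pool.poll();              // take the batch from the pool
            if (batch == null) {
                throw new IllegalStateException("batch was not recycled");
            }
            while (batch.size < batchSize && next < total) {
                batch.column[batch.size++] = next++;        // fill the column in bulk
            }
            for (int i = 0; i < batch.size; i++) {
                out.add(batch.column[i]);                   // hand rows to the consumer
            }
            batch.release();                                // return the batch to the pool
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readAll(5, 2)); // prints [0, 1, 2, 3, 4]
    }
}
```

In this model the consumer must release a batch before the reader can fill the next one, which is the property that makes reuse of the column arrays safe.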

See Also:

Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`protected static class`	`AbstractOrcFileInputFormat.OrcReaderBatch<T,BatchT>`	The `OrcReaderBatch` class holds the data structures containing the batch data (column vectors, row arrays, ...) and performs the batch conversion from the ORC representation to the result format.
`protected static class`	`AbstractOrcFileInputFormat.OrcVectorizedReader<T,BatchT>`	A vectorized ORC reader.

Nested classes/interfaces inherited from interface org.apache.flink.connector.file.src.reader.BulkFormat
org.apache.flink.connector.file.src.reader.BulkFormat.Reader<T extends Object>, org.apache.flink.connector.file.src.reader.BulkFormat.RecordIterator<T extends Object>

Field Summary

Fields
Modifier and Type	Field	Description
`protected int`	`batchSize`
`protected List<OrcFilters.Predicate>`	`conjunctPredicates`
`protected SerializableHadoopConfigWrapper`	`hadoopConfigWrapper`
`protected org.apache.orc.TypeDescription`	`schema`
`protected int[]`	`selectedFields`
`protected OrcShim<BatchT>`	`shim`

Constructor Summary

Constructors
Modifier	Constructor	Description
`protected`	`AbstractOrcFileInputFormat(OrcShim<BatchT> shim, org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize)`

Method Summary

Modifier and Type	Method	Description
`AbstractOrcFileInputFormat.OrcVectorizedReader<T,BatchT>`	`createReader(org.apache.flink.configuration.Configuration config, SplitT split)`
`abstract AbstractOrcFileInputFormat.OrcReaderBatch<T,BatchT>`	`createReaderBatch(SplitT split, OrcVectorizedBatchWrapper<BatchT> orcBatch, org.apache.flink.connector.file.src.util.Pool.Recycler<AbstractOrcFileInputFormat.OrcReaderBatch<T,BatchT>> recycler, int batchSize)`	Creates the `AbstractOrcFileInputFormat.OrcReaderBatch` structure, which holds the data structures containing the batch data (column vectors, row arrays, ...) and performs the batch conversion from the ORC representation to the result format.
`abstract org.apache.flink.api.common.typeinfo.TypeInformation<T>`	`getProducedType()`	Gets the type produced by this format.
`boolean`	`isSplittable()`
`AbstractOrcFileInputFormat.OrcVectorizedReader<T,BatchT>`	`restoreReader(org.apache.flink.configuration.Configuration config, SplitT split)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

shim
```
protected final OrcShim<BatchT> shim
```

hadoopConfigWrapper

```
protected final SerializableHadoopConfigWrapper hadoopConfigWrapper
```

schema

```
protected final org.apache.orc.TypeDescription schema
```

selectedFields
```
protected final int[] selectedFields
```

conjunctPredicates

```
protected final List<OrcFilters.Predicate> conjunctPredicates
```

batchSize
```
protected final int batchSize
```

Constructor Detail

AbstractOrcFileInputFormat

```
protected AbstractOrcFileInputFormat(OrcShim<BatchT> shim,
                                     org.apache.hadoop.conf.Configuration hadoopConfig,
                                     org.apache.orc.TypeDescription schema,
                                     int[] selectedFields,
                                     List<OrcFilters.Predicate> conjunctPredicates,
                                     int batchSize)
```

Parameters:

shim - the shim that abstracts over the different ORC dependency versions. If you use the latest version, use OrcShim.defaultShim() directly.

hadoopConfig - the Hadoop configuration for the ORC reader.

schema - the full schema of the ORC format.

selectedFields - the fields of the ORC schema selected for reading.

conjunctPredicates - the filter predicates that can be evaluated.

batchSize - the batch size of the ORC reader.
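As a rough illustration of how `schema` and `selectedFields` relate, the sketch below assumes that `selectedFields` holds zero-based indexes into the top-level fields of the full schema, in the order the caller wants them returned. That interpretation is an assumption of this example, and `ProjectionSketch` and `project` are hypothetical names that use plain field-name lists instead of `org.apache.orc.TypeDescription`.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; it does not use the ORC or Flink APIs.
final class ProjectionSketch {

    /**
     * Returns the schema fields picked by selectedFields, assuming each entry
     * is a zero-based index into the full list of top-level schema fields.
     */
    static List<String> project(List<String> schemaFields, int[] selectedFields) {
        List<String> projected = new ArrayList<>(selectedFields.length);
        for (int index : selectedFields) {
            projected.add(schemaFields.get(index)); // throws if the index is out of range
        }
        return projected;
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id", "name", "price");
        System.out.println(project(schema, new int[] {2, 0})); // prints [price, id]
    }
}
```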

Method Detail

createReader

```
public AbstractOrcFileInputFormat.OrcVectorizedReader<T,BatchT> createReader(org.apache.flink.configuration.Configuration config,
                                                                                   SplitT split)
                                                                            throws IOException
```
Specified by:

createReader in interface org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>

Throws:

IOException

restoreReader

```
public AbstractOrcFileInputFormat.OrcVectorizedReader<T,BatchT> restoreReader(org.apache.flink.configuration.Configuration config,
                                                                                    SplitT split)
                                                                             throws IOException
```
Specified by:

restoreReader in interface org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>

Throws:

IOException

isSplittable
```
public boolean isSplittable()
```
Specified by:

isSplittable in interface org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>

createReaderBatch

```
public abstract AbstractOrcFileInputFormat.OrcReaderBatch<T,BatchT> createReaderBatch(SplitT split,
                                                                                            OrcVectorizedBatchWrapper<BatchT> orcBatch,
                                                                                            org.apache.flink.connector.file.src.util.Pool.Recycler<AbstractOrcFileInputFormat.OrcReaderBatch<T,BatchT>> recycler,
                                                                                            int batchSize)
```
Creates the AbstractOrcFileInputFormat.OrcReaderBatch structure, which holds the data structures containing the batch data (column vectors, row arrays, ...) and performs the batch conversion from the ORC representation to the result format.
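The batch conversion mentioned here, from a columnar representation to result records, can be sketched with plain arrays. This is a minimal model under stated assumptions: `BatchConversionSketch`, `ColumnBatch`, and `convert` are hypothetical names for this example, the two columns stand in for ORC column vectors, and a formatted string stands in for the result record type `T`.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of columnar-to-row conversion; not the Flink or ORC API.
final class BatchConversionSketch {

    /** Hypothetical columnar batch: one array per column, plus the valid row count. */
    static final class ColumnBatch {
        final long[] ids;
        final String[] names;
        final int size;

        ColumnBatch(long[] ids, String[] names, int size) {
            this.ids = ids;
            this.names = names;
            this.size = size;
        }
    }

    /** Builds one result record per row by reading position i from every column. */
    static List<String> convert(ColumnBatch batch) {
        List<String> rows = new ArrayList<>(batch.size);
        for (int i = 0; i < batch.size; i++) {
            rows.add(batch.ids[i] + ":" + batch.names[i]);
        }
        return rows;
    }

    public static void main(String[] args) {
        ColumnBatch batch = new ColumnBatch(new long[] {1, 2}, new String[] {"a", "b"}, 2);
        System.out.println(convert(batch)); // prints [1:a, 2:b]
    }
}
```

Only the first `size` positions of each column are read, so arrays allocated at full batch capacity can be reused for a final, partially filled batch.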

getProducedType
```
public abstract org.apache.flink.api.common.typeinfo.TypeInformation<T> getProducedType()
```
Gets the type produced by this format.

Specified by:

getProducedType in interface org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>

Specified by:

getProducedType in interface org.apache.flink.api.java.typeutils.ResultTypeQueryable<T>
