Class RowCsvInputFormat

  • All Implemented Interfaces:
    Serializable, org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.fs.FileInputSplit>, org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.fs.FileInputSplit>

    public class RowCsvInputFormat
    extends AbstractCsvInputFormat<org.apache.flink.types.Row>
    Input format that reads CSV data into Row.

    Differences from the old CSV input format, org.apache.flink.api.java.io.RowCsvInputFormat:
      1. When a row is too short, the new format still emits it and fills the remaining fields with null. The old format skips the row.
      2. The new format removes the escape character from parsed values. The old format keeps it.
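
    The short-row difference can be pictured with a small plain-Java sketch. It does not use Flink, and the class and method names (ShortRowSketch, padWithNulls, skipShortRows) are hypothetical, chosen only to contrast the two policies described above. The real parser also handles quoting and escaping, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of the two short-row policies; not Flink code. */
final class ShortRowSketch {

    /** New-format policy: keep the row and pad missing fields with null. */
    static String[] padWithNulls(String[] fields, int arity) {
        // Arrays.copyOf fills the added slots of an object array with null.
        return fields.length >= arity ? fields : Arrays.copyOf(fields, arity);
    }

    /** Old-format policy: drop every row that has fewer fields than expected. */
    static List<String[]> skipShortRows(List<String[]> rows, int arity) {
        List<String[]> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (row.length >= arity) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] shortRow = {"1", "alice"};          // only 2 of 3 expected fields
        String[] fullRow = {"2", "bob", "berlin"};

        // New-format policy: the short row survives with a trailing null.
        System.out.println(Arrays.toString(padWithNulls(shortRow, 3)));

        // Old-format policy: only the complete row is kept.
        List<String[]> kept = skipShortRows(Arrays.asList(shortRow, fullRow), 3);
        System.out.println(kept.size());
    }
}
```

    With three expected fields, the first policy yields the row 1, alice, null, while the second keeps only the complete row.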

    Known limitations of the new CSV input format, which may be improved in the future:
      1. The comment character is not configurable. It is always "#".
      2. Multi-character field delimiters are not supported.
      3. Reading only the first N is not supported. Attempting it throws an exception.
      4. The line delimiter can only be configured as "\r", "\n" or "\r\n".
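
    Because of these limits, a caller may want to check its own settings before configuring the format. The following is a minimal sketch of such a check in plain Java. CsvOptionCheck and its methods are hypothetical helpers, not part of Flink, and they encode only the delimiter restrictions listed above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical client-side pre-check for CSV options; not part of Flink. */
final class CsvOptionCheck {

    /** The line delimiters listed above as configurable. */
    static final List<String> SUPPORTED_LINE_DELIMITERS = Arrays.asList("\r", "\n", "\r\n");

    /** Returns true if the line delimiter is one of the three supported values. */
    static boolean isSupportedLineDelimiter(String delimiter) {
        return SUPPORTED_LINE_DELIMITERS.contains(delimiter);
    }

    /** Returns true if the field delimiter is exactly one char long. */
    static boolean isSupportedFieldDelimiter(String delimiter) {
        return delimiter != null && delimiter.length() == 1;
    }

    public static void main(String[] args) {
        System.out.println(isSupportedLineDelimiter("\r\n")); // true
        System.out.println(isSupportedLineDelimiter(";"));    // false
        System.out.println(isSupportedFieldDelimiter(","));   // true
        System.out.println(isSupportedFieldDelimiter("||"));  // false
    }
}
```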

    See Also:
    Serialized Form
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  RowCsvInputFormat.Builder
      A builder for creating a RowCsvInputFormat.
      • Nested classes/interfaces inherited from class org.apache.flink.api.common.io.FileInputFormat

        org.apache.flink.api.common.io.FileInputFormat.FileBaseStatistics, org.apache.flink.api.common.io.FileInputFormat.InputSplitOpenThread
    • Field Summary

      • Fields inherited from class org.apache.flink.api.common.io.FileInputFormat

        currentSplit, enumerateNestedFiles, INFLATER_INPUT_STREAM_FACTORIES, minSplitSize, numSplits, openTimeout, READ_WHOLE_SPLIT_FLAG, splitLength, splitStart, stream, unsplittable
    • Method Summary

      Modifier and Type Method Description
      static RowCsvInputFormat.Builder builder(org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.types.Row> typeInfo, org.apache.flink.core.fs.Path... filePaths)
      Creates a builder for a RowCsvInputFormat with the given row type information and file paths.
      org.apache.flink.types.Row nextRecord(org.apache.flink.types.Row record)
      void open(org.apache.flink.core.fs.FileInputSplit split)
      boolean reachedEnd()
      • Methods inherited from class org.apache.flink.api.common.io.FileInputFormat

        acceptFile, close, configure, createInputSplits, decorateInputStream, extractFileExtension, getFilePaths, getFileStats, getFileStats, getInflaterInputStreamFactory, getInputSplitAssigner, getMinSplitSize, getNestedFileEnumeration, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, getStatistics, getSupportedCompressionFormats, registerInflaterInputStreamFactory, setFilePath, setFilePath, setFilePaths, setFilePaths, setFilesFilter, setMinSplitSize, setNestedFileEnumeration, setNumSplits, setOpenTimeout, testForUnsplittable, toString
      • Methods inherited from class org.apache.flink.api.common.io.RichInputFormat

        closeInputFormat, getRuntimeContext, openInputFormat, setRuntimeContext
    • Method Detail

      • open

        public void open(org.apache.flink.core.fs.FileInputSplit split)
                  throws IOException
        Specified by:
        open in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.fs.FileInputSplit>
        Overrides:
        open in class AbstractCsvInputFormat<org.apache.flink.types.Row>
        Throws:
        IOException
      • reachedEnd

        public boolean reachedEnd()
      • nextRecord

        public org.apache.flink.types.Row nextRecord(org.apache.flink.types.Row record)
                                              throws IOException
        Throws:
        IOException
      • builder

        public static RowCsvInputFormat.Builder builder(org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.types.Row> typeInfo,
                                                        org.apache.flink.core.fs.Path... filePaths)
        Creates a builder for a RowCsvInputFormat with the given row type information and file paths.