Class

za.co.absa.cobrix.spark.cobol.reader.parameters

ReaderParameters

case class ReaderParameters(isEbcdic: Boolean = true, ebcdicCodePage: String = "common", ebcdicCodePageClass: Option[String] = None, asciiCharset: String = "", floatingPointFormat: FloatingPointFormat = FloatingPointFormat.IBM, variableSizeOccurs: Boolean = false, lengthFieldName: Option[String] = None, isRecordSequence: Boolean = false, isRdwBigEndian: Boolean = false, isRdwPartRecLength: Boolean = false, rdwAdjustment: Int = 0, isIndexGenerationNeeded: Boolean = false, inputSplitRecords: Option[Int] = None, inputSplitSizeMB: Option[Int] = None, hdfsDefaultBlockSize: Option[Int] = None, startOffset: Int = 0, endOffset: Int = 0, fileStartOffset: Int = 0, fileEndOffset: Int = 0, generateRecordId: Boolean = false, schemaPolicy: SchemaRetentionPolicy = SchemaRetentionPolicy.KeepOriginal, stringTrimmingPolicy: StringTrimmingPolicy = StringTrimmingPolicy.TrimBoth, multisegment: Option[MultisegmentParameters] = None, commentPolicy: CommentPolicy = CommentPolicy(), dropGroupFillers: Boolean = false, nonTerminals: Seq[String] = Nil, recordHeaderParser: Option[String] = None, rhpAdditionalInfo: Option[String] = None, inputFileNameColumn: String = "") extends Product with Serializable

These are the properties for customizing the mainframe binary data reader.

isEbcdic

If true, the input data file encoding is EBCDIC; otherwise it is ASCII

ebcdicCodePage

Specifies what code page to use for EBCDIC to ASCII/Unicode conversions

ebcdicCodePageClass

An optional custom code page conversion class provided by a user

asciiCharset

A charset for ASCII data

floatingPointFormat

A format of floating-point numbers

variableSizeOccurs

If true, the size of OCCURS DEPENDING ON data will depend on the actual number of elements

lengthFieldName

The name of a field that contains the record length. Optional; if not set, the copybook record length is used.

isRecordSequence

Whether input files have 4-byte record length (RDW) headers

isRdwBigEndian

Whether the RDW is big-endian. This may depend on the mainframe flavor and/or the mainframe-to-PC transfer method

isRdwPartRecLength

Whether the RDW counts itself as part of the record length

rdwAdjustment

An adjustment to reconcile a mismatch between the RDW value and the actual record length

isIndexGenerationNeeded

Whether to index the input file before processing

inputSplitRecords

The number of records to include in each partition. Note that mainframe records may have variable size, so inputSplitSizeMB is the recommended option

inputSplitSizeMB

A partition size to target. In certain circumstances this size may not be achieved exactly, but the library will make a best effort to target it

hdfsDefaultBlockSize

Default HDFS block size for the HDFS filesystem used. This value is used as the default split size if inputSplitSizeMB is not specified
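
The fallback described above (inputSplitSizeMB takes precedence, then hdfsDefaultBlockSize) can be sketched as follows. The final hard-coded default of 128 MB is an assumption for illustration, not necessarily the library's actual value:

```scala
// Sketch of the documented split-size fallback.
// The final default (128 MB) is assumed for illustration only.
def effectiveSplitSizeMB(inputSplitSizeMB: Option[Int],
                         hdfsDefaultBlockSize: Option[Int]): Int =
  inputSplitSizeMB
    .orElse(hdfsDefaultBlockSize) // fall back to the HDFS block size
    .getOrElse(128)               // assumed hard-coded default
```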

startOffset

An offset to the start of the record in each binary data block.

endOffset

An offset from the end of the record to the end of the binary data block.

fileStartOffset

A number of bytes to skip at the beginning of each file

fileEndOffset

A number of bytes to skip at the end of each file
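
A minimal sketch of the record-level offset semantics above, using a hypothetical helper (not part of the library): startOffset bytes are skipped at the start of each binary data block and endOffset bytes at its end, leaving the record payload.

```scala
// Hypothetical illustration of startOffset / endOffset semantics:
// trim the given number of bytes from each end of a binary data block.
def trimRecord(block: Array[Byte], startOffset: Int, endOffset: Int): Array[Byte] =
  block.slice(startOffset, block.length - endOffset)
```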

generateRecordId

If true, a record id field will be prepended to each record.

schemaPolicy

Specifies a policy to transform the input schema. The default policy is to keep the schema exactly as it is in the copybook.

stringTrimmingPolicy

Specifies if and how strings should be trimmed when parsed.

multisegment

Parameters specific to reading multisegment files

commentPolicy

A comment truncation policy

dropGroupFillers

If true, the parser will drop all FILLER fields, even GROUP FILLERs that have non-FILLER nested fields

nonTerminals

A list of non-terminals (GROUPS) to combine and parse as primitive fields

recordHeaderParser

A custom parser used to parse record headers

rhpAdditionalInfo

An optional string with additional information passed to a custom record header parser

inputFileNameColumn

A column name to add to the dataframe. The column will contain the input file name for each record, similar to the input_file_name() function
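
Since ReaderParameters is a case class with a default for every field, a configuration typically overrides only the parameters that differ from the defaults. A sketch, assuming the spark-cobol dependency is on the classpath (the chosen values below are illustrative, not recommendations):

```scala
import za.co.absa.cobrix.spark.cobol.reader.parameters.ReaderParameters

// Override only the non-default parameters; every other field keeps
// the default shown in the signature above.
val params = ReaderParameters(
  isRecordSequence = true,           // input files have 4-byte RDW headers
  generateRecordId = true,           // prepend a record id to each record
  inputFileNameColumn = "source_file" // illustrative column name
)
```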

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new ReaderParameters(isEbcdic: Boolean = true, ebcdicCodePage: String = "common", ebcdicCodePageClass: Option[String] = None, asciiCharset: String = "", floatingPointFormat: FloatingPointFormat = FloatingPointFormat.IBM, variableSizeOccurs: Boolean = false, lengthFieldName: Option[String] = None, isRecordSequence: Boolean = false, isRdwBigEndian: Boolean = false, isRdwPartRecLength: Boolean = false, rdwAdjustment: Int = 0, isIndexGenerationNeeded: Boolean = false, inputSplitRecords: Option[Int] = None, inputSplitSizeMB: Option[Int] = None, hdfsDefaultBlockSize: Option[Int] = None, startOffset: Int = 0, endOffset: Int = 0, fileStartOffset: Int = 0, fileEndOffset: Int = 0, generateRecordId: Boolean = false, schemaPolicy: SchemaRetentionPolicy = SchemaRetentionPolicy.KeepOriginal, stringTrimmingPolicy: StringTrimmingPolicy = StringTrimmingPolicy.TrimBoth, multisegment: Option[MultisegmentParameters] = None, commentPolicy: CommentPolicy = CommentPolicy(), dropGroupFillers: Boolean = false, nonTerminals: Seq[String] = Nil, recordHeaderParser: Option[String] = None, rhpAdditionalInfo: Option[String] = None, inputFileNameColumn: String = "")


Value Members

  1. val asciiCharset: String
  2. val commentPolicy: CommentPolicy
  3. val dropGroupFillers: Boolean
  4. val ebcdicCodePage: String
  5. val ebcdicCodePageClass: Option[String]
  6. val endOffset: Int
  7. val fileEndOffset: Int
  8. val fileStartOffset: Int
  9. val floatingPointFormat: FloatingPointFormat
  10. val generateRecordId: Boolean
  11. val hdfsDefaultBlockSize: Option[Int]
  12. val inputFileNameColumn: String
  13. val inputSplitRecords: Option[Int]
  14. val inputSplitSizeMB: Option[Int]
  15. val isEbcdic: Boolean
  16. val isIndexGenerationNeeded: Boolean
  17. val isRdwBigEndian: Boolean
  18. val isRdwPartRecLength: Boolean
  19. val isRecordSequence: Boolean
  20. val lengthFieldName: Option[String]
  21. val multisegment: Option[MultisegmentParameters]
  22. val nonTerminals: Seq[String]
  23. val rdwAdjustment: Int
  24. val recordHeaderParser: Option[String]
  25. val rhpAdditionalInfo: Option[String]
  26. val schemaPolicy: SchemaRetentionPolicy
  27. val startOffset: Int
  28. val stringTrimmingPolicy: StringTrimmingPolicy
  29. val variableSizeOccurs: Boolean
