This class holds the parameters currently used for parsing variable-length records.
Do input files have 4-byte record length headers?
Block descriptor word (if specified), for FB and VB record formats
Is the RDW big-endian? This may depend on the flavor of the mainframe and/or the mainframe-to-PC transfer method
Does the RDW count itself as part of the record length?
Controls the adjustment for a mismatch between the RDW value and the actual record length
An optional custom record header parser for non-standard RDWs
An optional custom raw record parser class for non-standard record types
An optional additional option string passed to a custom record header parser
An optional additional option string passed to a custom record extractor
A field that stores the record length
A mapping between field values and record sizes
A number of bytes to skip at the beginning of each file
A number of bytes to skip at the end of each file
Generate a sequential record number for each record so that the original order of the data can be retained
Whether to index the input file before processing
The number of records to include in each partition. Note that mainframe records may have variable sizes, so inputSplitMB is the recommended option
A partition size to target. In certain circumstances the actual partition size may differ, but the library will make a best effort to hit that target
Tries to improve locality by extracting preferred locations for variable-length records
Optimizes cluster usage when locality optimization is enabled and new nodes are present (nodes that do not contain any blocks of the files being processed)
A column name to add to the DataFrame. The column will contain the input file name for each record, similar to the 'input_file_name()' function
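To illustrate how several of the RDW-related parameters above interact, here is a minimal sketch of splitting a buffer of RDW-prefixed records. The function and its parameter names are illustrative only, not part of any library API; it assumes a standard 4-byte RDW whose first two bytes hold the length and whose last two bytes are reserved.

```python
import struct

def split_rdw_records(data: bytes,
                      is_rdw_big_endian: bool = True,
                      rdw_counts_itself: bool = False,
                      rdw_adjustment: int = 0,
                      file_start_offset: int = 0,
                      file_end_offset: int = 0):
    """Yield record payloads from a buffer of RDW-prefixed records."""
    pos = file_start_offset
    end = len(data) - file_end_offset
    # Endianness of the 2-byte length field depends on how the file was transferred.
    fmt = ">H" if is_rdw_big_endian else "<H"
    while pos + 4 <= end:
        # A standard RDW is 4 bytes: a 2-byte length followed by 2 reserved bytes.
        (length,) = struct.unpack(fmt, data[pos:pos + 2])
        length += rdw_adjustment
        if rdw_counts_itself:
            length -= 4  # the header counted itself; exclude it from the payload size
        pos += 4
        yield data[pos:pos + length]
        pos += length
```

For example, two records with big-endian RDWs that do not count themselves, `b"\x00\x05\x00\x00HELLO" + b"\x00\x06\x00\x00WORLD!"`, split into the payloads `HELLO` and `WORLD!`; with `rdw_counts_itself=True`, the same `HELLO` record would instead carry the length field `\x00\x09`.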