This class holds the parameters currently used for parsing variable-length records.
Do the input files have 4-byte record length (RDW) headers?
Is the RDW big-endian? This may depend on the mainframe flavor and/or the mainframe-to-PC transfer method
Does the RDW count its own 4 bytes as part of the record length?
An adjustment applied when the RDW value does not match the actual record length
An optional custom record header parser for non-standard RDWs
An optional additional option string passed to a custom record header parser
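The RDW options above can be illustrated with a small sketch. This is pure Python for illustration, not the library's implementation, and the function and parameter names are hypothetical:

```python
import struct

def parse_rdw(header: bytes, is_big_endian: bool = True,
              rdw_counts_itself: bool = False, adjustment: int = 0) -> int:
    """Extract the payload length from a 4-byte RDW header.

    The first two bytes hold the record length; the remaining two are
    typically zero.
    """
    fmt = ">H" if is_big_endian else "<H"
    (length,) = struct.unpack(fmt, header[:2])
    if rdw_counts_itself:
        length -= 4          # the RDW included its own 4 bytes in the count
    return length + adjustment  # compensate for a known RDW/length mismatch

# A big-endian RDW declaring a 100-byte record
print(parse_rdw(b"\x00\x64\x00\x00"))  # 100
```

A custom record header parser would replace this logic entirely for non-standard headers.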
A field that stores record length
A number of bytes to skip at the beginning of each file
A number of bytes to skip at the end of each file
If true, the size of OCCURS DEPENDING ON data will depend on the actual number of elements
Generate a sequential record number for each record to be able to retain the order of the original data
Specifies whether indexing the input file before processing is requested
The number of records to include in each partition. Note that mainframe records may have variable size, so inputSplitMB is the recommended option
A partition size to target. In certain circumstances the actual partition size may differ, but the library will make its best effort to match the target
Tries to improve locality by extracting preferred locations for variable-length records
Optimizes cluster usage when locality optimization is enabled and new nodes are present (nodes that do not contain any blocks of the files being processed)
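Size-based splitting of variable-length records can be sketched as a greedy packing of records into partitions that approach a target byte size. This is a simplified illustration, not Cobrix's actual partitioning code:

```python
def split_into_partitions(record_lengths, target_bytes):
    """Greedily pack variable-length records into partitions whose total
    size stays close to a target number of bytes."""
    partitions, current, current_size = [], [], 0
    for length in record_lengths:
        if current and current_size + length > target_bytes:
            partitions.append(current)   # close the full partition
            current, current_size = [], 0
        current.append(length)
        current_size += length
    if current:
        partitions.append(current)       # flush the last partial partition
    return partitions

print(split_into_partitions([40, 40, 40, 40, 40], 100))
# [[40, 40], [40, 40], [40]]
```

Because record sizes vary, partitions land near, but not exactly at, the target size, which is why the description above hedges on the exact partition size.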
A column name to add to the DataFrame. The column will contain the input file name for each record, similar to the 'input_file_name()' function
This class provides methods for parsing the parameters set as Spark options.
This class provides methods for checking the Spark job options after they have been parsed.
This class holds parameters for the job.
String containing the path to the copybook in a given file system.
Sequence containing the paths to the copybooks.
String containing the actual content of the copybook. Either this, the copybookPath, or the multiCopybookPath parameter must be specified.
String containing the path to the Cobol file to be parsed.
If true, the input data file encoding is EBCDIC; otherwise it is ASCII
Specifies what code page to use for EBCDIC to ASCII/Unicode conversions
An optional custom code page conversion class provided by a user
A charset for ASCII data
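The effect of the code page choice can be demonstrated with Python's built-in EBCDIC codecs (e.g. 'cp037' for US/Canada, 'cp500' for International); the correct codec depends on the mainframe's code page:

```python
# "Hello" encoded in EBCDIC code page 037
raw = b"\xc8\x85\x93\x93\x96"
print(raw.decode("cp037"))  # Hello

# The same bytes are meaningless when interpreted as ASCII/Latin-1,
# which is why the code page option matters.
print(raw.decode("latin-1"))
```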
A format of floating-point numbers
A number of bytes to skip at the beginning of each record before parsing it according to the copybook
A number of bytes to skip at the end of each record
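The record start/end offsets simply carve the copybook-defined payload out of each raw record. A minimal sketch, with hypothetical names:

```python
def strip_record_offsets(record: bytes, start_offset: int, end_offset: int) -> bytes:
    """Drop extra bytes before and after the copybook-defined payload."""
    end = len(record) - end_offset
    return record[start_offset:end]

# Skip a 2-byte prefix and a 2-byte suffix around the payload
print(strip_record_offsets(b"XXpayloadYY", 2, 2))  # b'payload'
```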
VariableLengthParameters containing the specifications for the consumption of variable-length Cobol records.
A copybook usually has a root group struct element that acts like a row tag in XML. It can be retained in the Spark schema or collapsed
Specify if and how strings should be trimmed when parsed
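Trimming policies typically distinguish trimming nothing, only leading whitespace, only trailing whitespace, or both. A sketch with illustrative policy names (the library's actual policy values may differ):

```python
def trim(value: str, policy: str) -> str:
    """Apply a string trimming policy: 'none', 'left', 'right', or 'both'."""
    return {"none": value,
            "left": value.lstrip(),
            "right": value.rstrip(),
            "both": value.strip()}[policy]

# Fixed-width mainframe fields are usually right-padded with spaces,
# so trimming the right side is a common choice.
print(repr(trim("ABC   ", "right")))  # 'ABC'
```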
Parameters for reading multisegment mainframe files
A comment truncation policy
If true the parser will drop all FILLER fields, even GROUP FILLERS that have non-FILLER nested fields
A list of non-terminals (GROUPS) to combine and parse as primitive fields
If true, the fixed-length file reader won't check file size divisibility. Useful for debugging binary file / copybook mismatches.
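The divisibility check itself is simple: a fixed-length file must contain a whole number of records, so any remainder signals that the copybook's record length does not match the file. An illustrative sketch:

```python
def file_size_is_divisible(file_size: int, record_length: int) -> bool:
    """A fixed-length file should hold a whole number of records;
    a non-zero remainder usually means a copybook/file mismatch."""
    return file_size % record_length == 0

print(file_size_is_divisible(1000, 100))  # True
print(file_size_is_divisible(1001, 100))  # False
```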