CopybookParser

Type Members

type CopybookAST = Group
case class CopybookLine(level: Int, name: String, lineNumber: Int, modifiers: Map[String, String]) extends Product with Serializable
case class RecordBoundary(name: String, begin: Int, end: Int) extends Product with Serializable
case class StatementLine(lineNumber: Int, text: String) extends Product with Serializable
case class StatementTokens(lineNumber: Int, tokens: Array[String]) extends Product with Serializable

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def findCycleInAMap(m: Map[String, String]): List[String]

Finds a cycle in a parent-child relation map.
m
A mapping from field name to its parent field name.
returns
A list of fields in a cycle if there is one, an empty list otherwise
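As a rough illustration of what such a cycle check involves, the sketch below walks child-to-parent links until it either runs out of parents or revisits a field. The names `CycleSketch` and `findCycle` are hypothetical and not part of CopybookParser; the library's own implementation may differ, including in the order of the returned fields.

```scala
object CycleSketch {
  /** Hypothetical helper: returns the fields on the first cycle found, or Nil if there is none. */
  def findCycle(m: Map[String, String]): List[String] = {
    // Follows child -> parent links from `current`; `path` holds visited fields, newest first.
    @scala.annotation.tailrec
    def walk(current: String, path: List[String]): List[String] = {
      if (path.contains(current)) {
        // `current` was seen before: the cycle is `current` plus everything visited since then.
        current :: path.takeWhile(_ != current).reverse
      } else {
        m.get(current) match {
          case Some(parent) => walk(parent, current :: path)
          case None         => Nil // reached a field with no parent, so no cycle on this path
        }
      }
    }

    // Try every field as a starting point and keep the first non-empty result.
    m.keysIterator.map(k => walk(k, Nil)).find(_.nonEmpty).getOrElse(Nil)
  }
}
```

For a map such as `Map("A" -> "B", "B" -> "C", "C" -> "A")` the sketch returns the three fields on the cycle; for a map without a cycle it returns `Nil`.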
def getAllSegmentRedefines(schema: CopybookAST): List[Group]

Given the AST of a copybook, returns the list of all segment redefine GROUPs.
schema
An AST as a set of copybook records
returns
A list of segment redefine GROUPs
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def getParentToChildrenMap(schema: CopybookAST): Map[String, Seq[Group]]

Given the AST of a copybook, returns a map from segment redefines to their children.
schema
An AST as a set of copybook records
returns
A map from segment redefines to their children
def getRootSegmentAST(schema: CopybookAST): CopybookAST

Given the AST of a copybook, returns a new AST that does not contain child segments.
schema
An AST as a set of copybook records
returns
A new AST that does not contain child segments
def getRootSegmentIds(segmentIdRedefineMap: Map[String, String], fieldParentMap: Map[String, String]): List[String]

Returns a list of segment ID values for the root segment.
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def logName: String

Attributes
protected
Definition Classes
Logging
def logger: Logger

Attributes
protected
Definition Classes
Logging
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def parse(copyBookContents: String, dataEncoding: Encoding = EBCDIC, dropGroupFillers: Boolean = false, dropValueFillers: Boolean = true, segmentRedefines: Seq[String] = Nil, fieldParentMap: Map[String, String] = HashMap[String, String](), stringTrimmingPolicy: StringTrimmingPolicy = StringTrimmingPolicy.TrimBoth, commentPolicy: CommentPolicy = CommentPolicy(), strictSignOverpunch: Boolean = true, improvedNullDetection: Boolean = false, ebcdicCodePage: CodePage = new CodePageCommon, asciiCharset: Charset = StandardCharsets.US_ASCII, isUtf16BigEndian: Boolean = true, floatingPointFormat: FloatingPointFormat = FloatingPointFormat.IBM, nonTerminals: Seq[String] = Nil, occursHandlers: Map[String, Map[String, Int]] = Map(), debugFieldsPolicy: DebugFieldsPolicy = DebugFieldsPolicy.NoDebug): Copybook

Tokenizes the contents of a COBOL copybook and returns the AST.
copyBookContents
A string containing all lines of a copybook
dataEncoding
Encoding of the data file (either ASCII/EBCDIC). The encoding of the copybook is expected to be ASCII.
dropGroupFillers
Drop groups marked as fillers from the output AST
dropValueFillers
Drop primitive fields marked as fillers from the output AST
segmentRedefines
A list of redefined fields that correspond to various segments. This needs to be specified for automatically resolving segment redefines.
fieldParentMap
A mapping from each segment field to its parent field
stringTrimmingPolicy
Specifies if and how strings should be trimmed when parsed
commentPolicy
Specifies the policy for truncating comments inside a copybook
strictSignOverpunch
If true, sign overpunching is not allowed for unsigned numbers
improvedNullDetection
If true, string values that contain only zero bytes (0x0) will be considered null.
ebcdicCodePage
A code page for EBCDIC encoded data
asciiCharset
A charset for ASCII encoded data
isUtf16BigEndian
If true, UTF-16 strings are considered big-endian.
floatingPointFormat
A format of floating-point numbers (IBM/IEEE754)
nonTerminals
A list of non-terminals that should be extracted as strings
debugFieldsPolicy
Specifies whether debugging fields should be added and what they should contain (false, hex, raw).
returns
A Copybook holding the parsed AST, where each group is a record inside the copybook
def parseSimple(copyBookContents: String, dropGroupFillers: Boolean = false, dropValueFillers: Boolean = true, commentPolicy: CommentPolicy = CommentPolicy(), dropFillersFromAst: Boolean = false): Copybook

Tokenizes the contents of a COBOL copybook and returns the AST.
This method accepts only arguments that affect the structure of the output AST.
copyBookContents
A string containing all lines of a copybook
dropGroupFillers
Drop GROUPs marked as fillers from the output AST. The parameter name is retained for compatibility: fields are not actually removed from the AST unless dropFillersFromAst is set to true. When dropGroupFillers is true, FILLER fields retain their names and 'isFiller() = true' for FILLER GROUPs. When dropGroupFillers is false, FILLER fields are renamed to 'FILLER_1, FILLER_2, ...' to keep names unique in the output schema.
dropValueFillers
Drop primitive fields marked as fillers from the output AST. The parameter name is retained for compatibility: fields are not actually removed from the AST unless dropFillersFromAst is set to true. When dropValueFillers is true, FILLER fields retain their names and 'isFiller() = true' for FILLER primitive fields. When dropValueFillers is false, FILLER fields are renamed to 'FILLER_P1, FILLER_P2, ...' to keep names unique in the output schema.
commentPolicy
Specifies the policy for truncating comments inside a copybook
dropFillersFromAst
If true, fillers are dropped from the AST according to dropGroupFillers and dropValueFillers. If false, fillers remain in the AST but can still be recognized via the 'isFiller()' method.
returns
A Copybook holding the parsed AST, where each group is a record inside the copybook
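A minimal usage sketch of parseSimple follows, using only the parameters listed above. The import path `za.co.absa.cobrix.cobol.parser.CopybookParser` is an assumption about where the object lives and requires the library on the classpath; the copybook text and the name `ParseSimpleExample` are made up for illustration.

```scala
// Assumption: CopybookParser is in this package and the library is on the classpath.
import za.co.absa.cobrix.cobol.parser.CopybookParser

object ParseSimpleExample {
  // Illustrative copybook. The leading spaces keep the statements clear of the
  // columns that COBOL layouts usually reserve for sequence numbers and comments.
  val copybookContents: String =
    """        01  RECORD.
      |           05  ID      PIC 9(4).
      |           05  FILLER  PIC X(2).
      |           05  NAME    PIC X(10).
      |""".stripMargin

  // Fillers are physically removed only because dropFillersFromAst is true;
  // dropValueFillers selects primitive FILLER fields for removal.
  val copybook = CopybookParser.parseSimple(
    copybookContents,
    dropGroupFillers = false,
    dropValueFillers = true,
    dropFillersFromAst = true
  )
}
```

With `dropFillersFromAst = false` the same call would keep the FILLER field in the AST, and it would remain identifiable through `isFiller()` as described above.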
def parseTree(enc: Encoding, copyBookContents: String, dropGroupFillers: Boolean, dropValueFillers: Boolean, segmentRedefines: Seq[String], fieldParentMap: Map[String, String], stringTrimmingPolicy: StringTrimmingPolicy, commentPolicy: CommentPolicy, strictSignOverpunch: Boolean, improvedNullDetection: Boolean, ebcdicCodePage: CodePage, asciiCharset: Charset, isUtf16BigEndian: Boolean, floatingPointFormat: FloatingPointFormat, nonTerminals: Seq[String], occursHandlers: Map[String, Map[String, Int]], debugFieldsPolicy: DebugFieldsPolicy): Copybook

Tokenizes the contents of a COBOL copybook and returns the AST.
enc
Encoding of the data file (either ASCII/EBCDIC). The encoding of the copybook is expected to be ASCII.
copyBookContents
A string containing all lines of a copybook
dropGroupFillers
Drop groups marked as fillers from the output AST
dropValueFillers
Drop primitive fields marked as fillers from the output AST
segmentRedefines
A list of redefined fields that correspond to various segments. This needs to be specified for automatically resolving segment redefines.
fieldParentMap
A mapping from each segment field to its parent field
stringTrimmingPolicy
Specifies if and how strings should be trimmed when parsed
commentPolicy
Specifies the policy for truncating comments inside a copybook
strictSignOverpunch
If true, sign overpunching is not allowed for unsigned numbers
improvedNullDetection
If true, string values that contain only zero bytes (0x0) will be considered null.
ebcdicCodePage
A code page for EBCDIC encoded data
asciiCharset
A charset for ASCII encoded data
isUtf16BigEndian
If true, UTF-16 strings are considered big-endian.
floatingPointFormat
A format of floating-point numbers (IBM/IEEE754)
nonTerminals
A list of non-terminals that should be extracted as strings
debugFieldsPolicy
Specifies whether debugging fields should be added and what they should contain (false, hex, raw).
returns
A Copybook holding the parsed AST, where each group is a record inside the copybook

Annotations
@throws( classOf[SyntaxErrorException] )
def parseTree(copyBookContents: String, dropGroupFillers: Boolean = false, dropValueFillers: Boolean = true, segmentRedefines: Seq[String] = Nil, fieldParentMap: Map[String, String] = HashMap[String, String](), stringTrimmingPolicy: StringTrimmingPolicy = StringTrimmingPolicy.TrimBoth, commentPolicy: CommentPolicy = CommentPolicy(), strictSignOverpunch: Boolean = true, improvedNullDetection: Boolean = false, ebcdicCodePage: CodePage = new CodePageCommon, asciiCharset: Charset = StandardCharsets.US_ASCII, isUtf16BigEndian: Boolean = true, floatingPointFormat: FloatingPointFormat = FloatingPointFormat.IBM, nonTerminals: Seq[String] = Nil, occursHandlers: Map[String, Map[String, Int]] = Map(), debugFieldsPolicy: DebugFieldsPolicy = DebugFieldsPolicy.NoDebug): Copybook

Tokenizes the contents of a COBOL copybook and returns the AST.
copyBookContents
A string containing all lines of a copybook
dropGroupFillers
Drop groups marked as fillers from the output AST
dropValueFillers
Drop primitive fields marked as fillers from the output AST
segmentRedefines
A list of redefined fields that correspond to various segments. This needs to be specified for automatically resolving segment redefines.
fieldParentMap
A mapping from each segment field to its parent field
stringTrimmingPolicy
Specifies if and how strings should be trimmed when parsed
commentPolicy
Specifies the policy for truncating comments inside a copybook
strictSignOverpunch
If true, sign overpunching is not allowed for unsigned numbers
improvedNullDetection
If true, string values that contain only zero bytes (0x0) will be considered null.
ebcdicCodePage
A code page for EBCDIC encoded data
asciiCharset
A charset for ASCII encoded data
isUtf16BigEndian
If true, UTF-16 strings are considered big-endian.
floatingPointFormat
A format of floating-point numbers (IBM/IEEE754)
nonTerminals
A list of non-terminals that should be extracted as strings
debugFieldsPolicy
Specifies whether debugging fields should be added and what they should contain (false, hex, raw).
returns
A Copybook holding the parsed AST, where each group is a record inside the copybook
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
def transformIdentifier(identifier: String): String

Transforms a COBOL identifier so that it is usable in a Spark context. Removes characters an identifier cannot contain.
def transformIdentifierMap(identifierMap: Map[String, String]): Map[String, String]

Transforms all identifiers in a map so that they are usable in a Spark context. Removes characters an identifier cannot contain.
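The exact character rules are not listed on this page. To make the idea concrete, the sketch below assumes a simple rule set (drop ':' and turn '-' into '_') and applies it to both keys and values of a map. `IdentifierSketch`, `sanitize`, and `sanitizeMap` are hypothetical names, and the rules are an assumption rather than the documented behaviour of transformIdentifier or transformIdentifierMap.

```scala
object IdentifierSketch {
  // Assumed rules, for illustration only: remove ':' and replace '-' with '_'.
  def sanitize(identifier: String): String =
    identifier.replace(":", "").replace("-", "_")

  // Applies the same rule to every key and every value (also an assumption).
  def sanitizeMap(identifierMap: Map[String, String]): Map[String, String] =
    identifierMap.map { case (key, value) => sanitize(key) -> sanitize(value) }
}
```

Under these assumed rules, a COBOL name such as `CUSTOMER-ID` becomes `CUSTOMER_ID`.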
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Doc: package parser

object CopybookParser extends Logging
