Does the element have a default value?
Define this for schema components that have back-references to ref objects: group def to group ref, global element decl to element ref, type to element, base type to derived type.
This does not apply to format annotations, however. We don't back-point those to other format annotations that reference them.
This QName should contain the prefix from the element reference
The lexically enclosing schema component
Irrespective of whether the type of this element is immediate or primitive, or reached via a type reference, this is the typeDef of the type.
An integer which is the alignment of this term. This takes into account the representation, type, charset encoding and alignment-related properties.
Anything annotated must be able to construct the appropriate DFDLAnnotation object from the xml.
The DFDL annotations on the component, as objects that are subtypes of DFDLAnnotation.
The DPathElementInfo objects referenced within an IVC that calls dfdl:contentLength( thingy )
The DPathElementInfo objects referenced within an OVC that calls dfdl:contentLength( thingy )
The DPathElementInfo objects referenced within an IVC that calls dfdl:valueLength( thingy )
The DPathElementInfo objects referenced within an OVC that calls dfdl:valueLength( thingy )
check for overlap.
The NextElementResolver is used to determine what infoset event comes next, and "resolves" which is to say determines the ElementRuntimeData for that infoset event. This can be used to construct the initial infoset from a stream of XML events.
Set of elements referenced from an expression in the scope of this term.
Specific to certain function call contexts e.g., only elements referenced by dfdl:valueLength or dfdl:contentLength.
Separated by parser/unparser since parsers have to derive from dfdl:inputValueCalc, and must include discriminators and assert test expressions. Unparsers must derive from dfdl:outputValueCalc and exclude discriminators and asserts. Both must include setVariable/newVariableInstance, and property expressions are nearly the same. There are some unparser-specific properties that take runtime-valued expressions; dfdl:outputNewLine is one example.
Any element referenced from an expression in the scope of this term is in this set.
We need the nil values in raw form for diagnostic messages.
We need the nil values in cooked forms of two kinds: one for parsing, and one for unparsing.
The difference arises because, for unparsing, %NL; is treated specially: it must be computed based on dfdl:outputNewLine.
Means the infoset element could have varying length.
That means that if there is a specified-length box to fit it into, we have to check whether it is too big or too small for the box.
This applies to hexBinary, to simple types with text representation, and to any complex type unless everything in it is fixedLengthInfoset.
So for example, a complex type containing only fixed length binary integers is itself fixed length.
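The roll-up described above can be sketched as a small recursive check. This is only an illustration in Python, not the schema compiler's code: the `Simple` and `Complex` classes and the function name `is_variable_length_infoset` are hypothetical, and the rule for simple types is reduced to the two cases named in the text (hexBinary, or text representation).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Simple:
    type_name: str        # e.g. "int", "hexBinary", "string" (illustrative)
    representation: str   # "binary" or "text"


@dataclass
class Complex:
    children: List["Node"] = field(default_factory=list)


Node = Union[Simple, Complex]


def is_variable_length_infoset(node: Node) -> bool:
    """True if the infoset element could have varying length (hypothetical rule)."""
    if isinstance(node, Simple):
        # Assumption: hexBinary and text-represented simple types may vary in length.
        return node.type_name == "hexBinary" or node.representation == "text"
    # A complex type varies unless everything in it is fixed length.
    return any(is_variable_length_infoset(child) for child in node.children)
```

With these assumptions, a complex type holding only fixed-length binary integers is reported as fixed length, and adding one text child makes the whole type variable length.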
True if this term is known to have some text aspect. This can be the value, or it can be delimiters.
False only if this term cannot ever have text in it. Example: a sequence with no delimiters. Example: a binary int with no delimiters.
Note: this is not recursive; it does not roll up from child terms. TODO: it does have to deal with the prefix length situation. The type of the prefix may be textual.
Override in element base to take simple type or prefix length situations into account
Mandatory text alignment for delimiters
This is the compile info for this element. Since this might be an element ref, we optionally carry the compile info for the referenced element in that case.
Direct element children of a complex element.
Include both represented and non-represented elements.
For specified-length elements, computes the Ev which determines, when unparsing, the target length in units of bits. That target length can create the need to insert padding or fillByte, or to truncate, for simple types; or, for complex types, to insert an ElementUnused region.
Evs enable elimination of the proliferation of dual code paths for known vs. unknown byteOrder, encoding, length, etc. Just code as if it was runtime-valued using the Ev. The "right thing" happens if the information is constant.
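The idea of coding "as if it was runtime-valued" can be sketched as below. This is a hedged illustration only: the `Ev` class, `const`, `runtime`, and `target_length_in_bits` are hypothetical names invented for this sketch and do not reflect the real Ev API.

```python
class Ev:
    """Hypothetical evaluatable value: constant, or computed from runtime state."""

    def __init__(self, compute, is_constant=False):
        self._compute = compute
        self.is_constant = is_constant
        # A constant is evaluated once, up front, and reused afterwards.
        self._cached = compute({}) if is_constant else None

    def evaluate(self, state):
        if self.is_constant:
            return self._cached
        return self._compute(state)


def const(value):
    return Ev(lambda _state: value, is_constant=True)


def runtime(key):
    # Assumption: runtime state is modeled as a plain dict for this sketch.
    return Ev(lambda state: state[key])


def target_length_in_bits(length_in_bytes_ev, state):
    # One code path: the caller never branches on "known" vs. "unknown" length.
    return length_in_bytes_ev.evaluate(state) * 8
```

The caller of `target_length_in_bits` is identical in both cases; only the construction of the `Ev` differs.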
Here we establish an invariant, which is that every annotatable schema component definitely has an annotation object. It may have no properties on it, but it will be there. Hence, we can delegate various property-related attribute calculations to it.
To realize this, every concrete class must implement (or inherit) an implementation of emptyFormatFactory, which constructs an empty format annotation, and isMyFormatAnnotation which tests if an annotation is the corresponding kind.
Given that, formatAnnotation then either finds the right annotation, or constructs one, but our invariant is imposed. There *is* a formatAnnotation.
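A minimal sketch of this find-or-construct invariant follows. It is written in Python for illustration; the class names (`FormatAnnotation`, `ElementFormat`, `SequenceFormat`, `AnnotatedElement`) and the snake_case method names are assumptions that only mirror the emptyFormatFactory / isMyFormatAnnotation / formatAnnotation roles described above.

```python
class FormatAnnotation:
    def __init__(self, properties=None):
        self.properties = dict(properties or {})


class ElementFormat(FormatAnnotation):
    pass


class SequenceFormat(FormatAnnotation):
    pass


class AnnotatedElement:
    """Hypothetical annotatable component whose format annotation is an ElementFormat."""

    def __init__(self, annotations):
        self.annotations = list(annotations)

    def empty_format_factory(self):
        # Constructs an empty format annotation of the corresponding kind.
        return ElementFormat()

    def is_my_format_annotation(self, annotation):
        return isinstance(annotation, ElementFormat)

    def format_annotation(self):
        # Find the right annotation if one was written, otherwise construct one.
        for annotation in self.annotations:
            if self.is_my_format_annotation(annotation):
                return annotation
        return self.empty_format_factory()
```

Callers of `format_annotation()` never see a missing value; at worst they get an annotation with no properties on it.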
Checks the length, and whether there are delimiters, such that there is a concept of something that we can call 'empty'.
Empty is observable so long as one can have zero length followed by a separator, or zero length between an initiator and terminator (as required for empty by emptyValueDelimiterPolicy)
The enclosing component. This follows back-references from types to their elements, from globalElementDef to elementRefs, from simpleType defs to derived simpleType defs, and from global group defs to group refs.
Note: the enclosing component of a global element or global group referenced from an element ref or group ref is NOT the ref object, but the component that contains the ref object.
All schema components except the root have an enclosing element.
Does lookup of property using DFDL scoping rules, checking first non-default properties, then default property locations.
Use this when you want to know if a property is defined exactly on a component. This ignores any default properties or properties defined on element references. For example, if you want to know if a property was defined on a global element decl rather than an element reference to that decl.
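The two lookups described above (scoped lookup versus "defined exactly on this component") can be contrasted with a small sketch. It is illustrative only: the function names, the list-of-pairs representation of property sources, and the order of the non-default sources are assumptions, not the compiler's data structures.

```python
def find_property(name, non_default_sources, default_sources):
    """Scoped lookup: non-default locations first, then default locations.

    Each source is a (source_name, properties_dict) pair, listed in search order.
    Returns (value, source_name), or None if the property is not found.
    """
    for source_name, props in list(non_default_sources) + list(default_sources):
        if name in props:
            return props[name], source_name
    return None


def defined_exactly_on(name, component_props):
    """Local check: ignores defaults and anything written on a referring component."""
    return name in component_props
```

For example, with a property written only on an element reference, `find_property` finds it, while `defined_exactly_on` applied to the global element decl's own properties reports that it is absent there.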
For unit testing, we want to create GrammarMixin objects that are not schema components. So we can't use a self-type here. Instead we define this abstract grammarContext.
True if this term has an initiator, terminator, or separator that is either statically present or given by an expression. Such expressions are not allowed to evaluate to "": you can't turn off a delimiter by providing "" at runtime. The minimum length is 1 for these at runtime.
Override in SequenceTermBase to also check for separator.
Does this node have statically required instances.
No alignment properties that would explicitly create a need to align in a way that is not on a suitable boundary for a character.
We want to determine if we're in an unordered sequence at any point along our parents.
True if the length of the SimpleContent region or the ComplexContent region (see DFDL Spec section 9.2) is known to be greater than zero.
These content grammar regions are orthogonal to both nillable representations, and empty representations, and to all aspects of framing - alignment, skip, delimiters etc.
We require that there be a concept of empty if we're going to be able to default something and we are going to require that we can tell this statically. I.e., we're not going to defer this to runtime just in case the delimiters are being determined at runtime.
That is to say, if a delimiter is an expression, then we're assuming that means at runtime it will not evaluate to empty string (so you can specify the delimiter at runtime, but you cannot turn on/off the whole delimited format at runtime.)
True if padding will be inserted for this delimited element when unparsing.
Tells us if we have a specific length.
Keep in mind that 80 characters in length can be anywhere from 80 to 320 bytes depending on the character encoding. So fixed length doesn't mean a fixed number of bytes. It means fixed in dfdl:lengthUnits units, which could be characters, and those can be fixed or variable width.
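The 80-to-320 range can be checked directly with Python's built-in codecs. The helper name `encoded_length_in_bytes` is just for this example.

```python
def encoded_length_in_bytes(text, encoding):
    return len(text.encode(encoding))


ascii_text = "A" * 80
wide_text = "\U0001F600" * 80  # each of these code points takes 4 bytes in UTF-8

print(len(ascii_text), len(wide_text))                    # 80 80 (characters)
print(encoded_length_in_bytes(ascii_text, "utf-8"))       # 80
print(encoded_length_in_bytes(wide_text, "utf-8"))        # 320
print(encoded_length_in_bytes(ascii_text, "utf-32-be"))   # 320
```

Both strings are 80 characters long, yet their byte lengths differ by a factor of four, which is why a fixed length in characters says nothing definite about bytes.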
Whether the component is hidden.
Override this in the components that can hide - SequenceGroupRef and ChoiceGroupRef
Character encoding common attributes
Note that since encoding can be computed at runtime, we create values to tell us if the encoding is known or not so that we can decide things at compile time when possible.
Conservatively determines if this term is known to have the same bit order as the previous thing.
If uncertain, returns false.
True if we can statically determine that the start of this will be properly aligned by where the prior thing left us positioned. Hence we are guaranteed to be properly aligned.
True if this element itself consists only of text. No binary stuff like alignment or skips.
Not recursive into contained children.
Tells us if, for this element, we need to capture its content length at parse runtime, or we can ignore that.
Tells us if, for this element, we need to capture its content length at unparse runtime, or we can ignore that.
Tells us if, for this element, we need to capture its value length at parse runtime, or we can ignore that.
Tells us if, for this element, we need to capture its value length at unparse runtime, or we can ignore that.
Overridden as false for elements with dfdl:inputValueCalc property.
True if it is sensible to scan this data, e.g., with a regular expression. Requires that all children have the same encoding as enclosing groups and elements, and that there are no leading or trailing alignment regions or skips. We have to be able to determine that we will always be properly aligned for text.
Caveat: we only care that the encoding is the same if the term actually could have text (couldHaveText is an LV) as part of its representation. For example, a sequence with no initiator, terminator, nor separators can have any encoding at all, without disqualifying an element containing it from being scannable. There has to be text that would be part of the scan.
If the root element isScannable, and encodingErrorPolicy is 'replace', then we can use a lower-overhead I/O layer - basically we can use a java.io.InputStreamReader directly.
We are going to depend on the fact that if the encoding is going to be this X-DFDL-US-ASCII-7-BIT-PACKED thingy (7-bits wide code units, so aligned at 1 bit) that this encoding must be specified statically in the schema.
If an encoding is determined at runtime, then we will insist on it being 8-bit aligned code units.
Fixed length, or variable length with explicit length expression.
Only strings can be truncated, only if they are specified length, and only if truncateSpecifiedLengthString is 'yes'.
Note that specified length might mean fixed length or variable (but specified) length.
parsingPadChar is the pad character for parsing. unparsingPadChar is the pad character for unparsing. These are always carried as MaybeChar.
We need both, because in the same schema you can have textPadKind="padChar" but textTrimKind="none", so there can't be just one pad char object if it is to carry information about both whether or not a pad character is to be used, and the value.
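A small sketch of why two values are needed. It assumes, for illustration, that the parsing pad character is governed by textTrimKind and the unparsing pad character by textPadKind; the function `pad_chars` is hypothetical, and `Optional[str]` stands in for MaybeChar.

```python
from typing import Optional, Tuple


def pad_chars(text_pad_kind: str, text_trim_kind: str,
              pad_character: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (parsing_pad_char, unparsing_pad_char); None means 'no pad char in use'."""
    # Assumption: trimming applies when parsing, padding applies when unparsing.
    parsing = pad_character if text_trim_kind == "padChar" else None
    unparsing = pad_character if text_pad_kind == "padChar" else None
    return parsing, unparsing
```

With textPadKind="padChar" and textTrimKind="none", the result under these assumptions is `(None, " ")` for a space pad character: one side uses the character and the other does not, so a single object could not carry both facts.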
When the encoding is known, this tells us the mandatory alignment required. This is always 1 or 8.
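As a rough sketch of the "1 or 8" rule: the function names below are hypothetical, and treating X-DFDL-US-ASCII-7-BIT-PACKED as the only 1-bit-aligned encoding is an assumption made for the example.

```python
BIT_ALIGNED_ENCODINGS = {"X-DFDL-US-ASCII-7-BIT-PACKED"}  # assumption: illustrative set


def known_encoding_alignment_in_bits(encoding_name: str) -> int:
    """Mandatory alignment for a statically known encoding: 1 or 8 bits."""
    return 1 if encoding_name.upper() in BIT_ALIGNED_ENCODINGS else 8


def alignment_in_bits(static_encoding_name=None) -> int:
    # If the encoding is only known at runtime (None here), insist on 8-bit alignment.
    if static_encoding_name is None:
        return 8
    return known_encoding_alignment_in_bits(static_encoding_name)
```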
Annotations can contain expressions, so we need to be able to compile them.
We need our own instance so that the expression compiler has this schema component as its context.
Compute minLength and maxLength together to share error-checking and case dispatch that would otherwise have to be repeated.
Use when we might or might not need the outputNewLine property
Used for padding, which might be for specified-length, or might be for delimited.
Compute minLength and maxLength together to share error-checking and case dispatch that would otherwise have to be repeated.
Mandatory text alignment, or MTA.
MTA can only apply to things with encodings. No encoding, no MTA.
In addition, it has to be textual data. Just because there's an encoding in the property environment shouldn't get you an MTA region. It has to be textual.
Means the specified length must, necessarily, be big enough to hold the representation so long as the value in the infoset is legal for the type.
This does not include numeric range checking. So for example if you have an xs:unsignedInt but length is 3 bits, this will be true even though an integer value of greater than 7 cannot fit.
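The arithmetic behind the 3-bit example: an unsigned field of n bits holds values from 0 to 2^n - 1. The helper names are for this example only.

```python
def max_unsigned_value(length_in_bits: int) -> int:
    return (1 << length_in_bits) - 1


def fits_unsigned(value: int, length_in_bits: int) -> bool:
    return 0 <= value <= max_unsigned_value(length_in_bits)


print(max_unsigned_value(3))   # 7
print(fits_unsigned(7, 3))     # True
print(fits_unsigned(8, 3))     # False
```

So 8 is a legal xs:unsignedInt value that cannot be represented in 3 bits; that is a numeric range problem, which this analysis deliberately leaves out.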
Another way to think of this is that legal infoset values will have fixed length representations.
This is a conservative analysis, meaning if true the property definitely holds, but if false it may mean we just couldn't tell if it holds or not.
If this is true, then we never need to check how many bits were written when unparsing, because we know a legal value has to fit. If the value is illegal then we'll get an unparse error anyway.
If this is false, then it's possible that the value, even a legal value, might not fit if the length is specified. We're unable to prove that all legal values WILL fit.
A critical case is that fixed-length binary integers should always return true here, so that we're not doing excess length checks on them or computing their value length unnecessarily.
Namespace scope for resolving QNames.
We insist that the prefix "xsi" is properly defined for use in xsi:nil attributes, which is how we represent nilled elements when we convert to XML.
nearestEnclosingSequence
An attribute that looks upward to the surrounding context of the schema, and not just lexically surrounding context. It needs to see what declarations will physically surround the place. This is the dynamic scope, not just the lexical scope. So, a named global type still has to be able to ask what sequence is surrounding the element that references the global type.
This is why we have to have the GlobalXYZDefFactory stuff. Because this kind of back pointer (contextual sensitivity) prevents sharing.
Used as factory for the XML Node with the right namespace and prefix etc.
Given "element" it creates <dfdl:element /> with the namespace definitions based on this schema component's corresponding XSD construct.
Makes sure to inherit the scope so we have all the namespace bindings.
None for complex types, Some(primType) for simple types.
Combine our statements with those of what we reference. Elements reference types, ElementRefs reference elements, etc.
The order here is important. The statements from type come first, then from declaration, then from reference.
Changed to use findProperty, and to resolve the namespace properly.
We lookup a property like escapeSchemeRef, and that actual property binding can be local, in scope, by way of a format reference, etc.
Its value is a QName, and the definition of the prefix is from the location where we found the property, and NOT where we consume the property.
Hence, we resolve w.r.t. the location that provided the property.
The point of findProperty vs. getProperty is just that the former returns both the value, and the object that contained it. That object is what we resolve QNames with respect to.
Note: Same is needed for properties that have expressions as their values. E.g., consider "{ ../foo:bar/.. }". That foo prefix must be resolved relative to the object where this property was written, not where it is evaluated. (JIRA issue DFDL-77)
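The point about resolving a QName against the location that provided the property can be sketched as below. The data layout (dicts with "properties" and "namespaces" keys), the example URIs, and the function names are all hypothetical; only the behavior, returning the value together with the object that contained it, follows the description above.

```python
def find_property(name, locations):
    """Return (value, providing_location) for the first location that defines name."""
    for location in locations:
        if name in location["properties"]:
            return location["properties"][name], location
    raise KeyError(name)


def resolve_qname(qname, namespaces):
    """Split 'prefix:local' and map the prefix through the given namespace bindings."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        return namespaces[prefix], local
    return namespaces.get("", None), qname


# The consuming element and the providing format bind "ex" differently.
consumer = {"properties": {}, "namespaces": {"ex": "urn:consumer"}}
provider = {"properties": {"escapeSchemeRef": "ex:scheme"},
            "namespaces": {"ex": "urn:provider"}}

value, found_at = find_property("escapeSchemeRef", [consumer, provider])
resolved = resolve_qname(value, found_at["namespaces"])
print(resolved)  # ('urn:provider', 'scheme')
```

Resolving against `consumer["namespaces"]` instead would silently pick the wrong namespace, which is the mistake the findProperty approach avoids.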
parsingPadChar is the pad character for parsing. unparsingPadChar is the pad character for unparsing. These are always carried as MaybeChar.
We need both, because in the same schema you can have textPadKind="padChar" but textTrimKind="none", so there can't be just one pad char object if it is to carry information about both whether or not a pad character is to be used, and the value.
path is used in diagnostic messages and code debug messages; hence, it is very important that it be very dependable.
Returns a tuple, where the first item in the tuple is the list of sibling terms that could appear before this. The second item in the tuple is a One(parent) if all siblings are optional or this element has no prior siblings
Use when production has no guard, but you want to name the production anyway (for debug visibility perhaps).
Use when production has a guard predicate
Convenience method to make gathering up all elements referenced in expressions easier.
For property combining only. E.g., doesn't refer from an element to its complex type because we don't combine properties with that in DFDL v1.0. (I consider that a language design bug in DFDL v1.0, but that is the way it's defined.)
Returns the property resolver for this component.
All non-terms get runtimeData from this definition. All Terms (which are elements and model groups) override this.
The Term class has a generic termRuntimeData => TermRuntimeData function (useful since all Terms share things like having charset encoding). The Element classes all inherit an elementRuntimeData => ElementRuntimeData, and the model groups all have modelGroupRuntimeData => ModelGroupRuntimeData.
There is also VariableRuntimeData and SchemaSetRuntimeData.
Includes instances. I.e., a global element will appear inside an element ref, a global group inside a group ref, a global type inside an element or, for derived simple types, inside another simple type, etc.
Used in diagnostic messages and code debug messages
Separator combinators: detect cases where no separator applies. Note that repeating elements are excluded because they have to manage their own separatedForArrayPosition inside the repetition.
We add fill to complex types of specified length so long as length units are bytes/bits. If the length units are characters, then "pad" puts the characters on.
We also add it to text elements of variable specified length, again unless length units are characters.
Quite tricky when we add padding or fill
For complex types, there has to be a length defined. That's it.
For simple types, we need the fill region processors to detect excess length
Check for excess length if it's variable length and we cannot truncate it.
Elements only e.g., /foo/ex:bar
Roll up from the bottom. This is abstract interpretation. The top (aka conflicting encodings) is "mixed". The bottom is "noText" (combines with anything). The values are encoding names, or "runtime" for expressions.
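The lattice described above can be sketched with a join operation folded over a term and its children. The names `join` and `summary_encoding` are hypothetical. One assumption to note: two "runtime" values join to "runtime" here (they are the same lattice value), and a consumer still has to treat "runtime" as "not statically known".

```python
from functools import reduce

NO_TEXT = "noText"   # bottom: combines with anything
MIXED = "mixed"      # top: conflicting encodings


def join(a: str, b: str) -> str:
    if a == NO_TEXT:
        return b
    if b == NO_TEXT:
        return a
    if a == b:
        return a
    return MIXED


def summary_encoding(own: str, child_summaries) -> str:
    """Roll up this term's own encoding with the summaries of its children."""
    return reduce(join, child_summaries, own)


print(summary_encoding(NO_TEXT, ["utf-8", NO_TEXT, "utf-8"]))  # utf-8
print(summary_encoding("utf-8", ["ascii"]))                    # mixed
print(summary_encoding(NO_TEXT, []))                           # noText
```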
By doing expression analysis we could do a better job here and determine when things that use expressions to get the encoding are all going to get the same expression value. For now, if it is an expression then we lose.
This is the root, or basic target namespace. Every schema component gets its target namespace from its xmlSchemaDocument.
Returns the Term corresponding to this component.
The termChildren are the children that are Terms, i.e., derived from the Term base class. This is to make it clear we're not talking about the XML structures inside the XML parent (which might include annotations, etc.).
For elements this is Nil for simple types, a single model group for complex types. For model groups there can be more children.
Used in diagnostic messages; hence, valueOrElse to avoid problems when this can't get a value due to an error.
Any element referenced from an expression in the scope of this term is in this set.
Mandatory text alignment, or MTA.
MTA can only apply to things with encodings. No encoding, no MTA.
In addition, it has to be textual data. Just because there's an encoding in the property environment shouldn't get you an MTA region. It has to be textual.
Shared by all forms of elements, local or global or element reference.