DataInputStream

This trait defines the low level API called by Daffodil's Parsers.

It has features to support

backtracking
regex pattern matching using Java Pattern regexs (for lengthKind pattern and pattern asserts)
character-by-character access as needed by our DFA delimiter/escaping
very efficient access to small binary data (64-bits or smaller)
alignment and skipping
encodingErrorPolicy 'error' and 'replace'
convenient use of zero-based values because java/scala APIs for I/O are all zero-based
convenient use of 1-based values because DFDL is 1-based, so debug/trace and such all want to be 1-based values.

A goal is that this API does not allocate objects as I/O operations are performed unless boxed objects are being returned. For example getSignedLong(...) should not allocate anything per call; however, getSignedBigInt(...) does, because a BigInt is a heap-allocated object.

Internal buffers and such may be dropped/resized/reallocated as needed during method calls. The point is not that you never allocate. It's that the per-I/O operation overhead does not require object allocation for every data-accessing method call.

Similarly, text data can be retrieved into a char buffer, and the char buffer can provide a limit on size (available capacity of the char buffer) in characters. The text can be examined in the char buffer, or a string can be created from the char buffer's contents when needed.

This API is very stateful, and not-thread-safe i.e., each thread must have its own object. Some of this is inherent in this API style, and some is inherited from the underlying objects this API uses (such as CharsetDecoder).

This API is also intended to support some very highly optimized implementations. For example, if the schemas is all text, and the encoding is known to be iso-8859-1, then there is no notion of a decode error, and every byte value, extended to a Char value, *is* the Unicode codepoint. No decoder needs to be used in this case and this API becomes a quite thin layer on top of a java.io.BufferedStream.

Terminology:

Available Data - this is the data that is between the current bit position, and some limit. The limit can either be set (via setBitLimit calls), or it can be limited by tunable values, or implementation-specific upper limits, or it can simply be the end of the data stream.

Different kinds of DataInputStreams can have different limits. For example, a File-based DataInputStream may have no limit on the forward speculation distance, because the file can be randomly accessed if necessary. Contrasting that with a data stream that is directly connected to a network socket may have a upper limit on the amount of data that it is willing to buffer.

None of this is a commitment that this API will in fact have multiple specialized implementations. It's just a possibility for the future.

Implementation Note: It is the implementation of this interface which implements the Bucket Algorithm as described on the Daffodil Wiki. All of that bucket stuff is beneath this API.

In general, this API tries to return a value rather than throw exceptions whenever the behavior may be very common. This leaves it up to the caller to decide whether or not to throw an exception, and avoids the overhead of try-catch blocks. The exception to this rule are the methods that involve character decoding for textual data. These methods may throw CharacterCodingException when the encoding error policy is 'error'.

Linear Supertypes

AnyRef, Any

Type Members

trait CharIterator extends Iterator[Char]

An Iterator[Char] with additional peek and peek2 methods.
trait Mark extends Poolable

Backtracking
Backtracking
The mark and reset system is more sophisticated than that of java's BufferedInputStream, which allows only a single outstanding mark.
This trait enables a stack of mark values to be created and reset, respecting stack ordering, that is, they are nested locations and must be created and released in an order consistent with stack ordering.
The mark contains additional state beyond just the position (which is maintained at bit granularity). All the other state aspects (decoder, bit order, etc.) are also maintained by a mark.
Implementation note: The oldest/deepest mark is expected to be built on top of a java BufferedInputStream mark.
Use of mark/reset should eliminate any need for random-access setters of the bit position.
type MarkPos = Long

For mini-marks that just mark/reset the position
case class NotEnoughDataException(nBits: Long) extends Exception with ThinThrowable with Product with Serializable

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
object MarkPos
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Docs: trait DataInputStream | package io

object DataInputStream

Type Members

trait CharIterator extends Iterator[Char]

trait Mark extends Poolable

type MarkPos = Long

case class NotEnoughDataException(nBits: Long) extends Exception with ThinThrowable with Product with Serializable

Value Members

final def !=(arg0: Any): Boolean

final def ##(): Int

final def ==(arg0: Any): Boolean

object MarkPos

final def asInstanceOf[T0]: T0

def clone(): AnyRef

final def eq(arg0: AnyRef): Boolean

def equals(arg0: Any): Boolean

def finalize(): Unit

final def getClass(): Class[_]

def hashCode(): Int

final def isInstanceOf[T0]: Boolean

final def ne(arg0: AnyRef): Boolean

final def notify(): Unit

final def notifyAll(): Unit

final def synchronized[T0](arg0: ⇒ T0): T0

def toString(): String

final def wait(): Unit

final def wait(arg0: Long, arg1: Int): Unit

final def wait(arg0: Long): Unit

Inherited from AnyRef

Inherited from Any

Ungrouped