com.fasterxml.aalto.async
Class AsyncByteScanner

java.lang.Object
  extended by com.fasterxml.aalto.in.XmlScanner
      extended by com.fasterxml.aalto.in.ByteBasedScanner
          extended by com.fasterxml.aalto.async.AsyncByteScanner
All Implemented Interfaces:
XmlConsts, NamespaceContext, XMLStreamConstants
Direct Known Subclasses:
AsyncUtfScanner

public abstract class AsyncByteScanner
extends ByteBasedScanner

This is the base class for asynchronous (non-blocking) XML scanners. Because the asynchronous approach is inherently complex, a character-based variant does not make much sense, so only byte-based input is supported.


Field Summary
protected  int _currQuad
          Bytes parsed for the current, incomplete, quad
protected  int _currQuadBytes
          Number of bytes pending/buffered, stored in _currQuad
protected  byte[] _inputBuffer
          This buffer is actually provided by caller
static int EVENT_INCOMPLETE
          As per javadocs of XMLStreamConstants, event codes 0 through 256 (inclusive?) are reserved by the Stax specs, so we'll use the next available code.
protected  boolean mElemAllNsBound
           
protected  boolean mElemAttrCount
           
protected  PName mElemAttrName
           
protected  int mElemAttrPtr
           
protected  byte mElemAttrQuote
           
protected  int mNextEvent
          Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete.
protected  int mOrigBufferLen
          In addition to current buffer pointer, and end pointer, we will also need to know number of bytes originally contained.
protected  int mPendingInput
          There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs.
protected  int mQuadCount
          Number of complete quads parsed for current name (quads themselves are stored in ByteBasedScanner.mQuadBuffer).
protected  int mState
          In addition to the event type, further state information is needed to resume parsing
protected  int mSurroundingEvent
          For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
 
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_inputEnd, _inputPtr, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x, mCharTypes, mPastBytes, mQuadBuffer, mRowStartOffset, mSymbols, mTmpChar
 
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _publicId, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, MAX_UNICODE_CHAR, TOKEN_EOI
 
Fields inherited from interface com.fasterxml.aalto.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
 
Fields inherited from interface javax.xml.stream.XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
 
Constructor Summary
AsyncByteScanner(ReaderConfig cfg)
           
 
Method Summary
protected  void _closeSource()
          Since the async scanner has no access to whatever supplies the content, there is no input source in the same sense as with a blocking scanner, and there is nothing to close.
 void addInput(byte[] buf, int start, int len)
           
protected abstract  PName addPName(int hash, int[] quads, int qlen, int lastQuadBytes)
           
protected  int decodeCharForError(byte b)
          Method called by methods when encountering a byte that can not be part of a valid character in the current context.
protected  void finishCData()
           
protected abstract  void finishCharacters()
           
protected abstract  int finishCharactersCoalescing()
           
protected  void finishComment()
           
protected  void finishDTD(boolean copyContents)
           
protected  void finishPI()
           
protected  void finishSpace()
           
protected abstract  boolean handleAttrValue()
           
protected  int handleCharacterEntity()
           
protected  int handleEntity()
           
protected  int handleEntityStart(int surroundingEvent, byte b)
           
protected  int handleGeneralEntity(PName entityName)
           
protected abstract  boolean handleNsDecl()
           
protected  boolean handlePartialCR()
          Method called when there is a pending \r (from past buffer), and we need to see whether it is followed by \n (that is, whether it is part of a CR+LF linefeed).
protected  int handleStartElement()
           
protected  int handleStartElementStart(byte b)
          Method called when '<' and (what appears to be) a name start character have been seen.
 boolean hasInput()
           
protected  boolean loadMore()
           
 int nextFromProlog(boolean isProlog)
           
 int nextFromTree()
           
protected abstract  int parseCommentContents()
           
protected  PName parseEntityName()
           
protected  PName parseNewEntityName(byte b)
           
protected  PName parseNewName(byte b)
           
protected abstract  int parsePIData()
           
protected  PName parsePName()
          This method can (for now?) be shared between all Ascii-based encodings, since it only does coarse validity checking -- real checks are done in different method.
protected  void skipCData()
           
protected abstract  boolean skipCharacters()
           
protected  void skipComment()
           
protected  void skipPI()
           
protected  void skipSpace()
           
protected abstract  int startCharacters(byte b)
          Method called to initialize state for CHARACTERS event, after just a single byte has been seen.
protected abstract  int startCharactersPending()
          This method gets called, if the first character of a CHARACTERS event could not be fully read (multi-byte, split over buffer boundary).
protected  int throwInternal()
           
 String toString()
           
 
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_releaseBuffers, addUtfPName, getCurrentColumnNr, getCurrentLineNr, getCurrentLocation, markLF, markLF, reportInvalidInitial, reportInvalidOther
 
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, findAttrIndex, findOrCreateBinding, finishToken, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getDepth, getDtdPublicId, getDtdSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologUnexpChar, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, skipCoalescedText, skipToken, throwInvalidSpace, throwInvalidXmlChar, throwNullChar, throwUnexpectedChar
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

EVENT_INCOMPLETE

public static final int EVENT_INCOMPLETE
As per javadocs of XMLStreamConstants, event codes 0 through 256 (inclusive?) are reserved by the Stax specs, so we'll use the next available code.

See Also:
Constant Field Values
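
The role of this event code can be illustrated with a toy scanner that reports an incomplete event whenever it runs out of input. This is only a sketch: ToyAsyncScanner, EVENT_LINE and lastLine are hypothetical names, the value 257 is an assumption (the first code past the reserved range; the real constant should be read from the library), and the check in addInput for unread input is likewise assumed rather than taken from this class.

```java
// Hypothetical stand-in, not Aalto code: splits ASCII input into lines and
// reports EVENT_INCOMPLETE when the current buffer ends before a line does.
final class ToyAsyncScanner {
    static final int EVENT_LINE = 1;          // hypothetical event code for this sketch
    static final int EVENT_INCOMPLETE = 257;  // assumed value: first code past the reserved range

    private byte[] buf = new byte[0];
    private int ptr, end;
    private final StringBuilder partial = new StringBuilder(); // survives across buffers
    String lastLine;

    boolean hasInput() {
        return ptr < end;
    }

    // Same parameter shape as addInput(byte[] buf, int start, int len).
    void addInput(byte[] b, int start, int len) {
        if (hasInput()) { // assumption: a new block is only accepted once the old one is consumed
            throw new IllegalStateException("previous input not yet consumed");
        }
        buf = b;
        ptr = start;
        end = start + len;
    }

    int next() {
        while (ptr < end) {
            char c = (char) (buf[ptr++] & 0xFF); // ASCII only in this sketch
            if (c == '\n') {
                lastLine = partial.toString();
                partial.setLength(0);
                return EVENT_LINE;
            }
            partial.append(c);
        }
        return EVENT_INCOMPLETE; // caller should feed more input and call next() again
    }
}
```

A caller would loop on next(), feeding another block each time EVENT_INCOMPLETE comes back, instead of blocking inside the scanner.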

_inputBuffer

protected byte[] _inputBuffer
This buffer is actually provided by caller


mOrigBufferLen

protected int mOrigBufferLen
In addition to current buffer pointer, and end pointer, we will also need to know number of bytes originally contained. This is needed to correctly update location information when the block has been completed.
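
Why the original length is needed can be sketched as follows. Once a block has been consumed, the pointer and end pointer no longer reveal where the block started, so the total of past bytes can only be advanced by the length recorded when the block was fed. BlockLocation and its members are hypothetical names, and the bookkeeping shown is an assumption about how such tracking could work, not a copy of this class.

```java
// Hypothetical sketch, not Aalto code: tracks the absolute byte offset
// across caller-provided blocks that may start at any array index.
final class BlockLocation {
    long pastBytes;      // bytes contained in all previously completed blocks
    int origBufferLen;   // length of the current block as originally fed
    int inputPtr;        // current read position within the caller's array
    int inputEnd;        // end of valid data within the caller's array

    // Called when the caller hands over a new block.
    void feed(int start, int len) {
        pastBytes += origBufferLen; // the previous block is complete: account for all of it
        inputPtr = start;
        inputEnd = start + len;
        origBufferLen = len;
    }

    // Absolute offset of the read position, counted from the start of the stream.
    long currentOffset() {
        int unread = inputEnd - inputPtr;
        return pastBytes + origBufferLen - unread;
    }
}
```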


mNextEvent

protected int mNextEvent
Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete. Type of that event is stored here.


mState

protected int mState
In addition to the event type, further state information is needed to resume parsing within an event once more input becomes available.
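
As an illustration of why the event type alone is not enough, consider matching the comment end marker "-->" when it may be split across input blocks. The matcher below keeps a small state value between calls. CommentEndMatcher and its STATE_ constants are hypothetical and the state numbering is an assumption; the sketch also does not report the XML error for "--" inside a comment.

```java
// Hypothetical sketch, not Aalto code: finds "-->" even when the marker
// is split across two or three separately fed buffers.
final class CommentEndMatcher {
    static final int STATE_DEFAULT = 0;  // no part of the marker seen
    static final int STATE_HYPHEN = 1;   // "-" seen
    static final int STATE_HYPHEN2 = 2;  // "--" seen

    int state = STATE_DEFAULT;

    // Returns the index just past "-->" in buf, or -1 if the marker
    // has not been completed within this buffer.
    int feed(byte[] buf, int start, int len) {
        for (int i = start, end = start + len; i < end; ++i) {
            byte b = buf[i];
            switch (state) {
            case STATE_DEFAULT:
                if (b == '-') state = STATE_HYPHEN;
                break;
            case STATE_HYPHEN:
                state = (b == '-') ? STATE_HYPHEN2 : STATE_DEFAULT;
                break;
            default: // STATE_HYPHEN2
                if (b == '>') {
                    state = STATE_DEFAULT;
                    return i + 1;
                }
                if (b != '-') state = STATE_DEFAULT; // extra hyphens keep "--" pending
                break;
            }
        }
        return -1; // state is retained for the next buffer
    }
}
```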


mSurroundingEvent

protected int mSurroundingEvent
For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.


mPendingInput

protected int mPendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. Since they can be split across input buffer boundaries, first byte(s) may need to be temporarily stored.

If so, this int will store byte(s), in little-endian format (that is, first pending byte is at 0x000000FF, second [if any] at 0x0000FF00, and third at 0x00FF0000). This can be (and is) used to figure out actual number of bytes pending, for multi-byte (UTF-8) character decoding.

Note: a value of 0 is taken to mean that there is no pending data. Thus, if a 0 byte ever needs to be stored as pending, it has to be masked.
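
The packing described above can be sketched as follows. PendingBytes and its methods are hypothetical helpers, not part of this class, and the sketch assumes that only non-zero bytes are stored, in line with the note about the value 0.

```java
// Hypothetical helper, not Aalto code: packs up to three pending bytes into
// one int, with the first pending byte in the lowest 8 bits.
final class PendingBytes {
    // Number of bytes currently packed (0 to 3); relies on stored bytes being non-zero.
    static int count(int pending) {
        if (pending == 0) return 0;
        if ((pending >>> 8) == 0) return 1;
        if ((pending >>> 16) == 0) return 2;
        return 3;
    }

    // Adds 'b' as the next pending byte, above the ones already stored.
    static int append(int pending, byte b) {
        return pending | ((b & 0xFF) << (8 * count(pending)));
    }

    // Reads back the pending byte at position 'index' (0 is the first byte stored).
    static byte byteAt(int pending, int index) {
        return (byte) (pending >>> (8 * index));
    }
}
```

For example, storing the first two bytes of a three-byte UTF-8 character (0xE2, then 0x82) yields 0x000082E2, from which count reports two pending bytes.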


mQuadCount

protected int mQuadCount
Number of complete quads parsed for current name (quads themselves are stored in ByteBasedScanner.mQuadBuffer).


_currQuad

protected int _currQuad
Bytes parsed for the current, incomplete, quad


_currQuadBytes

protected int _currQuadBytes
Number of bytes pending/buffered, stored in _currQuad
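
How the three quad-related fields could cooperate is sketched below: name bytes are collected four at a time, complete quads go to a buffer, and the incomplete quad plus its byte count are kept so that parsing can resume in the next input block. QuadAccumulator is a hypothetical class, and the byte order within a quad (earlier bytes in higher positions) is an assumption of this sketch.

```java
// Hypothetical sketch, not Aalto code: accumulates name bytes into 32-bit quads.
final class QuadAccumulator {
    int[] quadBuffer = new int[8]; // complete quads (stands in for mQuadBuffer)
    int quadCount;                 // number of complete quads (compare mQuadCount)
    int currQuad;                  // bytes of the incomplete quad (compare _currQuad)
    int currQuadBytes;             // bytes held in currQuad (compare _currQuadBytes)

    void addByte(byte b) {
        // Assumed order: shift earlier bytes up, place the new byte lowest.
        currQuad = (currQuad << 8) | (b & 0xFF);
        if (++currQuadBytes == 4) {
            if (quadCount == quadBuffer.length) {
                quadBuffer = java.util.Arrays.copyOf(quadBuffer, quadCount * 2);
            }
            quadBuffer[quadCount++] = currQuad;
            currQuad = 0;
            currQuadBytes = 0;
        }
    }
}
```

Because all state lives in fields, addByte can be called with bytes from one buffer, then continue with bytes from the next, without losing a partially read name.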


mElemAllNsBound

protected boolean mElemAllNsBound

mElemAttrCount

protected boolean mElemAttrCount

mElemAttrQuote

protected byte mElemAttrQuote

mElemAttrName

protected PName mElemAttrName

mElemAttrPtr

protected int mElemAttrPtr
Constructor Detail

AsyncByteScanner

public AsyncByteScanner(ReaderConfig cfg)
Method Detail

toString

public String toString()
Overrides:
toString in class Object

hasInput

public final boolean hasInput()

addInput

public void addInput(byte[] buf,
                     int start,
                     int len)
              throws XMLStreamException
Throws:
XMLStreamException

_closeSource

protected void _closeSource()
                     throws IOException
Since the async scanner has no access to whatever supplies the content, there is no input source in the same sense as with a blocking scanner, and there is nothing to close.

Specified by:
_closeSource in class ByteBasedScanner
Throws:
IOException

nextFromProlog

public final int nextFromProlog(boolean isProlog)
                         throws XMLStreamException
Specified by:
nextFromProlog in class XmlScanner
Throws:
XMLStreamException

nextFromTree

public int nextFromTree()
                 throws XMLStreamException
Specified by:
nextFromTree in class XmlScanner
Throws:
XMLStreamException

parseCommentContents

protected abstract int parseCommentContents()
                                     throws XMLStreamException
Throws:
XMLStreamException

parsePIData

protected abstract int parsePIData()
                            throws XMLStreamException
Throws:
XMLStreamException

startCharacters

protected abstract int startCharacters(byte b)
                                throws XMLStreamException
Method called to initialize state for CHARACTERS event, after just a single byte has been seen. What needs to be done next depends on whether coalescing mode is set or not: if it is not set, just a single character needs to be decoded, after which current event will be incomplete, but defined as CHARACTERS. In coalescing mode, the whole content must be read before current event can be defined. The reason for difference is that when XMLStreamReader.next() returns, no blocking can occur when calling other methods.

Returns:
Event type detected: either CHARACTERS, if at least one full character was decoded (and can be returned), or EVENT_INCOMPLETE if not (part of a multi-byte character was split across an input buffer boundary)
Throws:
XMLStreamException
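
The decision between the two return values can be sketched for UTF-8 input: the first byte tells how long the character is, and the event is only CHARACTERS if that many bytes are available in the current block. FirstCharProbe is a hypothetical class, the value 257 for EVENT_INCOMPLETE is an assumption, and the sketch performs no decoding and only minimal validation.

```java
// Hypothetical sketch, not Aalto code: checks whether the first character of a
// text segment is fully contained in the current input block.
final class FirstCharProbe {
    static final int EVENT_INCOMPLETE = 257; // assumed value for this sketch

    // Length in bytes of a UTF-8 sequence, judged from its first byte.
    static int utf8Length(byte lead) {
        int c = lead & 0xFF;
        if (c < 0x80) return 1;
        if ((c & 0xE0) == 0xC0) return 2;
        if ((c & 0xF0) == 0xE0) return 3;
        if ((c & 0xF8) == 0xF0) return 4;
        throw new IllegalArgumentException("not a valid UTF-8 start byte: 0x" + Integer.toHexString(c));
    }

    // Requires ptr < end, that is, at least one byte has been seen.
    static int probe(byte[] buf, int ptr, int end) {
        int needed = utf8Length(buf[ptr]);
        return (end - ptr >= needed)
                ? javax.xml.stream.XMLStreamConstants.CHARACTERS
                : EVENT_INCOMPLETE;
    }
}
```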

startCharactersPending

protected abstract int startCharactersPending()
                                       throws XMLStreamException
This method gets called, if the first character of a CHARACTERS event could not be fully read (multi-byte, split over buffer boundary). If so, there is some pending data to be handled.

Throws:
XMLStreamException

finishCharactersCoalescing

protected abstract int finishCharactersCoalescing()
                                           throws XMLStreamException
Throws:
XMLStreamException

handleEntityStart

protected int handleEntityStart(int surroundingEvent,
                                byte b)
                         throws XMLStreamException
Parameters:
surroundingEvent - Context (next event at the time ampersand was encountered) in which entity is found. Will often be the next event set after entity is resolved.
Throws:
XMLStreamException

handleEntity

protected int handleEntity()
                    throws XMLStreamException
Throws:
XMLStreamException

handleGeneralEntity

protected final int handleGeneralEntity(PName entityName)

handleStartElementStart

protected int handleStartElementStart(byte b)
                               throws XMLStreamException
Method called when '<' and (what appears to be) a name start character have been seen.

Throws:
XMLStreamException

handleStartElement

protected int handleStartElement()
                          throws XMLStreamException
Throws:
XMLStreamException

handleAttrValue

protected abstract boolean handleAttrValue()
                                    throws XMLStreamException
Throws:
XMLStreamException

handleNsDecl

protected abstract boolean handleNsDecl()
                                 throws XMLStreamException
Throws:
XMLStreamException

finishCharacters

protected abstract void finishCharacters()
                                  throws XMLStreamException
Specified by:
finishCharacters in class XmlScanner
Throws:
XMLStreamException

finishCData

protected void finishCData()
                    throws XMLStreamException
Specified by:
finishCData in class XmlScanner
Throws:
XMLStreamException

finishComment

protected void finishComment()
                      throws XMLStreamException
Specified by:
finishComment in class XmlScanner
Throws:
XMLStreamException

finishDTD

protected void finishDTD(boolean copyContents)
                  throws XMLStreamException
Specified by:
finishDTD in class XmlScanner
Throws:
XMLStreamException

finishPI

protected void finishPI()
                 throws XMLStreamException
Specified by:
finishPI in class XmlScanner
Throws:
XMLStreamException

finishSpace

protected void finishSpace()
                    throws XMLStreamException
Specified by:
finishSpace in class XmlScanner
Throws:
XMLStreamException

skipCharacters

protected abstract boolean skipCharacters()
                                   throws XMLStreamException
Specified by:
skipCharacters in class XmlScanner
Returns:
True, if an unexpanded entity was encountered (and is now pending)
Throws:
XMLStreamException

skipCData

protected void skipCData()
                  throws XMLStreamException
Specified by:
skipCData in class XmlScanner
Throws:
XMLStreamException

skipComment

protected void skipComment()
                    throws XMLStreamException
Specified by:
skipComment in class XmlScanner
Throws:
XMLStreamException

skipPI

protected void skipPI()
               throws XMLStreamException
Specified by:
skipPI in class XmlScanner
Throws:
XMLStreamException

skipSpace

protected void skipSpace()
                  throws XMLStreamException
Specified by:
skipSpace in class XmlScanner
Throws:
XMLStreamException

loadMore

protected boolean loadMore()
                    throws XMLStreamException
Specified by:
loadMore in class XmlScanner
Throws:
XMLStreamException

parseNewName

protected PName parseNewName(byte b)
                      throws XMLStreamException
Throws:
XMLStreamException

parseNewEntityName

protected PName parseNewEntityName(byte b)
                            throws XMLStreamException
Throws:
XMLStreamException

parsePName

protected PName parsePName()
                    throws XMLStreamException
This method can (for now?) be shared between all Ascii-based encodings, since it only does coarse validity checking -- real checks are done in different method.

Some notes about the assumptions this implementation makes:

Throws:
XMLStreamException

parseEntityName

protected PName parseEntityName()
                         throws XMLStreamException
Throws:
XMLStreamException

addPName

protected abstract PName addPName(int hash,
                                  int[] quads,
                                  int qlen,
                                  int lastQuadBytes)
                           throws XMLStreamException
Specified by:
addPName in class ByteBasedScanner
Throws:
XMLStreamException

decodeCharForError

protected int decodeCharForError(byte b)
                          throws XMLStreamException
Description copied from class: ByteBasedScanner
Method called by methods when encountering a byte that can not be part of a valid character in the current context. Should return the actual decoded character for error reporting purposes.

Specified by:
decodeCharForError in class ByteBasedScanner
Throws:
XMLStreamException

handleCharacterEntity

protected int handleCharacterEntity()
                             throws XMLStreamException
Throws:
XMLStreamException

handlePartialCR

protected final boolean handlePartialCR()
Method called when there is a pending \r (from past buffer), and we need to see whether it is followed by \n (that is, whether it is part of a CR+LF linefeed).

Returns:
True if the linefeed was successfully processed (had enough input data to do that); or false if there is no data available to check this
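
The return contract can be illustrated with a small sketch. It assumes that the check in question is whether the pending \r is followed by \n, so that CR+LF counts as one linefeed. PartialCr and its fields are hypothetical and do not mirror the fields of this class.

```java
// Hypothetical sketch, not Aalto code: resolves a \r that was the last byte
// of the previous input block.
final class PartialCr {
    boolean pendingCr;        // true if the previous block ended with \r
    int lineCount;            // linefeeds processed so far
    byte[] buf = new byte[0]; // current input block
    int ptr, end;             // read position and end of valid data

    boolean handlePartialCr() {
        if (!pendingCr) {
            throw new IllegalStateException("no pending CR");
        }
        if (ptr >= end) {
            return false; // no data yet: cannot tell whether \n follows
        }
        pendingCr = false;
        if (buf[ptr] == '\n') {
            ++ptr; // assumed rule: CR+LF is consumed as a single linefeed
        }
        ++lineCount;
        return true;
    }
}
```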

throwInternal

protected int throwInternal()


Copyright © 2010 Fasterxml.com. All Rights Reserved.