Valid parser states.
Access a byte range as a string.
Because the underlying data are UTF-8 encoded, i and k must fall on Unicode character boundaries. The resulting String is also not guaranteed to have length (k - i), since a multi-byte sequence decodes to fewer chars than bytes.
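The length caveat above can be illustrated with a minimal sketch. The object name and the `at` helper below are hypothetical stand-ins written for this example, not the parser's actual members; the only assumption is that the data are held as a UTF-8 byte array.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object ByteRangeSketch {
  // Hypothetical helper: decode the bytes in [i, k) as a UTF-8 string.
  // Both i and k must sit on character boundaries.
  def at(data: Array[Byte], i: Int, k: Int): String =
    new String(data, i, k - i, UTF_8)

  // The JSON string "héllo" including its quotes: 8 bytes, because
  // U+00E9 occupies two bytes in UTF-8.
  val data: Array[Byte] = "\"h\u00e9llo\"".getBytes(UTF_8)

  // Bytes 1 until 7 hold the content between the quotes.
  val s: String = at(data, 1, 7)
}
```

Here the range covers six bytes (k - i = 6), yet the decoded String contains five chars.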
Read the byte/char at 'i' as a Char.
Note that this should not be used on potential multi-byte sequences.
Return true iff 'i' is at or beyond the end of the input (EOF).
This is a specialized accessor for the case where our underlying data are bytes not chars.
We use this to keep track of the last recoverable place we've seen. If we hit an AsyncException, we can later resume from this point.
This method is called during every loop of rparse, and the arguments are the exact arguments we can pass to rparse to continue where we left off.
Should be called when parsing is finished.
Generate a Char from the hex digits of "\u1234" (i.e. "1234").
NOTE: This is only capable of generating characters from the basic plane. This is why it can only return Char instead of Int.
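As a rough sketch of the conversion described above, the following turns four hex digits into a single Char. The name `descape` and the use of `require` for validation are assumptions made for this example; the real method may report errors differently.

```scala
object HexEscapeSketch {
  // Hypothetical helper: convert exactly four hex digits into one UTF-16
  // code unit. Four hex digits cover 0x0000 to 0xFFFF, which is why the
  // result always fits in a Char.
  def descape(s: CharSequence): Char = {
    require(s.length == 4, "expected exactly four hex digits")
    var x = 0
    var i = 0
    while (i < 4) {
      val d = Character.digit(s.charAt(i), 16) // -1 if not a hex digit
      require(d >= 0, s"invalid hex digit: ${s.charAt(i)}")
      x = (x << 4) | d
      i += 1
    }
    x.toChar
  }
}
```

For instance, `HexEscapeSketch.descape("0041")` yields 'A'. Characters outside the basic plane are written in JSON as two such escapes (a surrogate pair), each converted separately.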
Used to generate error messages with character info and offsets.
Used to generate messages for internal errors.
This should only be used in situations where a possible bug in the parser was detected. For errors in user-provided JSON, use die().
Return true iff the bytes/chars from 'i' until 'j' are equal to 'str'.
Return true iff the byte/char at 'i' is equal to 'c'.
Parse and return the next JSON value and the position beyond it.
Parse the JSON constant "false".
Parse the JSON constant "null".
Parse the given number, and add it to the given context.
We don't actually instantiate a number here, but rather pass the string on for future use. Facades can choose to be lazy and just store the string. This ends up being much faster and has the nice side effect that we know exactly how the user represented the number.
Parse the given number, and add it to the given context.
This method is a bit slower than parseNum() because it has to be sure it doesn't run off the end of the input.
Normally (when operating in rparse in the context of an outer array or object) we don't need to worry about this and can just grab characters, because if we run out of characters that would indicate bad input. This is for cases where the number could possibly be followed by a valid EOF.
This method has all the same caveats as parseNum().
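The bounds-checked scanning described above might look roughly like the sketch below. It is a simplified illustration over a String, not the parser's implementation: `scanNum` and its helpers are hypothetical names, and the raw number text is returned as-is instead of being handed to a facade.

```scala
object NumScanSketch {
  // Hypothetical helper: scan a JSON number starting at index i and return
  // its raw text plus the index just past it. Every read is bounds-checked,
  // so a number followed directly by EOF is accepted.
  def scanNum(s: String, i: Int): (String, Int) = {
    def is(j: Int, c: Char): Boolean = j < s.length && s.charAt(j) == c
    def isDigit(j: Int): Boolean =
      j < s.length && s.charAt(j) >= '0' && s.charAt(j) <= '9'
    // Skip a run of digits; fail if the run is empty.
    def digits(j: Int): Int = {
      var k = j
      while (isDigit(k)) k += 1
      if (k == j) throw new IllegalArgumentException(s"expected digit at $j")
      k
    }

    var j = i
    if (is(j, '-')) j += 1
    if (is(j, '0')) j += 1 else j = digits(j) // no leading zeros
    if (is(j, '.')) j = digits(j + 1)         // optional fraction
    if (is(j, 'e') || is(j, 'E')) {           // optional exponent
      j += 1
      if (is(j, '+') || is(j, '-')) j += 1
      j = digits(j)
    }
    // No numeric value is built: the raw text is kept as written.
    (s.substring(i, j), j)
  }
}
```

Because the raw text is preserved, `scanNum("12.50", 0)` keeps the trailing zero that a Double would discard.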
Parse the string according to JSON rules, and add to the given context.
This method expects the data to be in UTF-8 and accesses it as bytes.
See if the string has any escape sequences. If not, return the end of the string. If so, bail out and return -1.
This method expects the data to be in UTF-8 and accesses it as bytes. Thus we can just ignore any bytes with the highest bit set.
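The escape check can be sketched as a single pass over the bytes. The name `scanSimple` is hypothetical, and two details are assumptions made for this example: that i points at the opening quote, and that "the end of the string" means the index just past the closing quote.

```scala
object EscapeScanSketch {
  private val Quote = 34     // '"'
  private val Backslash = 92 // '\'

  // Hypothetical helper: i is assumed to be the index of the opening quote.
  // Returns the index just past the closing quote, or -1 as soon as a
  // backslash (the start of an escape sequence) is seen.
  def scanSimple(data: Array[Byte], i: Int): Int = {
    var j = i + 1
    while (j < data.length) {
      val b = data(j).toInt
      if (b == Quote) return j + 1
      if (b == Backslash) return -1
      // Bytes with the highest bit set are negative here and belong to
      // multi-byte UTF-8 sequences. They can never equal '"' or '\', so
      // they need no special handling.
      j += 1
    }
    throw new IllegalArgumentException("unterminated string")
  }
}
```

The fast path works because UTF-8 never reuses ASCII byte values inside a multi-byte sequence, so a quote or backslash byte always means the actual character.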
Parse the JSON constant "true".
The reset() method is used to signal that we're working from the given position, and any previous data can be released. Some parsers (e.g. StringParser) will ignore release, while others (e.g. PathParser) will need to use this information to release and allocate different areas.
Tail-recursive parsing method to do the bulk of JSON parsing.
This single method manages parser states, data, etc. Except for parsing non-recursive values (like strings, numbers, and constants) all important work happens in this loop (or in methods it calls, like reset()).
Currently the code is optimized to make use of switch statements. Future work should consider whether this is better or worse than manually constructed if/else statements or something else. Also, it may be possible to reorder some cases for speed improvements.
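To show the general shape of a tail-recursive loop that dispatches on the current character with a switch, here is a deliberately tiny sketch. It is not the rparse method: `maxDepth` is a hypothetical function that only tracks bracket nesting, and it assumes the input contains no brackets inside strings.

```scala
import scala.annotation.{switch, tailrec}

object LoopSketch {
  // Hypothetical example: compute the deepest nesting level of arrays and
  // objects. State is threaded through the arguments, so the recursion
  // compiles to a loop and the match compiles to a switch.
  @tailrec
  def maxDepth(s: String, i: Int = 0, depth: Int = 0, deepest: Int = 0): Int =
    if (i >= s.length) deepest
    else (s.charAt(i): @switch) match {
      case '[' | '{' =>
        maxDepth(s, i + 1, depth + 1, math.max(deepest, depth + 1))
      case ']' | '}' =>
        maxDepth(s, i + 1, depth - 1, deepest)
      case _ =>
        maxDepth(s, i + 1, depth, deepest)
    }
}
```

Since all state lives in the arguments, saving those arguments at any iteration is enough to resume the loop later.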
AsyncParser parses JSON asynchronously from chunks of data (encoded as Option[ByteBuffer] instances). You can use the factory methods in the companion object to instantiate an async parser.
The async parser's fields are described below:
The (state, curr, stack) triple is used to save and restore parser state between async calls. State also helps encode extra information when streaming or unwrapping an array.
The (data, len, allocated) triple is used to manage the underlying data the parser is keeping track of. As new data comes in, data may be expanded if not enough space is available.
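One way to picture the (data, len, allocated) triple is the sketch below. The `Buf` class, the `absorb` method, and the doubling growth policy are all assumptions made for illustration; the actual parser may grow its buffer differently.

```scala
object BufferSketch {
  // Hypothetical buffer: `data` is the backing array, `len` is how many
  // bytes are in use, and `allocated` is the current capacity.
  final class Buf(initial: Int) {
    var data: Array[Byte] = new Array[Byte](initial)
    var len: Int = 0
    var allocated: Int = initial

    // Append a chunk, expanding `data` first if there is not enough room.
    def absorb(chunk: Array[Byte]): Unit = {
      val need = len + chunk.length
      if (need > allocated) {
        var n = math.max(allocated, 1)
        while (n < need) n *= 2 // assumed policy: double until it fits
        data = java.util.Arrays.copyOf(data, n)
        allocated = n
      }
      System.arraycopy(chunk, 0, data, len, chunk.length)
      len = need
    }
  }
}
```

Feeding "[1,2" and then ",3]" into a four-byte buffer leaves len at 7 and allocated at 8 under this policy.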
The offset parameter is used to drive the outer async parsing. It stores similar information to curr but is kept separate to avoid "corrupting" our snapshot.
The done parameter is used internally to help figure out when the atEof() parser method should return true. This will be set when apply(None) is called.
The streamMode parameter controls how the asynchronous parser will be handling multiple values. There are three states:
1: An array is being unwrapped. Normal JSON array rules apply (note that if the outer value observed is not an array, this mode will toggle to the -1 mode).
0: A stream of individual JSON elements separated by whitespace is being parsed. We can return each complete element as we parse it.
-1: No streaming is occurring. Only a single JSON value is allowed.
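The three modes and the fallback from unwrapping to single-value mode can be summarized in a small sketch. The constant names and the `effectiveMode` function are hypothetical; only the numeric values 1, 0, and -1 come from the description above.

```scala
object StreamModeSketch {
  // Hypothetical names for the three documented streamMode values.
  val UnwrapArray = 1  // unwrap the elements of one outer array
  val ValueStream = 0  // whitespace-separated stream of values
  val SingleValue = -1 // exactly one JSON value

  // Hypothetical helper: decide the mode once the first non-whitespace
  // character of the input has been observed. Unwrapping only makes sense
  // for an array, so anything else falls back to single-value mode.
  def effectiveMode(requested: Int, first: Char): Int =
    if (requested == UnwrapArray && first != '[') SingleValue
    else requested
}
```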