nonlexeme
This object is concerned with non-lexemes: these are tokens that do not give any special treatment to whitespace.
Whilst the functionality in lexeme is strongly recommended for wider use in a parser, the functionality here may be useful for more specialised use-cases. In particular, these may form the building blocks for more complex tokens (where whitespace is not allowed between them, say), in which case these compound tokens can be turned into lexemes manually. For example, the lexer does not have configuration for trailing specifiers on numeric literals (like 1024L in Scala, say): the desired numeric literal parser could be extended with this functionality before whitespace is consumed by using the variant found in this object.
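The trailing-specifier case above might be sketched as follows. This is only an illustration, assuming parsley 4.5 is on the classpath and that the default LexicalDesc.plain description is acceptable; the object name LongLiteralSketch is hypothetical, and natural.decimal and lexeme(...) are used as I understand the library's API.

```scala
import parsley.Parsley
import parsley.character.char
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

object LongLiteralSketch {
  // Assumption: the plain description is good enough for this demonstration.
  private val lexer = new Lexer(LexicalDesc.plain)

  // The non-lexeme parser stops directly after the digits,
  // so the 'L' suffix must be adjacent to the number.
  private val rawLong: Parsley[BigInt] =
    lexer.nonlexeme.natural.decimal <* char('L')

  // Only the finished compound token becomes a lexeme,
  // which consumes any whitespace that trails it.
  val longLiteral: Parsley[BigInt] = lexer.lexeme(rawLong)
}
```

With this arrangement, whitespace between the digits and the suffix is rejected, while whitespace after the suffix is consumed as usual.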
Alternatively, these tokens can be used for lexical extraction, which can be performed by the ErrorBuilder typeclass: this can be used to try to extract tokens from the input stream when an error happens, to provide a more informative error. In this case, it is desirable not to consume whitespace after the token, to keep the error tight and precise.
Attributes
- Since
- 4.0.0
- Source
- Lexer.scala
- Self type
- nonlexeme.type
Members list
Value members
Concrete methods
This is a collection of parsers concerned with handling character literals.
Character literals are described generally as follows:
- desc.textDesc.characterLiteralEnd: the character that starts and ends the literal (for example, in many languages this is ')
- desc.textDesc.graphicCharacter: describes the legal characters that may appear in the literal directly. Usually, this excludes control characters and newlines, but permits most other things. Escape sequences can represent non-graphic characters
- desc.textDesc.escapeSequences: describes the legal escape sequences that can appear in a character literal (for example \n or \u000a)
Aside from the generic configuration, characters can be parsed in accordance with varying levels of Unicode support, from ASCII-only to full UTF-16 characters. Parsers for each of four different varieties are exposed by this object.
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
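For the character literal parsers described above, a minimal sketch of two of the Unicode varieties might look like this. It assumes parsley 4.5, that the member is reachable as nonlexeme.character, that the varieties are named ascii and basicMultilingualPlane, and that LexicalDesc.plain uses ' as the literal delimiter; CharacterSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

object CharacterSketch {
  // Assumption: the plain description delimits character literals with '.
  private val lexer = new Lexer(LexicalDesc.plain)

  // ASCII-only variety: characters outside the ASCII range are rejected.
  val asciiChar = lexer.nonlexeme.character.ascii

  // Basic Multilingual Plane variety: any character that fits in one Char.
  val bmpChar = lexer.nonlexeme.character.basicMultilingualPlane
}
```

The same literal can therefore be legal for one variety and illegal for another, which is the point of exposing them separately.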
This is a collection of parsers concerned with handling signed real numbers (like floats and doubles).
These literals consist of a (possibly optional) integer prefix, with at least one of a fractional component (with .) or an exponential component.
Real numbers are an extension of signed integers with the following additional configuration:
- desc.numericDesc.leadingDotAllowed: determines whether a literal like .0 would be considered legal
- desc.numericDesc.trailingDotAllowed: determines whether a literal like 0. would be considered legal
- desc.numericDesc.realNumbersCanBe{Hexadecimal/Octal/Binary}: these flags control what kind of literals can appear within the number parser. Each type of literal may still be individually parsed with its corresponding parser, regardless of the value of the flag
- desc.numericDesc.{decimal/hexadecimal/octal/binary}ExponentDesc: describes how the exponential syntax works for each kind of base. If the syntax is legal, then this describes: which characters start it (classically, this would be e or E for decimals); whether or not it is compulsory for the literal (in Java and C, hexadecimal floats are only valid when they have an exponent attached); and whether a + sign is mandatory, optional, or illegal for positive exponents
In addition to the parsing of decimal, hexadecimal, octal, and binary floating literals, each parser can be given a precision of IEEE 754 float or double. This can be achieved either by rounding to the nearest representable value, or by ensuring that the literal must be precisely representable as one of these numbers (which is defined as being one of binary, decimal or exact float and double values as described by Java).
Attributes
- Since
- 4.5.0
- Note
- alias for real
- Source
- Lexer.scala
This is a collection of parsers concerned with handling signed integer literals.
Signed integer literals are an extension of unsigned integer literals with the following extra configuration:
- desc.numericDesc.positiveSign: describes whether literals are allowed to omit + for positive literals, must write a +, or can never write a +.
Attributes
- See also
- natural for a full description of integer configuration
- Since
- 4.5.0
- Source
- Lexer.scala
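The three positiveSign behaviours described above can be compared side by side. The following is a sketch only, assuming parsley 4.5, where (to my knowledge) the options are modelled as PlusSignPresence.Optional, Required and Illegal in parsley.token.descriptions.numeric; SignSketch and lexerWith are hypothetical names.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc
import parsley.token.descriptions.numeric.{NumericDesc, PlusSignPresence}

object SignSketch {
  // Hypothetical helper: a lexer that differs from plain only in positiveSign.
  private def lexerWith(sign: PlusSignPresence): Lexer =
    new Lexer(LexicalDesc.plain.copy(
      numericDesc = NumericDesc.plain.copy(positiveSign = sign)))

  // '+' may be written or omitted.
  val optionalPlus = lexerWith(PlusSignPresence.Optional).nonlexeme.integer.decimal
  // '+' must be written on positive literals.
  val requiredPlus = lexerWith(PlusSignPresence.Required).nonlexeme.integer.decimal
  // '+' can never be written.
  val illegalPlus = lexerWith(PlusSignPresence.Illegal).nonlexeme.integer.decimal
}
```

Negative literals are unaffected by this setting: only the treatment of + changes.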
This is a collection of parsers concerned with handling multi-line string literals.
String literals are described generally as follows:
- desc.textDesc.multiStringEnds: the sequence of characters that can begin or end a multi-line string literal. Regardless of which of these is used for a specific literal, the end of the literal must use the same sequence
- desc.textDesc.graphicCharacter: describes the legal characters that may appear in the literal directly. Usually, this excludes control characters and newlines, but permits most other things. Escape sequences can represent non-graphic characters for non-raw strings
- desc.textDesc.escapeSequences: describes the legal escape sequences that can appear in a string literal (for example \n or \u000a)
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
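A multi-line string configuration along the lines described above might be sketched as follows. The assumptions are: parsley 4.5, where multiStringEnds is (as far as I recall) a Set[String]; triple double-quotes chosen as the delimiter purely for illustration; and the member being reachable as nonlexeme.multiString with a fullUtf16 variety. MultiStringSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc
import parsley.token.descriptions.text.TextDesc

object MultiStringSketch {
  // Assumption: """ opens and closes a multi-line string; the plain
  // description is not expected to define any multi-line delimiters itself.
  private val desc = LexicalDesc.plain.copy(
    textDesc = TextDesc.plain.copy(multiStringEnds = Set("\"\"\"")))

  private val lexer = new Lexer(desc)

  // Newlines may appear directly inside the literal.
  val multi = lexer.nonlexeme.multiString.fullUtf16
}
```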
This is a collection of parsers concerned with handling unsigned (non-negative) integer literals.
Natural numbers are described generally as follows:
- desc.numericDesc.literalBreakChar: determines whether or not it is legal to "break up" the digits within a literal, for example: is 1_000_000 allowed? If this is legal, describes what the break character is, and whether it can appear after a hexadecimal/octal/binary prefix
- desc.numericDesc.leadingZerosAllowed: determines whether or not it is possible to add extraneous zero digits onto the front of a number. In some languages, like C, this is disallowed, as numbers starting with 0 are octal numbers.
- desc.numericDesc.integerNumbersCanBe{Hexadecimal/Octal/Binary}: these flags control what kind of literals can appear within the number parser. Each type of literal can be individually parsed with its corresponding parser, regardless of the value of the flag
- desc.numericDesc.{hexadecimal/octal/binary}Leads: controls what character must follow a 0 when starting a number to change it from decimal into another base. This set may be empty, in which case the literal is described purely with a leading zero (C-style octals would set octalLeads to Set.empty)
In addition to the parsing of decimal, hexadecimal, octal, and binary literals, each parser can be given a bit-width from 8- to 64-bit: this will check the parsed literal to ensure it is a legal literal of that size.
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
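To make the natural-number configuration above concrete, here is a small sketch. It assumes parsley 4.5, that LexicalDesc.plain permits hexadecimal literals led by 0x, and that break characters are enabled with BreakCharDesc.Supported (whose second argument I take to mean "allowed after a non-decimal prefix"); NaturalSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc
import parsley.token.descriptions.numeric.{BreakCharDesc, NumericDesc}

object NaturalSketch {
  private val plain = new Lexer(LexicalDesc.plain)

  // Parses hexadecimal literals only, whatever the CanBe flags say.
  val hexOnly = plain.nonlexeme.natural.hexadecimal
  // Parses any base that the integerNumbersCanBe... flags permit.
  val anyBase = plain.nonlexeme.natural.number

  // Assumption: '_' is the break character, also allowed after a base prefix.
  private val withBreaks = new Lexer(LexicalDesc.plain.copy(
    numericDesc = NumericDesc.plain.copy(
      literalBreakChar = BreakCharDesc.Supported('_', true))))

  val broken = withBreaks.nonlexeme.natural.decimal
}
```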
This is a collection of parsers concerned with handling multi-line string literals.
String literals are described generally as follows:
- desc.textDesc.multiStringEnds: the sequence of characters that can begin or end a multi-line string literal. Regardless of which of these is used for a specific literal, the end of the literal must use the same sequence
- desc.textDesc.graphicCharacter: describes the legal characters that may appear in the literal directly. Usually, this excludes control characters and newlines, but permits most other things. Escape sequences can represent non-graphic characters for non-raw strings
- desc.textDesc.escapeSequences: describes the legal escape sequences that can appear in a string literal (for example \n or \u000a)
Attributes
- Since
- 4.5.0
- Note
- this will be parsed without handling any escape sequences; this includes literal-end characters and the escape prefix (often " and \ respectively)
- Source
- Lexer.scala
This is a collection of parsers concerned with handling single-line string literals.
String literals are described generally as follows:
- desc.textDesc.stringEnds: the sequence of characters that can begin or end a string literal. Regardless of which of these is used for a specific literal, the end of the literal must use the same sequence
- desc.textDesc.graphicCharacter: describes the legal characters that may appear in the literal directly. Usually, this excludes control characters and newlines, but permits most other things. Escape sequences can represent non-graphic characters for non-raw strings
- desc.textDesc.escapeSequences: describes the legal escape sequences that can appear in a string literal (for example \n or \u000a)
Attributes
- Since
- 4.5.0
- Note
- this will be parsed without handling any escape sequences; this includes literal-end characters and the escape prefix (often " and \ respectively)
- Source
- Lexer.scala
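The note above, that escape sequences are left untouched, can be illustrated by running the raw variant on an input containing a backslash. This sketch assumes parsley 4.5, that the members are reachable as nonlexeme.string and nonlexeme.rawString, and that LexicalDesc.plain uses " as the string delimiter; RawStringSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

object RawStringSketch {
  // Assumption: the plain description delimits strings with ".
  private val lexer = new Lexer(LexicalDesc.plain)

  // Ordinary single-line strings: escape sequences are interpreted.
  val cooked = lexer.nonlexeme.string.fullUtf16

  // Raw single-line strings: the backslash is just another character.
  val raw = lexer.nonlexeme.rawString.fullUtf16
}
```

Given the input "a\nb" (with a literal backslash followed by n), the raw parser is expected to return those four characters unchanged rather than a string containing a newline.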
This is a collection of parsers concerned with handling signed real numbers (like floats and doubles).
These literals consist of a (possibly optional) integer prefix, with at least one of a fractional component (with .) or an exponential component.
Real numbers are an extension of signed integers with the following additional configuration:
- desc.numericDesc.leadingDotAllowed: determines whether a literal like .0 would be considered legal
- desc.numericDesc.trailingDotAllowed: determines whether a literal like 0. would be considered legal
- desc.numericDesc.realNumbersCanBe{Hexadecimal/Octal/Binary}: these flags control what kind of literals can appear within the number parser. Each type of literal may still be individually parsed with its corresponding parser, regardless of the value of the flag
- desc.numericDesc.{decimal/hexadecimal/octal/binary}ExponentDesc: describes how the exponential syntax works for each kind of base. If the syntax is legal, then this describes: which characters start it (classically, this would be e or E for decimals); whether or not it is compulsory for the literal (in Java and C, hexadecimal floats are only valid when they have an exponent attached); and whether a + sign is mandatory, optional, or illegal for positive exponents
In addition to the parsing of decimal, hexadecimal, octal, and binary floating literals, each parser can be given a precision of IEEE 754 float or double. This can be achieved either by rounding to the nearest representable value, or by ensuring that the literal must be precisely representable as one of these numbers (which is defined as being one of binary, decimal or exact float and double values as described by Java).
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
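The effect of leadingDotAllowed on the real-number parsers above might be sketched like this. It assumes parsley 4.5 and that LexicalDesc.plain disallows a leading dot by default; RealSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc
import parsley.token.descriptions.numeric.NumericDesc

object RealSketch {
  // Assumption: plain requires an integer prefix before the dot.
  val strict = new Lexer(LexicalDesc.plain).nonlexeme.real.decimal

  // The same parser, but with literals such as .5 made legal.
  val lenient = new Lexer(LexicalDesc.plain.copy(
    numericDesc = NumericDesc.plain.copy(leadingDotAllowed = true)))
    .nonlexeme.real.decimal
}
```

Note that a bare integer such as 3 is not a real literal under either configuration, since it has neither a fractional nor an exponential component.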
This is a collection of parsers concerned with handling signed integer literals.
Signed integer literals are an extension of unsigned integer literals with the following extra configuration:
- desc.numericDesc.positiveSign: describes whether literals are allowed to omit + for positive literals, must write a +, or can never write a +.
Attributes
- See also
- unsigned for a full description of integer configuration
- Since
- 4.5.0
- Note
- alias for integer
- Source
- Lexer.scala
This is a collection of parsers concerned with handling numeric literals that may either be signed integers or signed reals.
There is no additional configuration offered over that found in integer or real.
The bit-bounds and precision of the integer or real parts of the result can be specified in any pairing.
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
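A sketch of how a combined parser distinguishes the two kinds of literal is given below. It assumes parsley 4.5, that this member is reachable as nonlexeme.signedCombined (the name does not appear in the text above, so treat it as an assumption), and that its decimal parser yields an Either with integers on the left and reals on the right; CombinedSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

object CombinedSketch {
  private val lexer = new Lexer(LexicalDesc.plain)

  // Assumption: Left holds an integer result, Right holds a real result.
  val number = lexer.nonlexeme.signedCombined.decimal
}
```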
This is a collection of parsers concerned with handling single-line string literals.
String literals are described generally as follows:
- desc.textDesc.stringEnds: the sequence of characters that can begin or end a string literal. Regardless of which of these is used for a specific literal, the end of the literal must use the same sequence
- desc.textDesc.graphicCharacter: describes the legal characters that may appear in the literal directly. Usually, this excludes control characters and newlines, but permits most other things. Escape sequences can represent non-graphic characters for non-raw strings
- desc.textDesc.escapeSequences: describes the legal escape sequences that can appear in a string literal (for example \n or \u000a)
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
This is a collection of parsers concerned with handling unsigned (non-negative) integer literals.
Natural numbers are described generally as follows:
- desc.numericDesc.literalBreakChar: determines whether or not it is legal to "break up" the digits within a literal, for example: is 1_000_000 allowed? If this is legal, describes what the break character is, and whether it can appear after a hexadecimal/octal/binary prefix
- desc.numericDesc.leadingZerosAllowed: determines whether or not it is possible to add extraneous zero digits onto the front of a number. In some languages, like C, this is disallowed, as numbers starting with 0 are octal numbers.
- desc.numericDesc.integerNumbersCanBe{Hexadecimal/Octal/Binary}: these flags control what kind of literals can appear within the number parser. Each type of literal can be individually parsed with its corresponding parser, regardless of the value of the flag
- desc.numericDesc.{hexadecimal/octal/binary}Leads: controls what character must follow a 0 when starting a number to change it from decimal into another base. This set may be empty, in which case the literal is described purely with a leading zero (C-style octals would set octalLeads to Set.empty)
In addition to the parsing of decimal, hexadecimal, octal, and binary literals, each parser can be given a bit-width from 8- to 64-bit: this will check the parsed literal to ensure it is a legal literal of that size.
Attributes
- Since
- 4.5.0
- Note
- alias for natural.
- Source
- Lexer.scala
This is a collection of parsers concerned with handling numeric literals that may either be unsigned integers or unsigned reals.
There is no additional configuration offered over that found in natural or real.
The bit-bounds and precision of the integer or real parts of the result can be specified in any pairing.
Attributes
- Since
- 4.5.0
- Source
- Lexer.scala
Concrete fields
This object contains lexing functionality relevant to the parsing of names, which include operators or identifiers.
The parsing of names is mostly concerned with finding the longest valid name that is not a reserved name, such as a hard keyword or a special operator.
Attributes
- Since
- 4.0.0
- Source
- Lexer.scala
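The "longest valid name that is not reserved" rule above might be demonstrated as follows. This is a sketch assuming parsley 4.5, where (to my knowledge) character predicates live in parsley.token.predicate and the field is reachable as nonlexeme.names; the identifier rules and the single keyword "if" are illustrative choices, and NamesSketch is a hypothetical name.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.{LexicalDesc, NameDesc, SymbolDesc}
import parsley.token.predicate.Basic

object NamesSketch {
  // Assumption: identifiers start with a letter and continue with
  // letters or digits; "if" is the only hard keyword.
  private val desc = LexicalDesc.plain.copy(
    nameDesc = NameDesc.plain.copy(
      identifierStart = Basic(_.isLetter),
      identifierLetter = Basic(_.isLetterOrDigit)),
    symbolDesc = SymbolDesc.plain.copy(hardKeywords = Set("if")))

  private val lexer = new Lexer(desc)

  // Reads the longest run of identifier characters, then rejects keywords.
  val identifier = lexer.nonlexeme.names.identifier
}
```

Because the longest name is read first, an identifier that merely begins with a keyword (such as iffy) is still accepted.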
This object contains lexing functionality relevant to the parsing of atomic symbols.
Symbols are characterised by their "unitness", that is, every parser inside returns Unit. This is because they all parse a specific known entity, and, as such, the result of the parse is irrelevant. These can be things such as reserved names, or small symbols like parentheses. This object also contains a means of creating new symbols as well as implicit conversions to allow for Scala's string literals to serve as symbols within a parser.
Attributes
- Since
- 4.0.0
- Source
- Lexer.scala
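As a sketch of the "unitness" described above, the following creates two symbols and shows that a keyword symbol does not match a prefix of a longer identifier. It assumes parsley 4.5, that the field is reachable as nonlexeme.symbol and can be applied to a String, and the same illustrative identifier rules and keyword as a typical small language; SymbolSketch is a hypothetical name.

```scala
import parsley.Parsley
import parsley.token.Lexer
import parsley.token.descriptions.{LexicalDesc, NameDesc, SymbolDesc}
import parsley.token.predicate.Basic

object SymbolSketch {
  // Assumption: "if" is a hard keyword and identifiers are alphanumeric.
  private val desc = LexicalDesc.plain.copy(
    nameDesc = NameDesc.plain.copy(
      identifierStart = Basic(_.isLetter),
      identifierLetter = Basic(_.isLetterOrDigit)),
    symbolDesc = SymbolDesc.plain.copy(hardKeywords = Set("if")))

  private val lexer = new Lexer(desc)

  // Every symbol parser returns Unit: only success or failure matters.
  val openParen: Parsley[Unit] = lexer.nonlexeme.symbol("(")
  val ifKeyword: Parsley[Unit] = lexer.nonlexeme.symbol("if")
}
```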