Lexer

final class Lexer extends AnyRef

This class provides a large selection of functionality concerned with lexing.

This class provides lexing functionality to parsley; however, everything in this class is guaranteed to be implementable purely using parsley's pre-existing functionality. These are regular parsers, but constructed in such a way that they create a clear and logical separation from the rest of the parser.

The class is broken up into several internal "modules" that group together similar kinds of functionality. Importantly, the lexeme and nonlexeme objects separate the underlying token implementations based on whether or not they consume trailing whitespace. Functionality is broadly duplicated across both of these modules: lexeme should be used by a wider parser, to ensure whitespace is handled uniformly; and nonlexeme should be used to define further composite tokens or in special circumstances where whitespace should not be consumed.

Some of the parsers found within this class may have been hand-optimised for performance: care has been taken to ensure these implementations precisely match the semantics of the originals.

Source: Lexer.scala

Linear Supertypes

AnyRef, Any

Instance Constructors

new Lexer(desc: LexicalDesc)
Builds a new lexer with a given description for the lexical structure of the language.
desc
the configuration for the lexer, specifying the lexical rules of the grammar/language being parsed.
Since
4.0.0
new Lexer(desc: LexicalDesc, errConfig: ErrorConfig)

Value Members

final def !=(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def ##: Int
Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0
Definition Classes
Any
def clone(): AnyRef
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.CloneNotSupportedException]) @native()
final def eq(arg0: AnyRef): Boolean
Definition Classes
AnyRef
def equals(arg0: AnyRef): Boolean
Definition Classes
AnyRef → Any
def finalize(): Unit
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.Throwable])
def fully[A](p: Parsley[A]): Parsley[A]
This combinator ensures a parser fully parses all available input, and consumes whitespace at the start.
This combinator should be used once as the outermost combinator in a parser. It is the only combinator that should consume leading whitespace, and this must be the first thing a parser does. It will ensure that, after the parser is complete, the end of the input stream has been reached.
Since
4.0.0
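As a rough illustration of how fully is meant to wrap the outermost parser, the following sketch assumes parsley 4.x is on the classpath and uses LexicalDesc.plain as the description. The object name FullyExample and the hand-rolled number token are hypothetical and exist only for this example.

```scala
import parsley.Parsley
import parsley.character.{digit, stringOfSome}
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

// Hypothetical example object, not part of parsley itself.
object FullyExample {
  // Assumption: the plain description treats ordinary whitespace as skippable.
  val lexer = new Lexer(LexicalDesc.plain)

  // A hand-rolled number token, turned into a lexeme so that
  // trailing whitespace is consumed after the digits.
  val number: Parsley[Int] = lexer.lexeme(stringOfSome(digit)).map(_.toInt)

  // number '+' number, with every token handling its own trailing whitespace.
  val sum: Parsley[Int] =
    (number <~> (lexer.lexeme.symbol("+") *> number)).map { case (x, y) => x + y }

  // fully is applied once, at the very outside: it skips leading
  // whitespace and demands end of input afterwards.
  val program: Parsley[Int] = lexer.fully(sum)
}
```

With these definitions, FullyExample.program accepts "  1 + 2  " but rejects "1 + 2 junk", whereas the unwrapped FullyExample.sum rejects input with leading whitespace and silently ignores trailing junk.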
final def getClass(): Class[_ <: AnyRef]
Definition Classes
AnyRef → Any
Annotations
@native()
def hashCode(): Int
Definition Classes
AnyRef → Any
Annotations
@native()
final def isInstanceOf[T0]: Boolean
Definition Classes
Any
final def ne(arg0: AnyRef): Boolean
Definition Classes
AnyRef
final def notify(): Unit
Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit
Definition Classes
AnyRef
Annotations
@native()
final def synchronized[T0](arg0: => T0): T0
Definition Classes
AnyRef
def toString(): String
Definition Classes
AnyRef → Any
final def wait(): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])
final def wait(arg0: Long, arg1: Int): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])
final def wait(arg0: Long): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException]) @native()
object lexeme extends Lexeme
This object is concerned with lexemes: these are tokens that are treated as "words", such that whitespace will be consumed after each has been parsed.
Ideally, a wider parser should not be concerned with handling whitespace, as it is responsible for dealing with a stream of tokens. With parser combinators, however, there is usually no distinction between the parsing phase and the lexing phase. That said, it is good practice to establish a logical separation between the two worlds. As such, this object contains parsers that parse tokens, and these are whitespace-aware: whitespace will be consumed after any of these parsers has run. It is not, however, required that whitespace be present.
Since
4.0.0
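The difference between a lexeme and a raw token can be sketched as follows. This assumes parsley 4.x and LexicalDesc.plain; the object LexemeExample and its members are hypothetical names chosen for illustration.

```scala
import parsley.Parsley
import parsley.character.string
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

// Hypothetical example object, not part of parsley itself.
object LexemeExample {
  val lexer = new Lexer(LexicalDesc.plain)

  // Lexeme tokens: each consumes any whitespace that follows it.
  val unitLexeme: Parsley[Unit] =
    lexer.lexeme.symbol("(") *> lexer.lexeme.symbol(")")

  // Raw tokens: no whitespace handling at all.
  val unitRaw: Parsley[Unit] = (string("(") *> string(")")).void

  // Any existing parser can also be promoted to a lexeme by hand.
  val unitPromoted: Parsley[Unit] = lexer.lexeme(unitRaw)
}
```

Here unitLexeme accepts both "()" and "(   )", since whitespace after a token is permitted but not required, while unitRaw only accepts "()". None of them accept leading whitespace, which is left to fully.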
object nonlexeme
This object is concerned with non-lexemes: these are tokens that do not give any special treatment to whitespace.
Whilst the functionality in lexeme is strongly recommended for wider use in a parser, the functionality here may be useful for more specialised use-cases. In particular, these may form the building blocks for more complex tokens (where whitespace is not allowed between them, say), in which case these compound tokens can be turned into lexemes manually. For example, the lexer does not have configuration for trailing specifiers on numeric literals (such as 1024L in Scala): the desired numeric literal parser could be extended with this functionality before whitespace is consumed by using the variant found in this object.
Alternatively, these tokens can be used for lexical extraction, which can be performed by the ErrorBuilder typeclass: this can be used to try and extract tokens from the input stream when an error happens, to provide a more informative error. In this case, it is desirable to not consume whitespace after the token to keep the error tight and precise.
Since
4.0.0
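The trailing-specifier idea described above might look something like the sketch below. To stay independent of the exact member path of the numeric parsers, which may differ between parsley versions, the digits are parsed with a hand-rolled stand-in (stringOfSome(digit)); in practice one of the numeric parsers from nonlexeme would take its place. The names LongLiteralExample, rawLong and longLit are hypothetical, and parsley 4.x with LexicalDesc.plain is assumed.

```scala
import parsley.Parsley
import parsley.character.{char, digit, stringOfSome}
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

// Hypothetical example object, not part of parsley itself.
object LongLiteralExample {
  val lexer = new Lexer(LexicalDesc.plain)

  // Compound raw token: digits immediately followed by 'L'.
  // No whitespace is consumed, so "1024 L" is not accepted.
  // Assumption: stringOfSome(digit) stands in for a nonlexeme numeric parser.
  val rawLong: Parsley[Long] =
    (stringOfSome(digit) <* char('L')).map(_.toLong)

  // Only once the whole compound token is built is it turned into a lexeme.
  val longLit: Parsley[Long] = lexer.lexeme(rawLong)
}
```

Building the suffix onto the raw token first, and wrapping with lexeme last, is what keeps whitespace from sneaking in between the digits and the specifier.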
object space
This object is concerned with special treatment of whitespace.
For the vast majority of cases, the functionality within this object shouldn't be needed, as whitespace is consistently handled by lexeme and fully. However, for grammars where whitespace is significant (like indentation-sensitive languages), this object provides some more fine-grained control over how whitespace is consumed by the parsers within lexeme.
Since
4.0.0
