character

parsley.character
object character

This module contains many parsers to do with reading one or more characters. Almost every parser will need something from this module.

In particular, this module contains: combinators that can read specific characters; combinators that represent character classes and their negations; combinators for reading specific strings; as well as a selection of pre-made parsers to parse specific kinds of character, like digits and letters.

Attributes

Since

2.2.0

Source
character.scala
Supertypes
class Object
trait Matchable
class Any
Self type
character.type

Core Combinators and Parsers

These are the most primitive combinators for consuming input: any input-reading task can be built from them.

def char(c: Char): Parsley[Char]

This combinator tries to parse a single specific character c from the input.

Attempts to read the given character c from the input stream at the current position. If this character can be found, it is consumed and returned. Otherwise, no input is consumed and this combinator will fail.

Value parameters

c

the character to parse

Attributes

Returns

a parser that tries to read a single c, or fails.

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using string or unicode.char.
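As a rough illustration of this note, a character outside the Basic Multilingual Plane occupies two Chars in a JVM string, so it cannot be given to char, whereas string can read it. The emoji and the value names below are arbitrary choices made for this sketch.

```scala
import parsley.character.string

// U+1F600 lies outside the BMP, so the JVM stores it as a surrogate pair
val grin = "\ud83d\ude00"
val units = grin.length                               // number of 16-bit Chars
val codepoints = grin.codePointCount(0, grin.length)  // number of Unicode codepoints

// string reads both halves of the pair in one go
val parsed = string(grin).parse(grin).toOption
```

Here units is 2 while codepoints is 1, which is why a single Char cannot represent the character.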

Example

scala> import parsley.character.char
scala> char('a').parse("")
val res0 = Failure(..)
scala> char('a').parse("a")
val res1 = Success('a')
scala> char('a').parse("ba")
val res2 = Failure(..)
Source
character.scala
val item: Parsley[Char]

This parser will parse any single character from the input, failing if there is no input remaining.

Attributes

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.item.
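A small sketch of how item might behave, mirroring the examples given for char. The value names are illustrative, and toOption is used only to make the results easy to compare.

```scala
import parsley.character.item

val onEmpty = item.parse("").toOption    // None: there is no character to read
val onSingle = item.parse("a").toOption  // Some('a')
val onLonger = item.parse("ba").toOption // Some('b'): only the first character is read
```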

Source
character.scala
def satisfy(pred: Char => Boolean): Parsley[Char]

This combinator tries to parse a single character from the input that matches the given predicate.

Attempts to read a character from the input and tests it against the predicate pred. If a character c can be read and pred(c) is true, then c is consumed and returned. Otherwise, no input is consumed and this combinator will fail.

Value parameters

pred

the predicate to test the next character against, should one exist.

Attributes

Returns

a parser that tries to read a single character c, such that pred(c) is true, or fails.

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.satisfy.

Example

scala> import parsley.character.satisfy
scala> satisfy(_.isDigit).parse("")
val res0 = Failure(..)
scala> satisfy(_.isDigit).parse("7")
val res1 = Success('7')
scala> satisfy(_.isDigit).parse("a5")
val res2 = Failure(..)
scala> def char(c: Char): Parsley[Char] = satisfy(_ == c)
Source
character.scala
def satisfyMap[A](f: PartialFunction[Char, A]): Parsley[A]

This combinator tries to parse and process a character from the input if it is defined for the given function.

Attempts to read a character from the input and tests to see if it is in the domain of f. If a character c can be read and f(c) is defined, then c is consumed and f(c) is returned. Otherwise, no input is consumed and this combinator will fail.

Value parameters

f

the function to test the next character against and transform it with, should one exist.

Attributes

Returns

a parser that tries to read a single character c, such that f(c) is defined, and returns f(c) if so, or fails.

Since

4.4.0

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.satisfyMap.

Example

scala> import parsley.character.satisfyMap
scala> val digit = satisfyMap {
 case c if c.isDigit => c.asDigit
}
scala> digit.parse("")
val res0 = Failure(..)
scala> digit.parse("7")
val res1 = Success(7)
scala> digit.parse("a5")
val res2 = Failure(..)
Source
character.scala

Character Class Combinators

These combinators allow for working with character classes. This means that a set, or range, of characters can be specified, and the combinator will return a parser that matches one of those characters (or conversely, any character that is not in that set). The parsed character is always returned.

def noneOf(cs: Set[Char]): Parsley[Char]

This combinator tries to parse any character not from the supplied set of characters cs, returning it if successful.

If the next character in the input is not a member of the set cs, it is consumed and returned. Otherwise, no input is consumed and the combinator fails.

Value parameters

cs

the set of characters to check.

Attributes

Returns

a parser that parses one character that is not a member of the set cs.

See also

satisfy

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.noneOf.

Example

scala> import parsley.character.noneOf
scala> val p = noneOf(Set('a', 'b', 'c'))
scala> p.parse("a")
val res0 = Failure(..)
scala> p.parse("c")
val res1 = Failure(..)
scala> p.parse("xb")
val res2 = Success('x')
scala> p.parse("")
val res3 = Failure(..)
Source
character.scala
def noneOf(cs: Char*): Parsley[Char]

This combinator tries to parse any character not from the supplied set of characters cs, returning it if successful.

If the next character in the input is not an element of the list of characters cs, it is consumed and returned. Otherwise, no input is consumed and the combinator fails.

Value parameters

cs

the set of characters to check.

Attributes

Returns

a parser that parses one character that is not an element of cs.

See also

satisfy

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.noneOf.

Example

scala> import parsley.character.noneOf
scala> val p = noneOf('a', 'b', 'c')
scala> p.parse("a")
val res0 = Failure(..)
scala> p.parse("c")
val res1 = Failure(..)
scala> p.parse("xb")
val res2 = Success('x')
scala> p.parse("")
val res3 = Failure(..)
Source
character.scala

def noneOf(cs: NumericRange[Char]): Parsley[Char]

This combinator tries to parse any character outside the supplied range of characters cs, returning it if successful.

If the next character in the input is outside of the range of characters cs, it is consumed and returned. Otherwise, no input is consumed and the combinator fails.

Value parameters

cs

the range of characters to check.

Attributes

Returns

a parser that parses a character outside the range cs.

See also

satisfy

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.noneOf.

Example

scala> import parsley.character.noneOf
scala> val p = noneOf('a' to 'c')
scala> p.parse("a")
val res0 = Failure(..)
scala> p.parse("b")
val res1 = Failure(..)
scala> p.parse("c")
val res2 = Failure(..)
scala> p.parse("xb")
val res3 = Success('x')
scala> p.parse("")
val res4 = Failure(..)
Source
character.scala
def oneOf(cs: Set[Char]): Parsley[Char]

This combinator tries to parse any character from the supplied set of characters cs, returning it if successful.

If the next character in the input is a member of the set cs, it is consumed and returned. Otherwise, no input is consumed and the combinator fails.

Value parameters

cs

the set of characters to check.

Attributes

Returns

a parser that parses one of the members of the set cs.

See also

satisfy

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.oneOf.

Example

scala> import parsley.character.oneOf
scala> val p = oneOf(Set('a', 'b', 'c'))
scala> p.parse("a")
val res0 = Success('a')
scala> p.parse("c")
val res1 = Success('c')
scala> p.parse("xb")
val res2 = Failure(..)
Source
character.scala
def oneOf(cs: Char*): Parsley[Char]

This combinator tries to parse any character from the supplied set of characters cs, returning it if successful.

If the next character in the input is an element of the list of characters cs, it is consumed and returned. Otherwise, no input is consumed and the combinator fails.

Value parameters

cs

the characters to check.

Attributes

Returns

a parser that parses one of the elements of cs.

See also

satisfy

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.oneOf.

Example

scala> import parsley.character.oneOf
scala> val p = oneOf('a', 'b', 'c')
scala> p.parse("a")
val res0 = Success('a')
scala> p.parse("c")
val res1 = Success('c')
scala> p.parse("xb")
val res2 = Failure(..)
Source
character.scala

def oneOf(cs: NumericRange[Char]): Parsley[Char]

This combinator tries to parse any character from the supplied range of characters cs, returning it if successful.

If the next character in the input is within the range of characters cs, it is consumed and returned. Otherwise, no input is consumed and the combinator fails.

Value parameters

cs

the range of characters to check.

Attributes

Returns

a parser that parses a character within the range cs.

See also

satisfy

Note

this combinator can only handle 16-bit characters: for larger codepoints, consider using unicode.oneOf.

Example

scala> import parsley.character.oneOf
scala> val p = oneOf('a' to 'c')
scala> p.parse("a")
val res0 = Success('a')
scala> p.parse("b")
val res1 = Success('b')
scala> p.parse("c")
val res2 = Success('c')
scala> p.parse("xb")
val res3 = Failure(..)
Source
character.scala

String Combinators

These combinators allow for working with, or building, strings. This means that they can parse specific strings, specific sets of strings, or can read characters repeatedly to generate strings. They are united in all returning String as their result.

def string(s: String): Parsley[String]

This combinator attempts to parse a given string from the input, and fails otherwise.

Attempts to read the given string completely from the input at the current position. If the string is present, then the parser succeeds, and the entire string is consumed from the input. Otherwise, if the input has too few characters remaining, or not all the characters matched, the parser fails. On failure, all the characters that were matched are consumed from the input.

Value parameters

s

the string to be parsed from the input

Attributes

Returns

a parser that either parses the string s or fails at the first mismatched character.

Note

the error messages generated by string do not reflect how far into the input it managed to get: this is because the error being positioned at the start of the string is more natural. However, input will still be consumed for purposes of backtracking.
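The consequence of consuming input on failure can be sketched as follows. This assumes the usual behaviour of the | combinator, which does not try its right-hand side once the left-hand side has failed having consumed input, and it assumes atomic is importable from parsley.Parsley; the value names are illustrative.

```scala
import parsley.Parsley.atomic
import parsley.character.string

// string("abc") reads 'a' and 'b' before failing on 'd'; because input was
// consumed, the second alternative is not attempted
val committed = (string("abc") | string("abd")).parse("abd").toOption

// atomic rolls the input back on failure, so string("abd") gets its turn
val backtracking = (atomic(string("abc")) | string("abd")).parse("abd").toOption
```

The first parser fails and the second yields "abd".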

Example

scala> import parsley.character.string
scala> string("abc").parse("")
val res0 = Failure(..)
scala> string("abc").parse("abcd")
val res1 = Success("abc")
scala> string("abc").parse("xabc")
val res2 = Failure(..)
Source
character.scala

def stringOfMany(pc: Parsley[Char]): Parsley[String]

This combinator parses pc zero or more times, collecting its results into a string.

Parses pc repeatedly until it fails. The resulting characters are placed into a string, which is then returned. This is morally equivalent to many(pc).map(_.mkString), but it uses StringBuilder, which makes it much more efficient.
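The equivalence described here can be sketched side by side. This assumes a release in which many is available from parsley.Parsley (older releases keep it in parsley.combinator); the value names are illustrative.

```scala
import parsley.Parsley.many
import parsley.character.{letterOrDigit, stringOfMany}

// the "moral" definition: collect a list of characters, then join them
val viaMany = many(letterOrDigit).map(_.mkString)
// the dedicated combinator, which builds the string directly
val viaStringOfMany = stringOfMany(letterOrDigit)

val a = viaMany.parse("ab1 cd").toOption         // Some("ab1")
val b = viaStringOfMany.parse("ab1 cd").toOption // Some("ab1")
val c = viaStringOfMany.parse("").toOption       // Some(""): zero matches is fine
```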

Value parameters

pc

the parser whose results make up the string

Attributes

Returns

a parser that parses a string whose letters consist of results from pc.

Since

4.0.0

Example

scala> import parsley.character.{letter, letterOrDigit, stringOfMany}
scala> import parsley.syntax.zipped.Zipped2
scala> val ident = (letter, stringOfMany(letterOrDigit)).zipped((c, s) => s"$c$s")
scala> ident.parse("abdc9d")
val res0 = Success("abdc9d")
scala> ident.parse("a")
val res1 = Success("a")
scala> ident.parse("9")
val res2 = Failure(..)
Source
character.scala

def stringOfMany(pred: Char => Boolean): Parsley[String]

This combinator parses characters matching the given predicate zero or more times, collecting the results into a string.

Repeatedly reads characters that satisfy the given predicate pred. When no more characters can be successfully read, the results are stitched together into a String and returned. This combinator can never fail, since satisfy can never fail having consumed input.

Value parameters

pred

the predicate to test characters against.

Attributes

Returns

a parser that returns the span of characters satisfying pred

Since

4.4.0

Note

this acts exactly like stringOfMany(satisfy(pred)), but may be more efficient.

analogous to the megaparsec takeWhileP combinator.

Example

scala> import parsley.character.{letter, stringOfMany}
scala> import parsley.syntax.zipped.Zipped2
scala> val ident = (letter, stringOfMany(_.isLetterOrDigit)).zipped((c, s) => s"$c$s")
scala> ident.parse("abdc9d")
val res0 = Success("abdc9d")
scala> ident.parse("a")
val res1 = Success("a")
scala> ident.parse("9")
val res2 = Failure(..)
Source
character.scala

def stringOfSome(pc: Parsley[Char]): Parsley[String]

This combinator parses pc one or more times, collecting its results into a string.

Parses pc repeatedly until it fails. The resulting characters are placed into a string, which is then returned. This is morally equivalent to some(pc).map(_.mkString), but it uses StringBuilder, which makes it much more efficient. The result string must have at least one character in it.

Value parameters

pc

the parser whose results make up the string

Attributes

Returns

a parser that parses a string whose letters consist of results from pc.

Since

4.0.0

Example

scala> import parsley.character.{letter, stringOfSome}
scala> val ident = stringOfSome(letter)
scala> ident.parse("abdc9d")
val res0 = Success("abdc")
scala> ident.parse("")
val res1 = Failure(..)
Source
character.scala

def stringOfSome(pred: Char => Boolean): Parsley[String]

This combinator parses characters matching the given predicate one or more times, collecting the results into a string.

Repeatedly reads characters that satisfy the given predicate pred. When no more characters can be successfully read, the results are stitched together into a String and returned. This combinator can never fail having consumed input, since satisfy can never fail having consumed input.

Value parameters

pred

the predicate to test characters against.

Attributes

Returns

a parser that returns the span of characters satisfying pred

Since

4.4.0

Note

this acts exactly like stringOfSome(satisfy(pred)), but may be more efficient.

analogous to the megaparsec takeWhile1P combinator.

Example

scala> import parsley.character.stringOfSome
scala> val ident = stringOfSome(_.isLetter)
scala> ident.parse("abdc9d")
val res0 = Success("abdc")
scala> ident.parse("")
val res1 = Failure(..)
Source
character.scala
def strings(str0: String, strs: String*): Parsley[String]

This combinator tries to parse each of the strings strs (and str0), until one of them succeeds.

Unlike choice, or more accurately atomicChoice, this combinator will not necessarily parse the strings in the order provided. It will try any string that has another of the strings as a prefix before trying that prefix, so that it has Longest Match semantics. It will try to minimise backtracking too, making it a much more efficient option than atomicChoice.

The longest succeeding string will be returned. If no strings match then the combinator fails.

Value parameters

str0

the first string to try to parse.

strs

the remaining strings to try to parse.

Attributes

Returns

a parser that tries to parse all the given strings returning the longest one that matches.

Since

4.0.0

Example

scala> import parsley.character.strings
scala> val p = strings("hell", "hello", "goodbye", "g", "abc")
scala> p.parse("hell")
val res0 = Success("hell")
scala> p.parse("hello")
val res1 = Success("hello")
scala> p.parse("good")
val res2 = Success("g")
scala> p.parse("goodbye")
val res3 = Success("goodbye")
scala> p.parse("a")
val res4 = Failure(..)
Source
character.scala
def strings[A](kv0: (String, Parsley[A]), kvs: (String, Parsley[A])*): Parsley[A]

This combinator tries to parse each of the key-value pairs kvs (and kv0), until one of them succeeds.

Each argument to this combinator is a pair of a string and a parser to perform if that string can be parsed. strings(s0 -> p0, ...) can be thought of as atomicChoice(string(s0) *> p0, ...), however, the given ordering of key-value pairs does not dictate the order in which the parses are tried. In particular, it will try any key that has another key as a prefix before trying that prefix, so that it has Longest Match semantics. It will try to minimise backtracking too, making it a much more efficient option than atomicChoice.

Value parameters

kv0

the first key-value pair to try to parse.

kvs

the remaining key-value pairs to try to parse.

Attributes

Returns

a parser that tries to parse all the given key-value pairs, returning the (possibly failing) result of the value that corresponds to the longest matching key.

Since

4.0.0

Note

the scope of any backtracking performed is isolated to the key itself, as it is assumed that once a key parses correctly, the branch has been committed to. Putting an atomic around the values will not affect this behaviour.

Example

scala> import parsley.Parsley.pure
scala> import parsley.character.strings
scala> val p = strings("hell" -> pure(4), "hello" -> pure(5), "goodbye" -> pure(7), "g" -> pure(1), "abc" -> pure(3))
scala> p.parse("hell")
val res0 = Success(4)
scala> p.parse("hello")
val res1 = Success(5)
scala> p.parse("good")
val res2 = Success(1)
scala> p.parse("goodbye")
val res3 = Success(7)
scala> p.parse("a")
val res4 = Failure(..)
Source
character.scala

Specific Character Parsers

These parsers are special cases of satisfy or char. They are worth using, as they are given special error labelling, producing nicer error messages than their primitive counterparts.

This documentation assumes JDK 17. JDK 17 is compliant with Unicode® Specification 13.0. As such, the descriptions of the parsers in this section are accurate with respect to Unicode® Specification 13.0: using a different JDK may affect the precise definitions of the parsers below. If in doubt, check the documentation for java.lang.Character to see which Unicode version is supported by your JVM. A table of the Unicode versions up to JDK 17 can be found here.

These parsers are only able to parse Unicode characters in the range '\u0000' to '\uffff', known as the Basic Multilingual Plane (BMP). Unicode characters wider than a single 16-bit character should be parsed using multi-character combinators such as string, or, alternatively, combinators found in unicode.

val bit: Parsley[Char]

This parser tries to parse a binary digit (bit) and returns it if successful.

A bit is either '0' or '1'.

Attributes

Source
character.scala
val crlf: Parsley[Char]

This parser tries to parse a CRLF newline character pair, returning '\n' if successful.

A CRLF character is the pair of carriage return ('\r') and line feed ('\n'). These two characters will be parsed together or not at all. The parser is made atomic using atomic.
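A brief sketch of what the atomic behaviour means in practice: a lone carriage return makes crlf fail without consuming it, so an alternative can still read it. The value names are illustrative.

```scala
import parsley.character.{char, crlf}

val pair = crlf.parse("\r\n").toOption  // Some('\n')
val loneCr = crlf.parse("\r").toOption  // None
val loneLf = crlf.parse("\n").toOption  // None

// because crlf is atomic, the failed attempt leaves the '\r' unconsumed,
// so the alternative can still read it
val fallback = (crlf | char('\r')).parse("\rx").toOption  // Some('\r')
```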

Attributes

Source
character.scala

val digit: Parsley[Char]

This parser tries to parse a digit, and returns it if successful.

A digit is any character c <= '\uffff' whose Unicode Category Type is Decimal Number (Nd). Examples of (inclusive) ranges within this category include:

  • the Latin digits '0' through '9'

  • the Arabic-Indic digits '\u0660' through '\u0669'

  • the Extended Arabic-Indic digits '\u06f0' through '\u06f9'

  • the Devanagari digits '\u0966' through '\u096f'

  • the Fullwidth digits '\uff10' through '\uff19'

The full list of codepoints found in a category can be found in the Unicode Character Database.
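To illustrate that digit is not limited to the Latin digits, the sketch below feeds it an Arabic-Indic digit and then converts it with the standard library's asDigit. The chosen characters and value names are illustrative.

```scala
import parsley.character.digit

val latin = digit.parse("7").toOption             // Some('7')
val arabicIndic = digit.parse("\u0663").toOption  // Some of ARABIC-INDIC DIGIT THREE
val notDigit = digit.parse("a").toOption          // None

// asDigit gives the numeric value of any decimal digit character
val value = digit.map(_.asDigit).parse("\u0663").toOption  // Some(3)
```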

Attributes

Source
character.scala

val endOfLine: Parsley[Char]

This parser will parse either a line feed (LF) or a CRLF newline, returning '\n' if successful.
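A short sketch of the two accepted forms, along with a lone carriage return, which is expected to be rejected. The value names are illustrative.

```scala
import parsley.character.endOfLine

val lf = endOfLine.parse("\n").toOption          // Some('\n')
val crlfPair = endOfLine.parse("\r\n").toOption  // Some('\n'): the pair is normalised
val loneCr = endOfLine.parse("\r").toOption      // None
```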

Attributes

See also
Source
character.scala

val hexDigit: Parsley[Char]

This parser tries to parse a hexadecimal digit, and returns it if successful.

A hexadecimal digit is one of (all inclusive ranges):

  1. the digits '0' through '9'

  2. the letters 'a' through 'f'

  3. the letters 'A' through 'F'
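One possible use of hexDigit is reading a hexadecimal number. In the sketch below, hexNumber is a hypothetical helper that is not part of the library, and it makes no attempt to guard against overflow.

```scala
import parsley.character.{hexDigit, stringOfSome}

// hypothetical helper: reads a run of hex digits and converts it to an Int
val hexNumber = stringOfSome(hexDigit).map(Integer.parseInt(_, 16))

val lowerCase = hexNumber.parse("ff").toOption   // Some(255)
val mixedCase = hexNumber.parse("1Ag").toOption  // Some(26): reading stops at 'g'
val noDigits = hexNumber.parse("g").toOption     // None: at least one digit is required
```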

Attributes

See also
Source
character.scala

val letter: Parsley[Char]

This parser tries to parse a letter, and returns it if successful.

A letter is any character c <= '\uffff' whose Unicode Category Type is any of the following:

  1. Uppercase Letter (Lu)

  2. Lowercase Letter (Ll)

  3. Titlecase Letter (Lt)

  4. Modifier Letter (Lm)

  5. Other Letter (Lo)

The full list of codepoints found in a category can be found in the Unicode Character Database.

Attributes

Source
character.scala

val letterOrDigit: Parsley[Char]

This parser tries to parse either a letter or a digit, and returns it if successful.

A letter or digit is anything that would parse in either letter or digit.

Attributes

See also

documentation for letter.

documentation for digit.

Source
character.scala

val lower: Parsley[Char]

This parser tries to parse a lowercase letter, and returns it if successful.

A lowercase letter is any character c <= '\uffff' whose Unicode Category Type is Lowercase Letter (Ll). Examples of characters within this category include:

  • the Latin letters 'a' through 'z'

  • Latin special characters such as 'é', 'ß', 'ð'

  • Cyrillic letters

  • Greek letters

  • Coptic letters

The full list of codepoints found in a category can be found in the Unicode Character Database.
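A small sketch showing that lower accepts lowercase letters beyond ASCII while rejecting uppercase letters and digits. The inputs and value names are illustrative.

```scala
import parsley.character.lower

val ascii = lower.parse("q").toOption          // Some('q')
val accented = lower.parse("\u00e9").toOption  // Some of LATIN SMALL LETTER E WITH ACUTE
val capital = lower.parse("Q").toOption        // None
val number = lower.parse("1").toOption         // None
```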

Attributes

Source
character.scala

val newline: Parsley[Char]

This parser tries to parse a line feed newline ('\n') character, and returns it if successful.

This parser will not accept a carriage return (CR) character or CRLF.

Attributes

Source
character.scala

val octDigit: Parsley[Char]

This parser tries to parse an octal digit, and returns it if successful.

An octal digit is one of '0' to '7' (inclusive).

Attributes

See also
Source
character.scala

val space: Parsley[Char]

This parser tries to parse a space or tab character, and returns it if successful.

Attributes

See also
Source
character.scala
val tab: Parsley[Char]

This parser tries to parse a tab ('\t') character, and returns it if successful.

This parser does not recognise vertical tabs, only horizontal ones.

Attributes

Source
character.scala

val upper: Parsley[Char]

This parser tries to parse an uppercase letter, and returns it if successful.

An uppercase letter is any character c <= '\uffff' whose Unicode Category Type is Uppercase Letter (Lu). Examples of characters within this category include:

  • the Latin letters 'A' through 'Z'

  • Latin special characters such as 'Å', 'Ç', 'Õ'

  • Cyrillic letters

  • Greek letters

  • Coptic letters

The full list of codepoints found in a category can be found in the Unicode Character Database.

Attributes

Source
character.scala

val whitespace: Parsley[Char]

This parser tries to parse a whitespace character, and returns it if successful.

A whitespace character is one of:

  1. a space (' ')

  2. a tab ('\t')

  3. a line feed ('\n')

  4. a carriage return ('\r')

  5. a form feed ('\f')

  6. a vertical tab ('\u000B')
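The difference between whitespace and the narrower space parser can be sketched as follows. The inputs and value names are illustrative.

```scala
import parsley.character.{space, whitespace}

val lineFeed = whitespace.parse("\n").toOption         // Some('\n')
val verticalTab = whitespace.parse("\u000B").toOption  // Some of the vertical tab
val nonBlank = whitespace.parse("a").toOption          // None

// space is narrower: it accepts only a space or a horizontal tab
val spaceOnLf = space.parse("\n").toOption             // None
```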

Attributes

Source
character.scala

Whitespace Skipping Parsers

These parsers are designed to skip chunks of whitespace, for very rudimentary lexing tasks. It is probably better to use the functionality of parsley.token.

val spaces: Parsley[Unit]

This parser skips zero or more space characters using space.

Attributes

Source
character.scala

val whitespaces: Parsley[Unit]

This parser skips zero or more whitespace characters using whitespace.
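A sketch contrasting the two skipping parsers on input that contains a line feed. It assumes both return Unit; the inputs and value names are illustrative.

```scala
import parsley.character.{char, spaces, whitespaces}

// whitespaces skips the blanks, the line feed, and the tab before reading 'a'
val skipAll = (whitespaces *> char('a')).parse(" \n\t a").toOption  // Some('a')

// spaces stops at the line feed, so char('a') then fails on '\n'
val skipSome = (spaces *> char('a')).parse(" \n a").toOption        // None

// skipping nothing at all still counts as success
val nothingToSkip = spaces.parse("").toOption                       // Some(())
```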

Attributes

Source
character.scala

Character Predicates

These are useful for providing to the sub-descriptions of a token.descriptions.LexicalDesc to specify behaviour for the lexer. Other than that, they aren't particularly useful.

def isHexDigit(c: Char): Boolean

This function returns true if a character is a hexadecimal digit.

A hexadecimal digit is one of (all inclusive ranges):

  1. the digits '0' through '9'

  2. the letters 'a' through 'f'

  3. the letters 'A' through 'F'

  4. an equivalent from another charset

Attributes

See also
Source
character.scala

def isOctDigit(c: Char): Boolean

This function returns true if a character is an octal digit.

An octal digit is one of '0' to '7' (inclusive).

Attributes

See also
Source
character.scala
def isSpace(c: Char): Boolean

This function returns true if a character is either a space or a tab character.
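The predicates in this group are ordinary functions on Char, so they can be called directly or handed to satisfy. A small sketch, with illustrative inputs and value names:

```scala
import parsley.character.{isHexDigit, isOctDigit, isSpace, satisfy}

val hexF = isHexDigit('f')      // true
val hexG = isHexDigit('g')      // false
val oct8 = isOctDigit('8')      // false: octal digits stop at '7'
val tabIsSpace = isSpace('\t')  // true
val lfIsSpace = isSpace('\n')   // false

// being plain functions, the predicates can be passed straight to satisfy
val blank = satisfy(isSpace).parse("\tx").toOption  // Some('\t')
```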

Attributes

See also
Source
character.scala