Object

colossus.parsing

Combinators

object Combinators

Streaming Parser Combinators

Overview

A Parser[T] is an object that consumes a stream of bytes to produce a result of type T.

A Combinator is a "higher-order" parser that takes one or more parsers and produces a new parser.

The stream parsers are fast and efficient, but that speed comes with tradeoffs. They are mutable and not thread safe, and they are designed for network protocols, which tend to have very deterministic grammars.

The Parser Rules:

1. A parser must greedily consume the data stream until it produces a result.
2. When a parser consumes the last byte necessary to produce a result, it must stop consuming the stream and return the new result while resetting its state.
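
The two rules can be illustrated with a minimal sketch in plain Scala. It does not use the colossus classes: the FixedBytes class and the Iterator[Byte] input are hypothetical stand-ins, chosen only to show greedy consumption, stopping at the last required byte, and resetting state.

```scala
import scala.collection.mutable.ArrayBuffer

object ParserRulesSketch {
  // Hypothetical fixed-length parser, not the colossus implementation.
  // It is mutable, so like the stream parsers it is not thread safe.
  final class FixedBytes(n: Int) {
    private val acc = ArrayBuffer.empty[Byte]

    def parse(in: Iterator[Byte]): Option[Vector[Byte]] = {
      // Rule 1: consume greedily until enough bytes are buffered or input runs out.
      while (acc.length < n && in.hasNext) acc += in.next()
      if (acc.length == n) {
        // Rule 2: stop at the last required byte, return the result, reset state.
        val result = acc.toVector
        acc.clear()
        Some(result)
      } else None // not enough data yet; partial state is kept for the next call
    }
  }
}
```

Because the partial state survives a None result, the same parser instance can be fed the next chunk of the stream and will finish the object it started.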

Examples

Use any parser by itself:

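import akka.util.ByteString
import colossus.core.DataBuffer
import colossus.parsing.Combinators._

val parser = bytes(4)
val data = DataBuffer(ByteString("aaaabbbbccc"))
parser.parse(data) // Some(ByteString(97, 97, 97, 97))
parser.parse(data).map { bytes => bytes.utf8String } // Some("bbbb")
parser.parse(data) // None, only 3 bytes ("ccc") remain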

Combine two parsers:

val parser = bytes(3) ~ bytes(2) >> {case a ~ b => a.utf8String + ":" + b.utf8String}
parser.parse(DataBuffer(ByteString("abc"))) // None
parser.parse(DataBuffer(ByteString("defgh"))) // Some("abc:de")
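
How a sequencing combinator can keep state between calls is sketched below in plain Scala. The P trait, take, and seq are hypothetical names rather than the colossus API, an Iterator[Byte] stands in for the data buffer, and a tuple stands in for the ~ case class.

```scala
object SequenceSketch {
  // Hypothetical minimal parser interface; not the colossus Parser trait.
  trait P[+T] { self =>
    def parse(in: Iterator[Byte]): Option[T]
    def map[U](f: T => U): P[U] = new P[U] {
      def parse(in: Iterator[Byte]): Option[U] = self.parse(in).map(f)
    }
  }

  // Reads exactly n bytes as an ASCII string, keeping partial input between calls.
  def take(n: Int): P[String] = new P[String] {
    private val sb = new StringBuilder
    def parse(in: Iterator[Byte]): Option[String] = {
      while (sb.length < n && in.hasNext) sb += in.next().toChar
      if (sb.length == n) { val s = sb.toString; sb.clear(); Some(s) } else None
    }
  }

  // Runs `a`, remembers its result, then runs `b`; resets once both have finished.
  def seq[A, B](a: P[A], b: P[B]): P[(A, B)] = new P[(A, B)] {
    private var first: Option[A] = None
    def parse(in: Iterator[Byte]): Option[(A, B)] = {
      if (first.isEmpty) first = a.parse(in)
      first.flatMap { x =>
        b.parse(in).map { y => first = None; (x, y) }
      }
    }
  }
}
```

Feeding "abc" and then "defgh" to seq(take(3), take(2)) mirrors the example above: the first call completes only the left parser, and the second call completes the right one.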

Linear Supertypes
AnyRef, Any

Type Members

  1. trait Parser[+T] extends AnyRef
  2. case class ~[+A, +B](a: A, b: B) extends Product with Serializable

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. val byte: Parser[Byte]

    Parses a single byte.

  6. def bytes(num: Int): Parser[ByteString]

  7. def bytes(num: Parser[Long]): Parser[ByteString]

    Reads a fixed number of bytes, where the count is a length prefix parsed by num.

  8. def bytesUntil(terminus: ByteString): Parser[ByteString]

    Keeps reading bytes until the terminus is encountered. This accounts for a possible partial terminus in the data. The terminus is NOT included in the returned value.
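
    The handling of a partial terminus can be sketched in plain Scala. This is an illustrative model, not the colossus implementation: the Until class is a hypothetical name, and an Iterator[Byte] stands in for the data buffer. It checks whether the accumulated bytes end with the terminus, so a terminus split across two chunks, or a prefix of it appearing inside the data, is handled the same way.

    ```scala
    object BytesUntilSketch {
      // Hypothetical terminus-delimited reader; keeps its buffer between calls.
      final class Until(terminus: Vector[Byte]) {
        private var acc = Vector.empty[Byte]

        def parse(in: Iterator[Byte]): Option[Vector[Byte]] = {
          while (in.hasNext) {
            acc = acc :+ in.next()
            if (acc.endsWith(terminus)) {
              // The terminus is consumed but stripped from the result.
              val result = acc.dropRight(terminus.length)
              acc = Vector.empty
              return Some(result)
            }
          }
          None // terminus not seen yet; buffered bytes are kept
        }
      }
    }
    ```

    Re-checking the tail on every byte keeps the sketch short; a production parser would more likely track how much of the terminus has matched so far.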

  9. def bytesUntilEOS: Parser[ByteString]

    Reads in an unknown number of bytes, ended only when endOfStream is called.

    Be aware that this parser has no maximum size and will read in data forever if endOfStream is never called.

  10. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  11. def const[T](t: T): Parser[T]

    Creates a parser that will always return the same value without consuming any data. Useful when flatMapping parsers.

  12. def delimitedString(delimiter: Byte, terminus: Byte): Parser[Vector[String]]

    Parses a series of ASCII strings separated by a single-byte delimiter and terminated by a byte.

  13. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  15. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  16. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  17. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  18. def int: Parser[Int]

  19. def intUntil(terminus: Byte, base: Int = 10): Parser[Long]

    Parses the ASCII representation of an integer, reading until the terminus is encountered.
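
    The accumulation this describes can be sketched in plain Scala. The IntUntil class below is a hypothetical model, not the colossus implementation: it assumes unsigned input, uses an Iterator[Byte] in place of the data buffer, and rejects bytes that are not digits in the given base with an IllegalArgumentException chosen for illustration.

    ```scala
    object IntUntilSketch {
      // Hypothetical ASCII integer reader; the running value survives between calls.
      final class IntUntil(terminus: Byte, base: Int = 10) {
        private var value = 0L

        def parse(in: Iterator[Byte]): Option[Long] = {
          while (in.hasNext) {
            val b = in.next()
            if (b == terminus) {
              // The terminus is consumed; return the value and reset state.
              val result = value
              value = 0L
              return Some(result)
            }
            val digit = Character.digit(b.toChar, base)
            if (digit < 0) throw new IllegalArgumentException(s"invalid digit: ${b.toChar}")
            value = value * base + digit
          }
          None // terminus not reached yet
        }
      }
    }
    ```

    Since the number is built one digit at a time, no intermediate string is needed and the parser can stop at any byte boundary.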

  20. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  21. def literal(lit: ByteString): Parser[ByteString]
  22. def long: Parser[Long]
  23. def maxSize[T](size: DataSize, parser: Parser[T]): Parser[T]

    Creates a parser that wraps another parser and will throw an exception if more than size data is required to parse a single object. See the ParserSizeTracker for more details.
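
    One way such a wrapper could work is sketched below in plain Scala. This is an assumption-laden model, not the ParserSizeTracker: it treats a parser as a function from Iterator[Byte] to Option[T], measures size in bytes with a plain Int limit, and throws an IllegalStateException chosen for illustration.

    ```scala
    object MaxSizeSketch {
      // Wraps a parse function and fails once one object needs more than `limit` bytes.
      def maxSize[T](limit: Int)(parse: Iterator[Byte] => Option[T]): Iterator[Byte] => Option[T] = {
        var used = 0 // bytes consumed for the object currently being parsed
        (in: Iterator[Byte]) => {
          // Counting view over the input; every byte handed to the parser is charged.
          val counted = new Iterator[Byte] {
            def hasNext: Boolean = in.hasNext
            def next(): Byte = {
              if (used == limit) throw new IllegalStateException(s"object exceeds $limit bytes")
              used += 1
              in.next()
            }
          }
          val result = parse(counted)
          if (result.isDefined) used = 0 // a finished object resets the budget
          result
        }
      }
    }
    ```

    The budget is tracked per object rather than per call, so an object that arrives in several small chunks is still rejected once their total passes the limit.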

  24. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  25. final def notify(): Unit

    Definition Classes
    AnyRef
  26. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  27. def repeat[T](times: Long, parser: Parser[T]): Parser[Vector[T]]

    Repeats a pattern a fixed number of times.

    times: the number of times to parse the pattern
    parser: the parser for the pattern
    returns: the parsed sequence

  28. def repeat[T](times: Parser[Long], parser: Parser[T]): Parser[Vector[T]]

    Parses a pattern multiple times based on a numeric prefix.

    This is useful for any situation where the repeated pattern is prefixed by the number of repetitions, for example num:[obj1][obj2][obj3]. In situations where the pattern doesn't immediately follow the number, you'll have to do it yourself, with something like

    intUntil(':') ~ otherParser |> { case num ~ other => repeat(num, patternParser) }

    times: parser for the number of times to repeat the pattern
    parser: the parser that will parse a single instance of the pattern
    returns: the parsed sequence
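
    The num:[obj1][obj2][obj3] layout can be modeled with a small plain-Scala sketch. It is not the colossus implementation: CountedItems is a hypothetical class, every item is assumed to be a fixed number of ASCII bytes, the count is assumed to be decimal and terminated by ':', and no validation of the digits is attempted.

    ```scala
    object RepeatPrefixSketch {
      // Parses "<count>:<item><item>..." where every item is `itemSize` bytes long.
      final class CountedItems(itemSize: Int) {
        private var count: Option[Int] = None // set once the ':' has been read
        private var digits = 0
        private val item = new StringBuilder
        private var items = Vector.empty[String]

        private def reset(): Vector[String] = {
          val result = items
          count = None; digits = 0; items = Vector.empty
          result
        }

        def parse(in: Iterator[Byte]): Option[Vector[String]] = {
          while (in.hasNext) {
            val c = in.next().toChar
            count match {
              case None if c == ':' => count = Some(digits)
              case None             => digits = digits * 10 + Character.digit(c, 10)
              case Some(_) =>
                item += c
                if (item.length == itemSize) { items :+= item.toString; item.clear() }
            }
            // Stop as soon as the announced number of items has been read.
            if (count.contains(items.length)) return Some(reset())
          }
          None
        }
      }
    }
    ```

    A prefix of 0 yields an empty sequence as soon as the ':' is consumed, without reading any further bytes.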

  29. def repeatUntil[T](parser: Parser[T], terminus: Byte): Parser[Vector[T]]

    Repeatedly parses a pattern until a terminal byte is reached.

    Before calling parser this will examine the next byte. If the byte matches the terminus, it will return the built sequence. Otherwise it will pass control to parser (including the examined byte) until the parser returns a result.

    Notice that the terminal byte is consumed, so if we have

    val parser = repeatUntil(bytes(2), ':')
    parser.parse(DataBuffer(ByteString("aabbcc:ddee")))

    the bytes remaining in the buffer after parsing are just ddee.

    parser: the parser to repeat
    terminus: the byte that signals to stop repeating
    returns: the parsed sequence
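
    The peek-then-delegate behavior can be sketched in plain Scala using a BufferedIterator, whose head method inspects the next byte without consuming it. The RepeatUntil class is a hypothetical model rather than the colossus implementation, and it assumes each repeated item is a fixed number of ASCII bytes.

    ```scala
    object RepeatUntilSketch {
      // Collects fixed-size items until the terminus appears at an item boundary.
      final class RepeatUntil(itemSize: Int, terminus: Byte) {
        private val item = new StringBuilder
        private var items = Vector.empty[String]

        def parse(in: BufferedIterator[Byte]): Option[Vector[String]] = {
          while (in.hasNext) {
            // Peek only between items; inside an item the byte belongs to the item parser.
            if (item.isEmpty && in.head == terminus) {
              in.next() // the terminus is consumed but not returned
              val result = items
              items = Vector.empty
              return Some(result)
            }
            item += in.next().toChar
            if (item.length == itemSize) { items :+= item.toString; item.clear() }
          }
          None // terminus not seen yet; collected items are kept
        }
      }
    }
    ```

    Running this sketch on "aabbcc:ddee" with an item size of 2 and ':' as the terminus leaves ddee unread, matching the description above.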

  30. def repeatUntilEOS[T](parser: Parser[T]): Parser[Seq[T]]

    Creates a parser that will repeat the given parser forever until endOfStream() is called. The results from each call to the given parser are accumulated and returned at the end of the stream.

  31. def short: Parser[Short]

  32. def skip[T](n: Int): Parser[Unit]

    Creates a parser that will skip over n bytes. You generally only want to do this inside a peek parser.

  33. def stringUntil(terminus: Byte, toLower: Boolean = false, minSize: Option[Int] = None, allowWhiteSpace: Boolean = true, ltrim: Boolean = false): Parser[String]

    Parses a string until a designated byte is encountered.

    Limited filtering is currently supported, all of which happens during the reading.

    terminus: reading will stop when this byte is encountered
    toLower: if true, any characters in the range A-Z will be lowercased before insertion
    minSize: specify a minimum size
    allowWhiteSpace: if false, a ParseException is thrown if any whitespace is encountered before the terminus. If the terminus is a whitespace character, it will not be counted
    ltrim: trim leading whitespace

  34. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  35. def toString(): String

    Definition Classes
    AnyRef → Any
  36. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  37. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  38. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
