MarkupParser

scala.xml.parsing.MarkupParser
trait MarkupParser extends TokenTests

An XML parser.

Parses XML 1.0, invokes callback methods of a MarkupHandler and returns whatever the markup handler returns. Use ConstructingParser if you just want to parse XML to construct instances of scala.xml.Node.

While XML elements are returned, DTD declarations, if handled, are collected through side effects.
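
For the common case mentioned above, a minimal sketch of parsing a string with ConstructingParser might look like the following. It assumes the scala-xml module is on the classpath; the object name, the sample document and the value names are illustrative only.

```scala
import scala.io.Source
import scala.xml.parsing.ConstructingParser

object ParseExample {
  // Illustrative input; any well-formed XML 1.0 document would do.
  val xml = "<greeting lang=\"en\">hello</greeting>"

  // fromSource builds the parser and initializes it; passing `false`
  // means surplus whitespace is not preserved.
  val parser = ConstructingParser.fromSource(Source.fromString(xml), false)

  // document() parses the whole input and returns a scala.xml.Document.
  val doc = parser.document()
  val root = doc.docElem

  def main(args: Array[String]): Unit =
    println(s"${root.label}: ${root.text}")
}
```

Here the markup handler is the one built into ConstructingParser, so the result is an ordinary scala.xml node tree.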

Supertypes
trait TokenTests
class Object
trait Matchable
class Any

Members list

Type members

Types

override type ElementType = NodeSeq
override type InputType = Source
override type PositionType = Int

Value members

Abstract methods

def externalSource(systemLiteral: String): Source
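
Since externalSource is the hook through which external entities are loaded, a custom parser can decide where that content comes from. The sketch below is one possible arrangement, assuming scala-xml is on the classpath: IsolatedParser is a hypothetical class (not part of the library) that mixes MarkupParser into ConstructingHandler and resolves every system literal to an empty source. Whether an empty source is adequate for a given DTD is not checked here.

```scala
import scala.io.Source
import scala.xml.parsing.{ConstructingHandler, MarkupParser}

// Hypothetical parser that never touches the file system or network:
// every external entity resolves to an empty source.
class IsolatedParser(val input: Source, val preserveWS: Boolean)
    extends ConstructingHandler with MarkupParser {

  override def externalSource(systemLiteral: String): Source =
    Source.fromString("")
}

object IsolatedParserExample {
  // initialize must be called once before parsing starts.
  val parser = new IsolatedParser(Source.fromString("<a><b/></a>"), false).initialize

  // The sample input has no DOCTYPE, so externalSource is not exercised here.
  val root = parser.document().docElem
}
```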

Concrete methods

def appendText(pos: Int, ts: NodeBuffer, txt: String): Unit
def attrDecl(): Unit
<! attlist := ATTLIST

override def ch: Char

The library and compiler parsers had the interesting distinction of different behavior for nextch (a function for which there are a total of two plausible behaviors, so we know the design space was fully explored.) One of them returned the value of nextch before the increment and one of them the new value. So to unify code we have to at least temporarily abstract over the nextchs.

Definition Classes
MarkupParserCommon
override protected def ch_returning_nextch: Char

Definition Classes
MarkupParserCommon
content1 ::=  '<' content1 | '&' charref ...

'<' content1 ::=  ...

[22]     prolog      ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23]     XMLDecl     ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24]     VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[25]     Eq          ::= S? '=' S?
[26]     VersionNum  ::= '1.0'
[27]     Misc        ::= Comment | PI | S

'<' element ::= xmlTag1 '>'  { xmlExpr | '{' simpleExpr '}' } ETag
             | xmlTag1 '/' '>'

def elementDecl(): Unit

<! element := ELEMENT

def entityDecl(): Unit
<! element := ELEMENT

override def eof: Boolean

Definition Classes
MarkupParserCommon
override def errorNoEnd(tag: String): Nothing

Definition Classes
MarkupParserCommon
def extSubset(): Unit
externalID ::= SYSTEM S syslit
               PUBLIC S pubid S syslit

def initialize: this.type

As the current code requires you to call nextch once manually after construction, this method formalizes that suboptimal reality.
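
A short sketch of what this means in practice, assuming scala-xml is on the classpath: when a parser is constructed directly rather than through a factory such as ConstructingParser.fromSource, initialize is chained onto the constructor call before any parsing method is used. The object and value names are illustrative.

```scala
import scala.io.Source
import scala.xml.parsing.ConstructingParser

object InitializeExample {
  val src = Source.fromString("<root><child/></root>")

  // initialize performs the initial nextch and returns the parser itself
  // (its result type is this.type), so the call can be chained.
  val parser = new ConstructingParser(src, false).initialize

  val root = parser.document().docElem
}
```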

def intSubset(): Unit

"rec-xml/#ExtSubset" pe references may not occur within markup declarations

override def lookahead(): BufferedIterator[Char]

Create a lookahead reader which does not influence the input

Definition Classes
MarkupParserCommon
def markupDecl(): Unit
def markupDecl1(): Any
override def mkAttributes(name: String, pscope: NamespaceBinding): AttributesType

Definition Classes
MarkupParserCommon
override def mkProcInstr(position: Int, name: String, text: String): ElementType

Definition Classes
MarkupParserCommon
override def nextch(): Unit

Tells ch to fetch the next character the next time it is called.

Definition Classes
MarkupParserCommon
'N' notationDecl ::= "OTATION"

def parseDTD(): Unit

Parses the document type declaration and assigns it to the instance variable dtd.

<! parseDTD ::= DOCTYPE name ... >

def pop(): Unit
<? prolog ::= xml S?
// this is a bit more lenient than necessary...

[12]       PubidLiteral ::=        '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"

def push(entityName: String): Unit
def pushExternal(systemId: String): Unit
protected def putChar(c: Char): StringBuilder

Appends a Unicode character to the name buffer.

override def reportSyntaxError(pos: Int, str: String): Unit

Definition Classes
MarkupParserCommon
override def reportSyntaxError(str: String): Unit

Definition Classes
MarkupParserCommon

Attribute value, terminated by either ' or ". The value may not contain <.

     AttValue     ::= `'` { _ } `'`
                    | `"` { _ } `"`

prolog, but without standalone

override def truncatedError(msg: String): Nothing

Definition Classes
MarkupParserCommon

Parses attributes and creates the namespace scope and metadata.

[41] Attributes    ::= { S Name Eq AttValue }

'<! CharData ::= [CDATA[ ( {char} - {char}"]]>"{char} ) ']]>'

see [15]

Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

see [15]

Entity value, terminated by either ' or ". The value may not contain <.

     AttValue     ::= `'` { _  } `'`
                    | `"` { _ } `"`

override def xHandleError(that: Char, msg: String): Unit

Definition Classes
MarkupParserCommon
<? prolog ::= xml S ... ?>

Inherited methods

Inherited from:
TokenTests
protected def errorAndResult[T](msg: String, x: T): T

Inherited from:
MarkupParserCommon (hidden)
def isAlpha(c: Char): Boolean

These are 99% sure to be redundant but refactoring on the safe side.

Inherited from:
TokenTests

def isName(s: String): Boolean

See [5] of XML 1.0 specification.

Name ::= ( Letter | '_' ) (NameChar)*

Inherited from:
TokenTests
def isNameChar(ch: Char): Boolean

See [4] and [4a] of Appendix B of XML 1.0 specification.

NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | #xB7
           | CombiningChar | Extender

Inherited from:
TokenTests

NameStart ::= ( Letter | '_' | ':' )

where Letter means in one of the Unicode general categories { Ll, Lu, Lo, Lt, Nl }.

We do not allow a name to start with :. See [4] and Appendix B of the XML 1.0 specification.

Inherited from:
TokenTests

final def isSpace(cs: Seq[Char]): Boolean
(#x20 | #x9 | #xD | #xA)+

Inherited from:
TokenTests
final def isSpace(ch: Char): Boolean
(#x20 | #x9 | #xD | #xA)

Inherited from:
TokenTests
def isValidIANAEncoding(ianaEncoding: Seq[Char]): Boolean

Returns true if the encoding name is a valid IANA encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for an IANA encoding name.

Value parameters

ianaEncoding

The IANA encoding name.

Inherited from:
TokenTests
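
The character and name predicates inherited from TokenTests can be tried in isolation. The sketch below assumes scala-xml is on the classpath; Tokens is a hypothetical helper object introduced only so the methods can be called without building a parser.

```scala
import scala.xml.parsing.TokenTests

// TokenTests has no abstract members, so an empty object can mix it in.
object Tokens extends TokenTests

object TokenTestsExample {
  // A name may start with a letter or '_' ...
  val validName = Tokens.isName("_id")
  // ... but not with a digit.
  val digitFirst = Tokens.isName("1abc")

  // Every character is one of #x20, #x9, #xD, #xA.
  val blank = Tokens.isSpace(" \t\n".toSeq)

  // Only the characters of the encoding name are checked,
  // not whether a decoder is available.
  val utf8 = Tokens.isValidIANAEncoding("UTF-8".toSeq)
  val badEncoding = Tokens.isValidIANAEncoding("8bit".toSeq)
}
```
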
def returning[T](x: T)(f: T => Unit): T

Apply a function and return the passed value

Inherited from:
MarkupParserCommon (hidden)
def saving[A, B](getter: A, setter: A => Unit)(body: => B): B

Execute body with a variable saved and restored after execution

Inherited from:
MarkupParserCommon (hidden)
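
To make the behaviour of returning and saving concrete, here is a standalone sketch that mirrors the two signatures above. It is an illustration written for this page, not a copy of the library source; CombinatorSketch, depth, buf and inside are hypothetical names.

```scala
object CombinatorSketch {
  // Apply f to x for its side effect, then hand x back.
  def returning[T](x: T)(f: T => Unit): T = { f(x); x }

  // Remember the current value, run body, and restore the value afterwards,
  // even if body throws.
  def saving[A, B](getter: A, setter: A => Unit)(body: => B): B = {
    val saved = getter
    try body
    finally setter(saved)
  }

  var depth = 0

  // The builder is returned after the append has been applied to it.
  val buf = returning(new StringBuilder)(sb => sb.append("abc"))

  // depth is changed inside the body and restored to its old value afterwards.
  val inside = saving(depth, (d: Int) => depth = d) {
    depth = 5
    depth * 2
  }
}
```
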
protected def unreachable: Nothing

Inherited from:
MarkupParserCommon (hidden)

def xAttributeValue(endCh: Char): String

Attribute value, terminated by either ' or ". The value may not contain <.

Value parameters

endCh

either ' or "

Inherited from:
MarkupParserCommon (hidden)

def xCharRef(ch: () => Char, nextch: () => Unit): String

CharRef ::= "&#" '0'..'9' {'0'..'9'} ";" | "&#x" '0'..'9'|'A'..'F'|'a'..'f' { hexdigit } ";"

see [66]

Inherited from:
MarkupParserCommon (hidden)
def xEQ(): Unit

scan [S] '=' [S]

Inherited from:
MarkupParserCommon (hidden)
def xEndTag(startName: String): Unit

[42] '<' xmlEndTag ::= '<' '/' Name S? '>'

Inherited from:
MarkupParserCommon (hidden)
def xName: String

Actually, Name ::= (Letter | '_' | ':') (NameChar)*, but starting with ':' cannot happen, so:

Name ::= (Letter | '_') (NameChar)*

See [5] of the XML 1.0 specification.

pre-condition: ch != ':' // assured by definition of XMLSTART token
post-condition: name neither starts nor ends with ':'

Inherited from:
MarkupParserCommon (hidden)
def xProcInstr: ElementType

'?' {Char})]'?>'

see [15]

Inherited from:
MarkupParserCommon (hidden)
def xSpace(): Unit

scan [3] S ::= (#x20 | #x9 | #xD | #xA)+

Inherited from:
MarkupParserCommon (hidden)
def xSpaceOpt(): Unit

skip optional space S?

Inherited from:
MarkupParserCommon (hidden)
protected def xTag(pscope: NamespaceType): (String, AttributesType)

Parses a start or empty tag.

[40] STag         ::= '<' Name { S Attribute } [S]
[44] EmptyElemTag ::= '<' Name { S Attribute } [S]

Inherited from:
MarkupParserCommon (hidden)
protected def xTakeUntil[T](handler: (PositionType, String) => T, positioner: () => PositionType, until: String): T

Take characters from input stream until given String "until" is seen. Once seen, the accumulated characters are passed along with the current Position to the supplied handler function.

Inherited from:
MarkupParserCommon (hidden)
def xToken(that: Seq[Char]): Unit

Inherited from:
MarkupParserCommon (hidden)
def xToken(that: Char): Unit

Inherited from:
MarkupParserCommon (hidden)

Abstract fields

val input: Source
val preserveWS: Boolean

If true, does not remove surplus whitespace.

Concrete fields

protected val cbuf: StringBuilder

character buffer, for names

protected var curInput: Source
protected var doc: Document
var dtd: DTD
var extIndex: Int

stack of inputs

holds the next character

var pos: Int

holds the position in the source file

var tmppos: Int

holds temporary values of pos
