package yaidom

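Yaidom is yet another Scala immutable DOM-like XML API. The other well-known Scala immutable DOM-like APIs are the standard scala.xml API and Anti-XML. The latter API is considered by many to be an improvement over the former, but both APIs aim for an XPath-like query experience, and neither matches the namespace support and (functional) update support that yaidom aims for.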

Yaidom takes a different approach, avoiding XPath-like query support, and offering good namespace and decent (functional) update support. Yaidom is also characterized by mathematical precision and clarity. Still, the API remains practical and pragmatic. In particular, the API user has much configuration control over parsing and serialization, because yaidom exposes the underlying JAXP parsers and serializers, which can be configured by the library user.

Yaidom chooses its battles. For example, given that DTDs do not know about namespaces, yaidom offers good namespace support, but ignores DTDs entirely. Of course the underlying XML parser may still validate XML against a DTD, if so desired. As another example, yaidom tries to leave the handling of the gory details of XML processing (such as whitespace handling) as much as possible to JAXP (and JAXP parser/serializer configuration). As yet another example, yaidom knows nothing about (XML Schema) types of elements and attributes.

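Yaidom, and in particular this package, contains the following layers, each described in a section below: basic concepts (such as qualified and expanded names), the uniform query API traits, and some element implementations that mix in those traits.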

Reading this documentation is worthwhile, because it helps in getting up to speed with yaidom.

Basic concepts

In real-world XML, elements (and sometimes attributes) tend to have names within a certain namespace. There are 2 kinds of names at play here: qualified names and expanded names.

They are represented by immutable classes eu.cdevreeze.yaidom.QName and eu.cdevreeze.yaidom.EName, respectively.

Qualified names occur in XML, whereas expanded names do not. Yet qualified names have no meaning on their own. They need to be resolved to expanded names, via the in-scope namespaces. Note that the term "qualified name" is often used for what yaidom (and the Namespaces specification) calls "expanded name", and that most XML APIs do not distinguish between the 2 kinds of names. Yaidom has to clearly make this distinction, in order to model namespaces correctly.

To resolve qualified names to expanded names, yaidom distinguishes between namespace declarations and in-scope namespaces.

They are represented by immutable classes eu.cdevreeze.yaidom.Declarations and eu.cdevreeze.yaidom.Scope, respectively.

Namespace declarations occur in XML, whereas in-scope namespaces do not. The latter are the accumulated effect of the namespace declarations of the element itself, if any, and those in ancestor elements.

Note: in the code examples below, we assume the following import:

import eu.cdevreeze.yaidom._

To see the resolution of qualified names in action, consider the following sample XML:

<book:Bookstore xmlns:book="http://bookstore/book" xmlns:auth="http://bookstore/author">
  <book:Book ISBN="978-0321356680" Price="35" Edition="2">
    <book:Title>Effective Java (2nd Edition)</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Joshua</auth:First_Name>
        <auth:Last_Name>Bloch</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
  <book:Book ISBN="978-0981531649" Price="35" Edition="2">
    <book:Title>Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition</book:Title>
    <book:Authors>
      <auth:Author>
        <auth:First_Name>Martin</auth:First_Name>
        <auth:Last_Name>Odersky</auth:Last_Name>
      </auth:Author>
      <auth:Author>
        <auth:First_Name>Lex</auth:First_Name>
        <auth:Last_Name>Spoon</auth:Last_Name>
      </auth:Author>
      <auth:Author>
        <auth:First_Name>Bill</auth:First_Name>
        <auth:Last_Name>Venners</auth:Last_Name>
      </auth:Author>
    </book:Authors>
  </book:Book>
</book:Bookstore>

Consider the last element with qualified name QName("book:Book"). To resolve this qualified name to an expanded name, we need to know the namespaces in scope at that element. To compute the in-scope namespaces, we need to accumulate the namespace declarations of the last book:Book element and of its ancestor element(s), starting with the root element.

The starting Scope is the "parent scope" of the root element, Scope.Empty. Then, in the root element we find namespace declarations:

Declarations.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author")

This leads to the following namespaces in scope at the root element:

Scope.Empty.resolve(Declarations.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

which is equal to:

Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author")

We find no other namespace declarations in the last book:Book element or its ancestor(s), so the computed scope is also the scope of the last book:Book element.

Then QName("book:Book") is resolved as follows:

Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author").resolveQNameOption(QName("book:Book"))

which is equal to:

Some(EName("{http://bookstore/book}Book"))
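The resolution steps above can be modelled in a few lines of plain Scala. The sketch below is a simplified stand-in that does not use yaidom itself: the names ScopeSketch, resolveDeclarations and resolveQName are hypothetical, names are plain strings, and it assumes that prefix "" denotes the default namespace and that an empty URI in the declarations means an undeclaration.

```scala
object ScopeSketch {
  // Hypothetical simplified model: prefix -> namespace URI ("" is the default namespace).
  type Scope = Map[String, String]
  // Declarations have the same shape; an empty URI is assumed to be an undeclaration.
  type Declarations = Map[String, String]

  // Accumulates the declarations of one element onto the scope of its parent.
  def resolveDeclarations(parent: Scope, decls: Declarations): Scope = {
    val (undeclared, declared) = decls.partition { case (_, uri) => uri.isEmpty }
    (parent ++ declared) -- undeclared.keySet
  }

  // Resolves "prefix:local" or "local" to an expanded name written as "{uri}local".
  // Unprefixed names pick up the default namespace, if any (as for element names).
  def resolveQName(scope: Scope, qname: String): Option[String] =
    qname.split(':') match {
      case Array(prefix, local) => scope.get(prefix).map(uri => s"{$uri}$local")
      case Array(local) =>
        Some(scope.get("").map(uri => s"{$uri}$local").getOrElse(local))
      case _ => None
    }

  // In-scope namespaces at the root element of the sample XML.
  val rootScope: Scope = resolveDeclarations(
    Map.empty,
    Map("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))
}
```

With this model, resolving "book:Book" against rootScope yields the expanded name in the book namespace, while the same lookup against an empty scope yields None, because the prefix is unbound.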

This namespace support in yaidom has mathematical rigor. The immutable classes QName, EName, Declarations and Scope have precise definitions, reflected in their implementations, and they obey some interesting properties. For example, if we correctly define Scope operation relativize (along with resolve), we get:

scope1.resolve(scope1.relativize(scope2)) == scope2

This may not sound like much, but by getting the basics right, yaidom succeeds in offering first-class support for XML namespaces, without the magic and namespace-related bugs often found in other XML libraries.
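To make the resolve/relativize property concrete, here is a minimal sketch on plain maps. It is not yaidom's implementation: RelativizeSketch and its functions are hypothetical, and the model assumes that an empty URI in the declarations stands for an undeclaration.

```scala
object RelativizeSketch {
  type Scope = Map[String, String]        // prefix -> URI ("" is the default namespace)
  type Declarations = Map[String, String] // empty URI = undeclaration (assumption)

  // Applies declarations (and undeclarations) to a scope.
  def resolve(scope: Scope, decls: Declarations): Scope = {
    val (undeclared, declared) = decls.partition { case (_, uri) => uri.isEmpty }
    (scope ++ declared) -- undeclared.keySet
  }

  // Minimal declarations needed to get from scope `from` to scope `to`.
  def relativize(from: Scope, to: Scope): Declarations = {
    // Bindings that are new or changed in `to`.
    val declared = to.filter { case (prefix, uri) => !from.get(prefix).contains(uri) }
    // Prefixes bound in `from` but absent in `to` must be undeclared.
    val undeclared = (from.keySet -- to.keySet).map(prefix => prefix -> "").toMap
    declared ++ undeclared
  }

  val scope1: Scope = Map("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author")
  val scope2: Scope = Map("book" -> "http://bookstore/book", "" -> "http://bookstore/default")
}
```

In this model, relativize(scope1, scope2) contains one declaration (for the default namespace) and one undeclaration (for prefix auth), and applying it to scope1 gives back scope2.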

There are 2 other basic concepts in this package, representing paths to elements: path builders and paths.

They are represented by immutable classes eu.cdevreeze.yaidom.PathBuilder and eu.cdevreeze.yaidom.Path, respectively.

Path builders are like canonical XPath expressions, yet they do not contain the root element itself, and indexing starts with 0 instead of 1.

For example, the last name of the first author of the last book element has path:

Path.from(
  EName("{http://bookstore/book}Book") -> 1,
  EName("{http://bookstore/book}Authors") -> 0,
  EName("{http://bookstore/author}Author") -> 0,
  EName("{http://bookstore/author}Last_Name") -> 0
)

This path could be written as a path builder as follows:

PathBuilder.from(QName("book:Book") -> 1, QName("book:Authors") -> 0, QName("auth:Author") -> 0, QName("auth:Last_Name") -> 0)

Using the Scope mentioned earlier, the latter path builder resolves to the path given before that, by invoking method PathBuilder.build(scope). In order for this to work, the Scope must be invertible. That is, there must be a one-to-one correspondence between prefixes ("" for the default namespace) and namespace URIs, because otherwise the index numbers may differ. Also note that the prefixes book and auth in the path builder are arbitrary, and need not match the prefixes used in the XML tree itself.
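How such a path identifies an element can be illustrated with a small model. The sketch below is an assumption-laden simplification, not yaidom code: PathSketch, Entry and findByPath are hypothetical names, and element names are plain local names instead of expanded names. Each entry holds a name and a 0-based index among the same-named child elements, and the root element is not part of the path.

```scala
object PathSketch {
  // Hypothetical minimal element tree: a name, optional text, and child elements.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty, text: String = "")

  // One path step: element name plus 0-based index among same-named siblings.
  final case class Entry(name: String, index: Int)

  // Follows the entries from the root downwards; an empty path denotes the root itself.
  def findByPath(root: Elem, path: List[Entry]): Option[Elem] = path match {
    case Nil => Some(root)
    case Entry(name, index) :: rest =>
      root.children.filter(_.name == name).lift(index).flatMap(child => findByPath(child, rest))
  }

  private def author(lastName: String): Elem =
    Elem("Author", Vector(Elem("Last_Name", text = lastName)))

  // Skeleton of the sample bookstore, keeping only the author last names.
  val bookstore: Elem = Elem("Bookstore", Vector(
    Elem("Book", Vector(Elem("Authors", Vector(author("Bloch"))))),
    Elem("Book", Vector(Elem("Authors", Vector(author("Odersky"), author("Spoon"), author("Venners")))))
  ))

  // Last name of the first author of the last book.
  val lastNamePath: List[Entry] =
    List(Entry("Book", 1), Entry("Authors", 0), Entry("Author", 0), Entry("Last_Name", 0))
}
```

Looking up lastNamePath in this tree finds the Last_Name element of the first author of the second book. An index that is out of range makes the lookup return None.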

Uniform query API traits

Yaidom provides a relatively small query API, to query an individual element for collections of child elements, descendant elements or descendant-or-self elements. The resulting collections are immutable Scala collections, which can be manipulated further using the Scala Collections API.

This query API is uniform, in that different element implementations share (most of) the same query API. It is also element-centric (unlike standard Scala XML and Anti-XML).

For example, consider the XML example given earlier, as a Scala XML literal named bookstore. We can wrap this Scala XML Elem into a yaidom wrapper of type eu.cdevreeze.yaidom.scalaxml.ScalaXmlElem, named bookstoreElem. Then we can query for all books, that is, all descendant-or-self elements with resolved (or expanded) name EName("{http://bookstore/book}Book"), as follows:

bookstoreElem filterElemsOrSelf (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))

The result would be an immutable IndexedSeq of ScalaXmlElem instances, holding 2 book elements.

We could instead have written:

bookstoreElem.filterElemsOrSelf(EName("{http://bookstore/book}Book"))

with the same result.

Instead of searching for appropriate descendant-or-self elements, we could have searched for descendant elements only, without altering the result in this case:

bookstoreElem filterElems (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))

or:

bookstoreElem.filterElems(EName("{http://bookstore/book}Book"))

We could even have searched for appropriate child elements only, without altering the result in this case:

bookstoreElem filterChildElems (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))

or:

bookstoreElem.filterChildElems(EName("{http://bookstore/book}Book"))

or, knowing that all child elements are books:

bookstoreElem.findAllChildElems

We could find all authors of the Scala book as follows:

for {
  bookElem <- bookstoreElem filterChildElems (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))
  if bookElem.attributeOption(EName("ISBN")) == Some("978-0981531649")
  authorElem <- bookElem filterElems (elem => elem.resolvedName == EName("{http://bookstore/author}Author"))
} yield authorElem

or:

for {
  bookElem <- bookstoreElem.filterChildElems(EName("{http://bookstore/book}Book"))
  if bookElem.attributeOption(EName("ISBN")) == Some("978-0981531649")
  authorElem <- bookElem.filterElems(EName("{http://bookstore/author}Author"))
} yield authorElem

We could even use operator notation, as follows:

for {
  bookElem <- bookstoreElem \ (elem => elem.resolvedName == EName("{http://bookstore/book}Book"))
  if (bookElem \@ EName("ISBN")) == Some("978-0981531649")
  authorElem <- bookElem \\ (elem => elem.resolvedName == EName("{http://bookstore/author}Author"))
} yield authorElem

or:

for {
  bookElem <- bookstoreElem \ EName("{http://bookstore/book}Book")
  if (bookElem \@ EName("ISBN")) == Some("978-0981531649")
  authorElem <- bookElem \\ EName("{http://bookstore/author}Author")
} yield authorElem

where \ stands for filterChildElems, \@ for attributeOption, and \\ for filterElemsOrSelf.

Now suppose the same XML is stored in an (org.w3c.dom) DOM tree, wrapped in an eu.cdevreeze.yaidom.dom.DomElem named bookstoreElem. Then the same queries would use exactly the same code as above. The result would be a collection of DomElem instances instead of ScalaXmlElem instances, however. There are many more element implementations in yaidom, and they share (most of) the same query API. Therefore this query API is called a uniform query API.

The last example, using operator notation, looks a bit more "XPath-like". It is more verbose than queries in Scala XML, however, partly because in yaidom these operators cannot be chained. Yet this is with good reason. Yaidom does not blur the distinction between elements and element collections, and therefore does not offer any XPath experience. The small price paid in verbosity is made up for by precision. The yaidom query API traits have very precise definitions of their operations, as can be seen in the corresponding documentation.

The uniform query API traits turn minimal APIs into richer APIs, where each richer API is defined very precisely in terms of the minimal API. The top-level query API trait is eu.cdevreeze.yaidom.ParentElemLike. It needs to be given a method implementation to query for child elements (not child nodes in general, but just child elements!), and it offers methods to query for some or all child elements, descendant elements, and descendant-or-self elements. That is, the minimal API consists of abstract method findAllChildElems, and it offers methods such as filterChildElems, filterElems and filterElemsOrSelf. This trait has no knowledge about elements at all, other than the fact that elements can have child elements.
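The idea of deriving a richer API from one abstract method can be sketched as follows. This is a miniature written for illustration, under the assumption that a straightforward recursive definition is acceptable. MiniParentElemLike and SimpleElem are hypothetical names, and the real trait's definitions and method set may differ.

```scala
object QueryApiSketch {
  // One abstract method (findAllChildElems); everything else is derived from it.
  trait MiniParentElemLike[E <: MiniParentElemLike[E]] { self: E =>
    def findAllChildElems: Vector[E]

    // Child elements obeying the predicate.
    def filterChildElems(p: E => Boolean): Vector[E] = findAllChildElems.filter(p)

    // All descendant-or-self elements, in document order (self first).
    def findAllElemsOrSelf: Vector[E] =
      self +: findAllChildElems.flatMap(_.findAllElemsOrSelf)

    // Descendant-or-self elements obeying the predicate.
    def filterElemsOrSelf(p: E => Boolean): Vector[E] = findAllElemsOrSelf.filter(p)

    // Descendant elements obeying the predicate (self is excluded).
    def filterElems(p: E => Boolean): Vector[E] =
      findAllChildElems.flatMap(_.filterElemsOrSelf(p))
  }

  // A trivial element implementation that only supplies the child elements.
  final case class SimpleElem(name: String, children: Vector[SimpleElem] = Vector.empty)
      extends MiniParentElemLike[SimpleElem] {
    def findAllChildElems: Vector[SimpleElem] = children
  }

  val bookstore: SimpleElem = SimpleElem("Bookstore", Vector(
    SimpleElem("Book", Vector(SimpleElem("Title"))),
    SimpleElem("Book", Vector(SimpleElem("Title")))
  ))
}
```

Any other element type that implements findAllChildElems gets the same derived queries for free, which is the sense in which such an API is uniform.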

Sub-trait eu.cdevreeze.yaidom.ElemLike adds minimal knowledge about elements themselves, viz. that elements have a "resolved" (or expanded) name, and "resolved" attributes (mapping attribute expanded names to attribute values). That is, it needs to be given implementations of abstract methods resolvedName and resolvedAttributes, and then offers methods to query for attributes or child/descendant/descendant-or-self elements with a given expanded name. The trait is trivially defined in terms of its super-trait.

It is important to note that yaidom does not consider namespace declarations to be attributes themselves. Otherwise, there would have been circular dependencies between both concepts, because attributes with namespaces require in-scope namespaces and therefore namespace declarations for resolving the names of these attributes.

Note that traits eu.cdevreeze.yaidom.ElemLike and eu.cdevreeze.yaidom.ParentElemLike only know about elements, not about other kinds of nodes. Of course the actual element implementations mixing in this query API know about other node types, but that knowledge is outside the uniform query API. Note that the example queries above only use the minimal element knowledge that traits ElemLike and ParentElemLike have about elements. Therefore the query code can be used unchanged for different element implementations.

The ElemLike trait has sub-trait eu.cdevreeze.yaidom.PathAwareElemLike. It adds knowledge about paths. Paths can be queried (in the same way that elements can be queried in trait ParentElemLike), and elements can be found given a path.

For example, to query for the Scala book authors, the following alternative code can be used (if the used element implementation mixes in trait PathAwareElemLike, which is not the case for the Scala XML and DOM wrappers above):

for {
  authorPath <- bookstoreElem filterElemOrSelfPaths (elem => elem.resolvedName == EName("{http://bookstore/author}Author"))
  if authorPath.entries.contains(Path.Entry(EName("{http://bookstore/book}Book"), 1))
} yield bookstoreElem.getElemOrSelfByPath(authorPath)

The PathAwareElemLike trait has sub-trait eu.cdevreeze.yaidom.UpdatableElemLike. This trait offers functional updates at given paths. Whereas the super-traits know only about elements, this trait knows that elements have some node super-type.
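A functional update at a path can be pictured with the following sketch. It is a simplified illustration and not the UpdatableElemLike API: UpdateSketch and updated are hypothetical names, and a path is reduced to a list of 0-based positions among all child elements. Only the elements along the path are copied; the rest of the tree is shared with the original.

```scala
object UpdateSketch {
  // Hypothetical minimal element: name, text content and child elements.
  final case class Elem(name: String, text: String = "", children: Vector[Elem] = Vector.empty)

  // Returns a new tree in which the element at the given path is replaced by f(element).
  // Assumes that every index in the path is valid; otherwise an exception is thrown.
  def updated(root: Elem, path: List[Int])(f: Elem => Elem): Elem = path match {
    case Nil => f(root)
    case idx :: rest =>
      val newChild = updated(root.children(idx), rest)(f)
      root.copy(children = root.children.updated(idx, newChild))
  }

  val book: Elem = Elem("Book", children = Vector(Elem("Title", "Effective Java"), Elem("Price", "35")))

  // The original `book` is left untouched; `cheaperBook` is a new tree.
  val cheaperBook: Elem = updated(book, List(1))(e => e.copy(text = "30"))
}
```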

Instead of functional updates at given paths, elements can also be "transformed" functionally without specifying any paths. This is offered by trait eu.cdevreeze.yaidom.TransformableElemLike, which unlike the traits above has no super-traits. The Scala XML and DOM wrappers above do not mix in this trait.
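A path-free transformation can be sketched as a bottom-up rewrite of the whole tree. The code below is an illustration only: TransformSketch and its transformElemsOrSelf function are hypothetical and are not claimed to match the signatures of TransformableElemLike.

```scala
object TransformSketch {
  // Hypothetical minimal element: a name and child elements.
  final case class Elem(name: String, children: Vector[Elem] = Vector.empty)

  // Applies f to every descendant-or-self element, bottom-up:
  // first the children are transformed, then the element holding the new children.
  def transformElemsOrSelf(elem: Elem)(f: Elem => Elem): Elem =
    f(elem.copy(children = elem.children.map(child => transformElemsOrSelf(child)(f))))

  val authors: Elem = Elem("Authors", Vector(Elem("Author"), Elem("Author")))

  // Rename every Author element; all other elements are returned unchanged.
  val renamed: Elem =
    transformElemsOrSelf(authors)(e => if (e.name == "Author") e.copy(name = "Writer") else e)
}
```

No paths are needed here, because the function f itself decides, element by element, whether anything changes.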

Some element implementations

The uniform query API traits, especially ParentElemLike and its sub-trait ElemLike, are mixed in by many element implementations. In this package there are 2 immutable element implementations, eu.cdevreeze.yaidom.ElemBuilder and eu.cdevreeze.yaidom.Elem.

Class eu.cdevreeze.yaidom.Elem is the default element implementation of yaidom. It extends class eu.cdevreeze.yaidom.Node. The latter also has sub-classes for text nodes, comments, entity references and processing instructions. Class eu.cdevreeze.yaidom.Document contains a document Elem, but is not a Node sub-class itself.

The eu.cdevreeze.yaidom.Elem class has the following characteristics:

Creating such Elem trees by hand is a bit cumbersome, partly because scopes have to be passed to each Elem in the tree. The latter is not needed if we use class eu.cdevreeze.yaidom.ElemBuilder to create element trees by hand. When the tree has been fully created as ElemBuilder, invoke method ElemBuilder.build(parentScope) to turn it into an Elem.

Like their super-classes Node and NodeBuilder, classes Elem and ElemBuilder have very much in common. Both are immutable, easy to compose (ElemBuilder instances even more so), equality is reference equality, etc. The most important differences are as follows:

The Effective Java book element in the XML example above could have been written as ElemBuilder (without the inter-element whitespace) as follows:

import NodeBuilder._

elem(
  qname = QName("book:Book"),
  attributes = Vector(QName("ISBN") -> "978-0321356680", QName("Price") -> "35", QName("Edition") -> "2"),
  children = Vector(
    elem(
      qname = QName("book:Title"),
      children = Vector(
        text("Effective Java (2nd Edition)")
      )
    ),
    elem(
      qname = QName("book:Authors"),
      children = Vector(
        elem(
          qname = QName("auth:Author"),
          children = Vector(
            elem(
              qname = QName("auth:First_Name"),
              children = Vector(
                text("Joshua")
              )
            ),
            elem(
              qname = QName("auth:Last_Name"),
              children = Vector(
                text("Bloch")
              )
            )
          )
        )
      )
    )
  )
)

This ElemBuilder (say, eb) lacks namespace declarations for prefixes book and auth. So, the following returns false:

eb.canBuild(Scope.Empty)

while the following returns true:

eb.canBuild(Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

Indeed,

eb.build(Scope.from("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author"))

returns the element tree as Elem.

Note that the distinction between ElemBuilder and Elem "solves" the mismatch that immutable ("functional") element trees are constructed in a bottom-up manner, while namespace scoping works in a top-down manner. (See also Anti-XML issue 78, at https://github.com/djspiewak/anti-xml/issues/78.)
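The bottom-up versus top-down mismatch can be illustrated with a small model. This is a sketch under simplifying assumptions, not yaidom's ElemBuilder: MiniElemBuilder and MiniElem are hypothetical, only element names are checked (not attributes), and namespace undeclarations are ignored. The builder tree is assembled bottom-up without scopes, and scopes are pushed down top-down only when build is called.

```scala
object BuilderSketch {
  type Scope = Map[String, String] // prefix -> namespace URI

  // Resulting element: every node carries its full in-scope namespaces.
  final case class MiniElem(qname: String, scope: Scope, children: Vector[MiniElem])

  // Builder node: knows only its own namespace declarations, not the full scope.
  final case class MiniElemBuilder(
      qname: String,
      declarations: Map[String, String] = Map.empty,
      children: Vector[MiniElemBuilder] = Vector.empty) {

    private def prefixOption: Option[String] = {
      val i = qname.indexOf(':')
      if (i < 0) None else Some(qname.substring(0, i))
    }

    // True if all element name prefixes in this tree can be resolved, given the parent scope.
    def canBuild(parentScope: Scope): Boolean = {
      val scope = parentScope ++ declarations
      prefixOption.forall(scope.contains) && children.forall(_.canBuild(scope))
    }

    // Pushes scopes down the tree, top-down, turning builders into elements.
    def build(parentScope: Scope): MiniElem = {
      require(canBuild(parentScope), s"Unresolved prefix in or below $qname")
      val scope = parentScope ++ declarations
      MiniElem(qname, scope, children.map(_.build(scope)))
    }
  }

  val eb: MiniElemBuilder =
    MiniElemBuilder("book:Book", children = Vector(MiniElemBuilder("auth:Author")))

  val scope: Scope = Map("book" -> "http://bookstore/book", "auth" -> "http://bookstore/author")
}
```

As in the example above, this builder cannot be built against an empty parent scope, but it can be built against a scope that binds both prefixes.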

There are many more element implementations in yaidom, most of them in sub-packages of this package. Yaidom is extensible in that new element implementations can be invented, for example elements that are better "roundtrippable" (at the expense of "composability"), or yaidom wrappers around other DOM-like APIs (such as XOM or JDOM2). Besides Elem and ElemBuilder in this package, the current element implementations in yaidom live in the sub-packages resolved, indexed, docaware, dom and scalaxml (see the package list under Value Members).

This illustrates that especially trait ElemLike is a uniform query API in yaidom.

Packages and dependencies

Yaidom has the following packages, and layering between packages:

Indeed, all yaidom package dependencies are uni-directional.


Type Members

  1. final case class Comment(text: String) extends Node with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  2. final case class CommentBuilder(text: String) extends NodeBuilder with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  3. trait ConverterToDocument[A] extends AnyRef

    Converter from A (which can be anything) to eu.cdevreeze.yaidom.Document.

  4. trait ConverterToElem[A] extends AnyRef

    Converter from A (which can be anything) to eu.cdevreeze.yaidom.Elem.

  5. final case class Declarations(prefixNamespaceMap: Map[String, String]) extends Immutable with Product with Serializable

    Namespace declarations (and undeclarations), typically at the level of one element.

  6. final class DocBuilder extends DocumentApi[ElemBuilder] with Immutable with Serializable

    Builder of a yaidom Document.

  7. final class Document extends DocumentApi[Elem] with Immutable with Serializable

    Document.

  8. trait DocumentApi[E <: ParentElemApi[E]] extends AnyRef

    Minimal API for Documents, having a type parameter for the element type.

  9. trait DocumentConverter[A] extends AnyRef

    Converter from eu.cdevreeze.yaidom.Document to A (which can be anything, such as a DOM Document).

  10. final case class EName(namespaceUriOption: Option[String], localPart: String) extends Immutable with Product with Serializable

    Expanded name.

  11. final class Elem extends Node with UpdatableElemLike[Node, Elem] with TransformableElemLike[Node, Elem] with HasText

    Immutable, thread-safe element node.

  12. trait ElemApi[E <: ElemApi[E]] extends ParentElemApi[E]

    This is the best known part of the yaidom uniform query API.

  13. final class ElemBuilder extends NodeBuilder with ParentElemLike[ElemBuilder] with TransformableElemLike[NodeBuilder, ElemBuilder]

    Builder for elements.

  14. trait ElemConverter[A] extends AnyRef

    Converter from eu.cdevreeze.yaidom.Elem to A (which can be anything, such as a DOM Element).

  15. trait ElemLike[E <: ElemLike[E]] extends ParentElemLike[E] with ElemApi[E]

    API and implementation trait for elements as containers of elements, each having a name and possible attributes.

  16. final case class EntityRef(entity: String) extends Node with Product with Serializable

    An entity reference.

  17. final case class EntityRefBuilder(entity: String) extends NodeBuilder with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  18. trait HasParent[E <: HasParent[E]] extends AnyRef

    API and implementation trait for elements that can be asked for the ancestor elements, if any.

  19. trait HasText extends AnyRef

    Trait defining the contract for elements as text containers.

  20. sealed trait Node extends Immutable with Serializable

    Immutable XML Node.

  21. sealed trait NodeBuilder extends Immutable with Serializable

    DSL to build Elems (or Documents) without having to pass parent Scopes around.

  22. trait ParentElemApi[E <: ParentElemApi[E]] extends AnyRef

    This is the foundation of the yaidom uniform query API.

  23. trait ParentElemLike[E <: ParentElemLike[E]] extends ParentElemApi[E]

    API and implementation trait for elements as containers of elements, as element nodes in a node tree.

  24. final class Path extends Immutable

    Unique identification of a descendant (or self) Elem given a root Elem.

  25. trait PathAwareElemApi[E <: PathAwareElemApi[E]] extends ElemApi[E]

    This is the Path-aware part of the yaidom uniform query API.

  26. trait PathAwareElemLike[E <: PathAwareElemLike[E]] extends ElemLike[E] with PathAwareElemApi[E]

    API and implementation trait for elements as containers of elements, each having a name and possible attributes, as well as having awareness of paths.

  27. final class PathBuilder extends Immutable

    Builder for Path instances.

  28. final case class PrefixedName(prefix: String, localPart: String) extends QName with Product with Serializable

  29. final case class ProcessingInstruction(target: String, data: String) extends Node with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  30. final case class ProcessingInstructionBuilder(target: String, data: String) extends NodeBuilder with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  31. sealed trait QName extends Immutable with Serializable

    Qualified name.

  32. final case class Scope(prefixNamespaceMap: Map[String, String]) extends Immutable with Product with Serializable

    Scope mapping prefixes to namespace URIs, as well as holding an optional default namespace.

  33. final case class Text(text: String, isCData: Boolean) extends Node with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  34. final case class TextBuilder(text: String, isCData: Boolean) extends NodeBuilder with Product with Serializable

    Annotations: @SerialVersionUID(1L)

  35. trait TransformableElemApi[N, E <: N with TransformableElemApi[N, E]] extends AnyRef

    This is the element transformation part of the yaidom query and update API.

  36. trait TransformableElemLike[N, E <: N with TransformableElemLike[N, E]] extends TransformableElemApi[N, E]

    API and implementation trait for transformable elements.

  37. final case class UnprefixedName(localPart: String) extends QName with Product with Serializable

  38. trait UpdatableElemApi[N, E <: N with UpdatableElemApi[N, E]] extends PathAwareElemApi[E]

    This is the functional update part of the yaidom uniform query API.

  39. trait UpdatableElemLike[N, E <: N with UpdatableElemLike[N, E]] extends PathAwareElemLike[E] with UpdatableElemApi[N, E]

    API and implementation trait for functionally updatable elements.

  40. type ElemPath = Path

    Annotations: @deprecated

    Deprecated (since version 0.7.1): use Path instead.

  41. type ElemPathBuilder = PathBuilder

    Annotations: @deprecated

    Deprecated (since version 0.7.1): use PathBuilder instead.

Value Members

  1. object Declarations extends Serializable

  2. object DocBuilder extends Serializable

  3. object Document extends Serializable

  4. object EName extends Serializable

  5. object Elem extends Serializable

  6. object ElemApi

    This companion object offers some convenience factory methods for "element predicates", that can be used in yaidom queries.

  7. object Node extends Serializable

    This singleton object contains a DSL to easily create deeply nested Elems.

  8. object NodeBuilder extends Serializable

  9. object Path

  10. object PathBuilder

  11. object QName extends Serializable

  12. object Scope extends Serializable

  13. object TreeReprParsers extends JavaTokenParsers

    Generator for parsers of "tree representation" expressions.

  14. package convert

    Support for conversions from/to yaidom.

  15. package docaware

    This package contains element representations that contain the "context" of the element, including the URI of the containing document.

  16. package dom

    Wrapper around class org.w3c.dom.Element, adapting it to the eu.cdevreeze.yaidom.ElemLike API.

  17. package indexed

    This package contains element representations that contain the "context" of the element.

  18. package parse

    Support for parsing XML into yaidom Documents and Elems.

  19. package print

    Support for "printing" yaidom Documents and Elems.

  20. package resolved

    This package contains element representations that can be compared for (some notion of "value") equality, unlike normal yaidom nodes.

  21. package scalaxml

    Wrapper around class scala.xml.Elem, adapting it to the eu.cdevreeze.yaidom.ElemLike API.

Deprecated Value Members

  1. val ElemPath: Path.type

    Annotations: @deprecated

    Deprecated (since version 0.7.1): use Path instead.

  2. val ElemPathBuilder: PathBuilder.type

    Annotations: @deprecated

    Deprecated (since version 0.7.1): use PathBuilder instead.
