A special implementation of scala.collection.Map[com.codecommit.antixml.QName, String]
with nice overloading and some implicit magic, designed to hold element
attributes in com.codecommit.antixml.Elem. The actual stored keys are of
type com.codecommit.antixml.QName. This is how (optional) namespace
information for attributes is stored in Anti-XML trees. However, there are
some syntactic tricks which allow you to ignore
the com.codecommit.antixml.QName boilerplate when you don't actually need
namespace support. For example:
val attrs = Attributes("foo" -> "bar", "baz" -> "bin")
attrs("foo")                  // => "bar"
attrs(QName(None, "foo"))     // => "bar"
val attrs2 = attrs + ("even" -> "more")                      // => Attributes(...)
val attrs3 = attrs + (QName(Some("pre"), "even") -> "less")  // => Attributes(...)
With very, very few exceptions, String and com.codecommit.antixml.QName are
interchangeable.
Of course, this is being done with implicit conversions. However, you don't
need to worry about the conversion String => QName polluting the implicit
dispatch space! The conversion is defined on the companion object
for com.codecommit.antixml.QName, meaning that the compiler will only
select it when the result of an expression is explicitly of type QName.
It will not automatically inject the conversion to satisfy method dispatch on
String. For example:
val str = "fubar"
val qn: QName = str    // works!
str.ns                 // won't compile!
In this example, it is important to note that ns is a method
on com.codecommit.antixml.QName. Thus, if the implicit conversion were
pimp-enabling, the compiler would have accepted the last line of the example.
However, as you can see, the implicit dispatch space has not been cluttered
while the convenience of String rather than com.codecommit.antixml.QName has
been preserved.
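The scoping rule described above can be reproduced without Anti-XML. The following is a minimal sketch, assuming a hypothetical stand-in QName case class (its fields are illustrative and are not the library's definition) whose companion object carries the String => QName conversion:

```scala
import scala.language.implicitConversions

object CompanionConversionSketch {
  // Hypothetical stand-in for com.codecommit.antixml.QName (assumption for this sketch).
  final case class QName(ns: Option[String], name: String)

  object QName {
    // Defined on the companion, so the compiler only finds it when a QName is expected.
    implicit def stringToQName(str: String): QName = QName(None, str)
  }

  val str = "fubar"
  val qn: QName = str   // compiles: the expected type is QName, so its companion is searched
  // str.ns             // does not compile: the conversion is not in scope for member lookup on String
}
```

Because the conversion is reachable only through the expected type, no extra members appear on String anywhere in the program.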
One implicit space-cluttering that couldn't be avoided is the conversion
defined as (String, String) => (QName, String). This is required to enable
the nice String syntax on things like the + method and the companion
object apply factory. Unfortunately, this conversion had to be defined in
the com.codecommit.antixml package object. Fortunately, it is a conversion
within the same type (simply different parameters passed to Tuple2). Thus,
it shouldn't cause any scoping problems.
A node containing a single string, representing unescaped character data in the XML tree. For example:
<![CDATA[Lorem ipsum & dolor sit amet]]>
This would result in the following node:
CDATA("Lorem ipsum & dolor sit amet")
Note that reserved characters (as defined by the XML 1.0 spec) are not
escaped when calling toString. If you need a text representation which
performs escaping, use com.codecommit.antixml.Text.
A factory for com.codecommit.antixml.Zipper instances.
WARNING: This is a "low-level" trait that was primarily designed for internal
use of the antixml package. It is tied to the Zipper implementation and
could change significantly in a future release.
This trait is similar to CanBuildFrom, except that it allows a zipper context to be specified in addition to the usual sequence of items. See the com.codecommit.antixml.Zipper trait for the definition of "zipper context".
The Builder produced by this class accepts objects of type com.codecommit.antixml.CanBuildFromWithZipper.ElemsWithContext. These objects contain the following information:
Note that an ElemsWithContext may contain an empty sequence, in which case its path (hole)
is added to the zipper context without being associated with any items. Also note that it is
legal for the same path (hole) to be added to the Builder multiple times. The resulting Zipper
will associate all of the corresponding items with that hole.
The parent of the zipper context is specified to the apply method of this trait.
The type of collection that is producing the zipper.
The type of nodes to be contained in the result (if any).
the type of collection being produced.
A marker interface for scala.collection.generic.CanBuildFrom instances that can be lifted into com.codecommit.antixml.CanBuildFromWithZipper instances that operate on com.codecommit.antixml.Node types.
Pimp container for the explicit conversions into Anti-XML types. Out of the
box, conversions are provided from scala.xml types. However, this mechanism
is very extensible due to the use of a typeclass (com.codecommit.antixml.XMLConvertable)
to represent the actual conversion. Thus, it is possible to add conversions
by defining an implicit instance of the typeclass and having it in scope. It
is even possible to override the built-in conversions for scala.xml types
simply by shadowing the conversions for types like scala.xml.Elem. The
built-in conversions are defined in such a way that Scala's implicit resolution
will give precedence to almost anything you define, as long as it is somehow
in scope.
An XML element consisting of an optional namespace prefix, a name (or identifier), a set of attributes, a namespace prefix scope (mapping of prefixes to namespace URIs), and a sequence of child nodes. For example:
<span id="foo" class="bar">Lorem ipsum</span>
This would result in the following node:
Elem(None, "span", attrs = Attributes("id" -> "foo", "class" -> "bar"), children = Group(Text("Lorem ipsum")))
TODO: Consider making Elem not a case class and handling things a different way.
A node representing an entity reference. For example:
&hellip;
This would result in the following node:
EntityRef("hellip")
Represents a collection of arbitrary nodes (com.codecommit.antixml.Node).
Note that this collection need not have a single root parent element. Thus,
a valid Group could be as follows:
Group(EntityRef("quot"), Text("Daniel is "), Elem(None, "em", Attributes(), Map(), Group(Text("delusional!"))), EntityRef("quot"))
This would correspond to the following XML fragment (note: not actually well-formed XML, since it is lacking a single root element):
&quot;Daniel is <em>delusional!</em>&quot;
Note that unlike scala.xml, Group is not a special type of com.codecommit.antixml.Node!
This design decision has a very profound impact on the framework as a whole.
In general, the result is an API which is more consistent and more predictable
than it otherwise would have been. However, it also resulted in some unfortunate
sacrifices: specifically, full XPath semantics. The exact semantics of the
\ and \\ operators are defined in their respective scaladocs.
Group is parameterized based on the type of Node it contains. In the
general case (such as the one illustrated above), this will be exactly Node.
However, there are some very common cases wherein a Group may have a more
specific type than just Node. For example:
val ns: Group[Node] = ...
val results = ns \ "name"
In this example, results will have type Group[Elem]. This is because the
selector employed ("name") can only produce results of type Elem. This
mechanism forms the basis for the typed selectors mechanism, which is extremely
powerful and serves to eliminate a great deal of boilerplate casting when
traversing XML hierarchies.
In the general case, Group is backed by an instance of scala.collection.immutable.Vector.
This implementation detail is significant as it implies two things. First,
the implementation of Group is truly immutable, meaning that there are no
tricky concurrency semantics to worry about. Second, unlike scala.xml (which
backs its sequences by either List or ArrayBuffer, depending on phase of
the moon), it is possible to perform efficient random-access and updates
across the entire Group. Random access is implemented by the apply method,
while random "updates" are implemented by the updated method. Fast prepend
and append operations are also available.
Beyond this, all standard collection operations are available on Group (e.g.
flatMap, exists, collect, slice, etc). The appropriate incantations
have been spoken to allow these methods to return the correct type. Thus, if
you map over a Group and your function returns something which extends
Node, the result will be a Group. If your function returns something which
does not extend Node (e.g. Int), then the result will be something
else (probably a generic IndexedSeq backed by Vector). Group itself
extends scala.collection.immutable.IndexedSeq and thus can be used in
situations which require this abstraction.
Allow these to be mixed in where needed, instead of having to import the package object.
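Root of the Node ADT, representing the different types of supported XML
nodes which may appear in an XML fragment. The ADT itself has the following
shape (Haskell syntax):
type Prefix = Maybe String
type Scope = Map String String
data Node = ProcInstr String String
          | Elem Prefix String Attributes Scope (Group Node)
          | Text String
          | CDATA String
          | EntityRef String
For those that don't find Haskell to be the clearest explanation of what's
going on in this type, here is a more natural-language version. The Node
trait is sealed and has exactly five subclasses, each implementing a different
type of XML node. These five classes are as follows:
com.codecommit.antixml.ProcInstr – A processing instruction consisting of a target and some data
com.codecommit.antixml.Elem – An XML element consisting of an optional namespace prefix, a name (or identifier), a set of attributes, a namespace prefix scope and a sequence of child nodes
com.codecommit.antixml.Text – A node containing a single string, representing character data in the XML tree
com.codecommit.antixml.CDATA – A node containing a single string, representing unescaped character data in the XML tree
com.codecommit.antixml.EntityRef – An entity reference (e.g. &amp;)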
Defines a SAX2 handler which produces an instance
of com.codecommit.antixml.Group[com.codecommit.antixml.Elem] as
a result. This is the handler which is used internally by com.codecommit.antixml.SAXParser.
It is provided as part of the public API to allow Anti-XML to be used with
alternative SAX2 event sources (such as HTML parsers like TagSoup). The
resulting com.codecommit.antixml.Group is obtained (at the conclusion of
the parse) from the result() method.
A processing instruction consisting of a target and some data. For example:
<?xml version="1.0"?>
This would result in the following node:
ProcInstr("xml", "version=\"1.0\"")
An XML parser built on top of org.xml.sax. This implements the same
API as com.codecommit.antixml.StAXParser, but the runtime performance is
on the order of 13% slower. The SAX2 event handler used under the surface is
part of the public API in the form of com.codecommit.antixml.NodeSeqSAXHandler.
An XML parser built on top of javax.xml.stream. This implements the same
API as com.codecommit.antixml.SAXParser, but the runtime performance is
on the order of 12% faster.
A node containing a single string, representing character data in the XML tree. For example:
Lorem ipsum &amp; dolor sit amet
This would result in the following node:
Text("Lorem ipsum & dolor sit amet")
Note that reserved characters (as defined by the XML 1.0 spec) are escaped
when calling toString. Thus, if you invoke toString on the Text node
given in the example above, the result will reverse back into the original
text, including the &amp; escape. If you need a text representation which
does not escape characters on output, use com.codecommit.antixml.CDATA.
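As an illustration of the escaping behavior described above, here is a minimal sketch of the idea. The escape helper and the set of characters it handles are assumptions made for this example; the library's own implementation may cover a different set of reserved characters.

```scala
object EscapeSketch {
  // Hypothetical helper (not part of Anti-XML): replaces a common subset of
  // reserved XML characters with their predefined entity references.
  def escape(text: String): String =
    text.map {
      case '&' => "&amp;"
      case '<' => "&lt;"
      case '>' => "&gt;"
      case '"' => "&quot;"
      case c   => c.toString
    }.mkString

  // The string held by the Text node in the example above.
  val escaped: String = escape("Lorem ipsum & dolor sit amet")
  // escaped == "Lorem ipsum &amp; dolor sit amet"
}
```

A CDATA node would correspond to skipping this step and emitting the stored string unchanged.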
Typeclass definition for conversions used by the com.codecommit.antixml.Converter pimp.
Note that this type is exactly isomorphic to scala.Function1, right
down to the method name (apply). Normally, such a class would in fact extend
A => B, rather than simply emulating its interface. However, because most
instances of XMLConvertable will be implicit, we cannot blithely extend
Function1. To do so would pollute the scope with an unexpected proliferation
of implicit conversions which would be automatically injected by the Scala
compiler, rather than allowing us to tag them explicitly using the convert
method.
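The design choice above can be sketched in isolation. The names below (Convertable, Converter, toConverter, intToLabel) are hypothetical and chosen for this illustration only; they mirror the shape of the mechanism, not the library's source. The typeclass has an apply method like Function1 but does not extend it, so an implicit instance is never picked up as an implicit view and the conversion only runs when convert is called.

```scala
import scala.language.implicitConversions

object ConvertSketch {
  // Shaped like Function1, but deliberately NOT a subtype of A => B.
  trait Convertable[A, B] {
    def apply(a: A): B
  }

  // Wrapper that adds an explicit convert method to any value.
  final class Converter[A](a: A) {
    def convert[B](implicit conversion: Convertable[A, B]): B = conversion(a)
  }

  implicit def toConverter[A](a: A): Converter[A] = new Converter(a)

  // An implicit typeclass instance; the compiler cannot use it as a view Int => String.
  implicit val intToLabel: Convertable[Int, String] = new Convertable[Int, String] {
    def apply(a: Int): String = "#" + a
  }

  val n = 42
  val explicitResult = n.convert   // "#42", requested explicitly
  // val silent: String = n        // does not compile: no implicit view from Int to String
}
```

Had Convertable extended Int => String, the commented-out line would compile, which is the kind of silent injection the text warns about.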
A trait for objects which construct antixml from XML sources.
Provides an unselect operation which copies this Group's nodes back to the XML tree from which
it was derived. See the Anti-XML Overview for a high-level description of this functionality.
The Zipper trait augments a com.codecommit.antixml.Group with additional immutable state used
to support the unselect method. This state is known as the "zipper context" and is defined by:
A reference to another Group, known as the parent of the Zipper.
A set of locations within the parent, known as the holes of the Zipper.
A mapping from the nodes of the Zipper to its holes, known as the Zipper's replacement map.
Loosely speaking, the unselect method produces an updated version of the
parent by replacing its holes with the nodes from the Zipper, as determined by the replacement map.
A formal definition of unselect can be found below.
Certain "modify" operations on a Zipper will propagate the zipper context to the result.
The new Zipper's unselect method can then be viewed as applying
these modifications back to the parent tree. Currently, the following methods
support this propagation of the zipper context:
updated, map, flatMap, filter, collect, slice, drop, take, splitAt, and
unselect (the latter viewed as a modification of the parent Zipper).
These operations all provide a natural identification of indices in the new Zipper with the indices they were derived from in the original. This identification is used to lift the replacement map to the new Zipper. The parent and holes of the new Zipper are always the same as those of the original.
Of course, propagation is only possible if the result can legally be a Zipper.
Replacing a Node with a String, for example, will result in an undecorated IndexedSeq
because the result violates Zipper's type bounds.
A Zipper's replacement map need be neither injective nor surjective.
Injectivity can fail due to the action of flatMap, which replaces a node with a sequence of nodes,
all of which are associated with the original node's hole. In such cases, unselect will replace
the hole with the entire sequence of nodes mapping to it. Surjectivity can fail due to any
operation that "removes" items from the Zipper. If a hole is not associated with any Zipper nodes,
then unselect will remove that position from the resulting tree.
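To make the injectivity and surjectivity cases concrete, here is a toy model of the idea. It is only a sketch under strong simplifying assumptions: the parent is a flat Vector[String] rather than an XML tree, holes are plain indices, and FlatZipper and select are hypothetical names invented for this illustration, not Anti-XML types.

```scala
object UnselectSketch {
  // nodes pairs each zipper item with the hole (parent index) it maps to,
  // which plays the role of the replacement map.
  final case class FlatZipper(parent: Vector[String], holes: Set[Int], nodes: Vector[(Int, String)]) {
    // Every produced item stays associated with the hole of the item it came from.
    def flatMap(f: String => Seq[String]): FlatZipper =
      copy(nodes = nodes.flatMap { case (hole, n) => f(n).map(item => hole -> item) })

    // Removing items can leave a hole with nothing mapped to it.
    def filter(p: String => Boolean): FlatZipper =
      copy(nodes = nodes.filter { case (_, n) => p(n) })

    // Holes are replaced by all items mapped to them; other positions are kept as-is.
    def unselect: Vector[String] =
      parent.indices.toVector.flatMap { i =>
        if (holes(i)) nodes.collect { case (h, n) if h == i => n }
        else Vector(parent(i))
      }
  }

  def select(parent: Vector[String])(p: String => Boolean): FlatZipper = {
    val hits = parent.zipWithIndex.collect { case (n, i) if p(n) => i -> n }
    FlatZipper(parent, hits.map(_._1).toSet, hits)
  }

  val z = select(Vector("a", "b", "c", "d"))(n => n == "b" || n == "d")
  val doubled = z.flatMap(n => Seq(n, n.toUpperCase)).unselect // Vector(a, b, B, c, d, D)
  val dropped = z.filter(_ != "d").unselect                    // Vector(a, b, c)
}
```

In doubled, two items map to each hole (the map is not injective) and both are spliced in; in dropped, the hole for "d" has no items (the map is not surjective) and the position disappears.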
For a given Zipper, a hole, H, is said to be conflicted if the Zipper contains another hole,
Hc, contained in the subtree at H. In this case, the Zipper is said to be
conflicted at H. A Zipper that does not contain conflicted holes is said to be conflict free.
Conflicted holes arise when a selection operator yields both a node and one or more of its descendants.
They are of concern because there is no canonical way to specify the behavior of unselect at a
conflicted hole. Instead, a com.codecommit.antixml.ZipperMergeStrategy, implicitly provided
to unselect, is used to resolve the conflict.
A default ZipperMergeStrategy has been provided that should suffice for the most common use cases involving
conflicted holes. In particular, if modifications to a conflicted element are limited to its top-level properties
(name, attributes, etc.), then the default strategy will apply those changes while preserving any modifications
made to those descendant nodes also present in the Zipper. However, if the children property of a conflicted element
is directly modified, then the default strategy's behavior is formally unspecified. Currently it uses a heuristic
algorithm to resolve conflicts, but its details may change in a future release.
Of the com.codecommit.antixml.Selectable operators, only \\ is capable of producing conflicts.
The select, \, and \\! operators always produce conflict-free Zippers.
Let G be a group, and Z be a zipper with G as its parent. For each location, L, in G (top-level or otherwise), we make the following definitions:
children(L) is the sequence of locations that are immediately below L in G.
The direct updates for L are the nodes of Z that the replacement map associates with L.
The indirect update for L is the node at L in G with its children replaced by children(L).flatMap(pullback).
pullback(L) is the sequence of nodes given by the following recursive definition:
If L is not a hole, then pullback(L) is the singleton sequence consisting of the indirect update for L.
If L is a hole that is not conflicted, then pullback(L) is the direct updates for L.
If L is a conflicted hole, then pullback(L) is the result of merging its direct updates and its
indirect update according to the com.codecommit.antixml.ZipperMergeStrategy provided to unselect.
Let T be the sequence of top-level locations in G. Then Z.unselect is defined as T.flatMap(pullback).
Describes the parameters of a merge operation. See the com.codecommit.antixml.Zipper trait for formal definitions of these parameters.
Note that a merge operation always occurs at a particular conflicted hole (location) within the parent XML tree. All of the ZipperMergeContext attributes are considered to be "located" at that hole.
the original Node that was present at the conflicted hole.
the direct updates for the hole and their corresponding update times. These are the nodes
that explicitly replaced original via modifications made to the Zipper.
the largest update time of any direct update to the hole. If directUpdates is empty, this
will indicate the time that the node was removed.
the indirect update for the hole and its associated update time. This node's descendants contain
the results of all the updates made to the descendant holes causing the conflict. Its top-level
attributes are the same as those of original.
Defines the merge function used to resolve the behavior of Zipper.unselect at conflicted holes.
See com.codecommit.antixml.Zipper for more details.
The companion object contains some predefined strategies, including the default implicit strategy,
PreferLatest.
Wildcard selector which passes all nodes unmodified. This is analogous
to the "_" selector syntax in scala.xml. For example: ns \ * \ "name"
Factory companion for the com.codecommit.antixml.Attributes specialized
Map. The only method of serious interest in this object is the apply method,
which works exactly the same as the apply method on any Map companion
object.
Different implicit implementations of com.codecommit.antixml.CanBuildFromWithZipper.
Factory singleton for Group. This object is primarily used for creating
new Group(s) from specified nodes.
The default XML parser instance for the Anti-XML framework. This is really just a convenience instance of com.codecommit.antixml.XMLParser. The default parser (currently) uses the Java StAX framework under the surface, though the parser interface is also 100% compatible with the SAX2 framework (see: com.codecommit.antixml.SAXParser). The StAX implementation is the default primarily for performance reasons.
It is possible to reuse some of Anti-XML's internal parser infrastructure to parse into Anti-XML trees from alternative parse frameworks, such as HTML parsers (think: TagSoup). This infrastructure is exposed via the com.codecommit.antixml.NodeSeqSAXHandler class. Unlike scala.xml, Anti-XML does not allow extension of its com.codecommit.antixml.Node construction process. Thus, it is not possible to define (or directly parse into) custom com.codecommit.antixml.Node instances. This capability wouldn't make much sense though, since com.codecommit.antixml.Node is sealed. It is not possible to even define custom instances, much less produce them as part of the parse process.
Contains the built-in explicit conversions into Anti-XML. Currently, these
conversions only cover types in scala.xml. This may be expanded in the future.
All of the members in this object are implicit, and thus it is rare for a user to need to access them directly. The membership is contrived in such a way that the implicit resolution will use the following precedence order:
ElemConvertable
TextConvertable
EntityRefConvertable
NodeConvertable
NodeSeqConvertable
This corresponds with the roughly-intuitive conversion precedence. Thus, if
we have a value of type scala.xml.Elem and we invoke the convert method on
that value, the result will be of type com.codecommit.antixml.Elem. However,
if we take that same value and ascribe it the type of scala.xml.Node,
the convert method will return a value of type com.codecommit.antixml.Node.
Finally, we can take this same value and ascribe it the even less-specific type
of scala.xml.NodeSeq (or even scala.Seq[scala.xml.Node], for
that matter). Invoking the convert method on this maximally-widened type will
produce a value of type com.codecommit.antixml.Group[com.codecommit.antixml.Node].
Thus, the most specific conversion is chosen in all cases.
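One common way to obtain this kind of precedence in Scala is to place the less specific implicit instances in a parent trait of the object that holds the more specific ones. The sketch below illustrates that general technique with hypothetical types (SrcNode, SrcElem, Node, Elem, Other, Convertable); it is an assumption about how such an ordering can be arranged and is not taken from the Anti-XML source.

```scala
object PrecedenceSketch {
  // Hypothetical source hierarchy (stand-in for scala.xml.Node and scala.xml.Elem).
  class SrcNode(val label: String)
  class SrcElem(label: String) extends SrcNode(label)

  // Hypothetical target hierarchy (stand-in for the Anti-XML node types).
  sealed trait Node
  final case class Elem(name: String) extends Node
  final case class Other(name: String) extends Node

  trait Convertable[-A, +B] { def apply(a: A): B }

  // Lower precedence: defined in a parent trait of the companion object.
  trait LowPriorityConvertables {
    implicit val nodeConvertable: Convertable[SrcNode, Node] =
      new Convertable[SrcNode, Node] {
        def apply(a: SrcNode): Node = a match {
          case e: SrcElem => Elem(e.label)
          case n          => Other(n.label)
        }
      }
  }

  // Higher precedence: defined directly in the companion object.
  object Convertable extends LowPriorityConvertables {
    implicit val elemConvertable: Convertable[SrcElem, Elem] =
      new Convertable[SrcElem, Elem] { def apply(a: SrcElem): Elem = Elem(a.label) }
  }

  def convert[A, B](a: A)(implicit conversion: Convertable[A, B]): B = conversion(a)

  val src = new SrcElem("span")
  val specific = convert(src)           // static type Elem
  val specificName = specific.name      // compiles only because the static type is Elem
  val widened = convert(src: SrcNode)   // static type Node
}
```

Both instances are eligible for a SrcElem argument because Convertable is contravariant in its input; the one defined in the derived object wins, while ascribing the wider SrcNode type leaves only the general instance applicable.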
Pimps the convert method onto any object for which there exists a conversion
into Anti-XML. Note that this conversion is an implicit value, statically
enforced and thus shouldn't be the source of any collision issues. It should
actually be possible to have another implicit conversion in scope which
pimps the convert method without seeing conflicts.
Non-node selector which finds exclusively com.codecommit.antixml.Text
nodes and pulls out their String content. Unlike most selectors, the
result of using this selector is not a com.codecommit.antixml.Group, but
a generic scala.collection.Traversable[String]. This selector can
be used to emulate the NodeSeq#text method provided by scala.xml. For
example: ns \\ text mkString (this is analogous, but not quite equivalent,
to calling ns.text in scala.xml).
Base package for the Anti-XML framework. Note that importing this package brings in a number of implicit conversions. Specifically:
A => Converter[A] (for any type A) – Implements explicit conversions from scala.xml types to Anti-XML correspondents (where applicable). This technically makes the convert method available on all types. However, that method will only be callable on very specific types in the scala.xml library, and thus it shouldn't cause any collision issues.
(String, String) => (QName, String) – Required to get nice syntax for unqualified attribute names. Note there is an additional conversion of type String => QName, but that conversion is defined on the companion object for com.codecommit.antixml.QName, which prevents it from cluttering the dispatch implicit space (i.e. it only applies as a type coercion, not a pimp).