Chapter 20 of the XML Bible, Second Edition: XPointers

In This Chapter

XPointer, the XML Pointer Language, defines an addressing scheme for individual parts of an XML document. These addresses can be used by any application that needs to identify parts of or locations in an XML document. For instance, an XML editor could use an XPointer to identify the current position of the insertion point or the range of the selection. An XInclude processor can use an XPointer to determine what part of a document to include. And the URI in an XLink can include an XPointer fragment identifier that locates one particular element in the targeted document. XPointers use the same XPath syntax that you're familiar with from XSL transformations to identify the parts of the document they point to, along with a few additional pieces.


This chapter is based on the January 8, 2001, XPointer Last Call Working Draft, the November 16, 1999, XPath 1.0 specification, and the December 20, 2000, XLink Proposed Recommendation. The broad picture presented here is likely to be correct, but the details are subject to change. You can find the latest XPointer specification on the W3C Web site. Furthermore, no mainstream browsers have any support for XPointers. You can use URLs with XPointer fragment identifiers in Web pages, but browsers will mostly ignore them.

Why Use XPointers?

Traditional URLs are simple and easy to use, but they're also quite limited. For one thing, a URL only points at a single, complete document. More granularity than that, such as linking to the third sentence of the seventeenth paragraph in a document, requires the author of the targeted document to manually insert named anchors at the targeted location. The author of the document doing the linking can't do this unless he or she also has write access to the document being linked to. Even if the author doing the linking can insert named anchors into the targeted document, it's almost always inconvenient.

It would be more useful to be able to link to a particular element or group of elements on a page without having to change the document you're linking to. For example, given a large document such as the complete baseball statistics of Chapters 4 and 5, you might want to link to only one team or one player. There are several parts to this problem. The first part is addressing the individual elements. This is the part that XPointers solve. XPointers enable you to target a given element by number, name, type, or relation to other elements in the document.

The second part of the problem is the protocol by which a browser asks a Web server to send only part of a document rather than the whole thing. This is an area of active research. More work is needed. XPointers do little to solve this problem, except for providing a foundation on which such systems can build. For instance, the best efforts to date are the so-called "byte range extensions to HTTP" available in HTTP 1.1. So far these have not achieved widespread adoption, mostly because Web authors aren't comfortable specifying a byte range in a document. Furthermore, byte ranges are extremely fragile. Trivial edits to a document, even simple reformatting, can destroy byte range links. HTTP 1.1 does allow other range units besides raw bytes (for example, XML elements), but does not require Web servers or browsers to support such units. Much work remains to be done.

The third part of the problem is making sure that the retrieved document makes sense without the rest of the document to go along with it. In the context of XML, this effectively means the linked part is well formed, or perhaps valid. This is a tricky proposition, because most XML documents, especially ones with nontrivial prologs, don't decompose well. Again, XPointers don't address this. The World Wide Web Consortium (W3C) XML Fragment Working Group is addressing this issue, but work here is far from finished.

For the moment, therefore, an XPointer can be used as an index into a complete document, the whole of which is loaded and then positioned at the location identified by the XPointer, and even this much is more than most browsers can handle. In the long term, extensions to XML, XLink, HTTP, and other protocols may allow more sophisticated uses of XPointers. For instance, XInclude will let you quote a remote document by using an XPointer to tell browsers which part of the original document to copy, rather than retyping the text of the quote. You could include cross-references inside a document that automatically update themselves as the document is revised. These uses, however, will have to wait for the development of several next-generation technologies. For now, you must be content with precisely identifying the part of a document you want to jump to when following an XLink.

XPointer Examples

HTML links generally point to one particular document. Additional granularity — that is, pointing to a particular section, chapter, or paragraph of a particular document — isn't well supported. Provided you control both the linking and the linked document, you can insert a named anchor into an HTML file at the position to which you want to link. For example:

<H2><A NAME="xtocid20.2">XPointer Examples</A></H2>

You can then link to this position in the file by adding a # and the name of the anchor to the URL. The piece of the URL after the # is called the fragment identifier. For example, in this link the fragment identifier is xtocid20.2.

<A HREF="">
  XPointer Examples
</A>

However, this solution is kludgy. It's not always possible to modify the target document so that the source document can link to it. The target document may be on a different server controlled by someone other than the author of the source document. And the author of the target document may change or move it without notifying the author of the source.

Furthermore, named anchors violate the principle of separating markup from content. Placing a named anchor in a document says nothing about the document or its content. It's just a marker for other documents to refer to. It adds nothing to the document's own content.

XPointers allow much more sophisticated connections between parts of documents. An XPointer can refer to any element of a document; to the first, second, or seventeenth element; to the seventh element named P; to the first element that's a child of the second DIV element; and so on. XPointers provide very precisely targeted addresses of particular parts of documents. They do not require the targeted document to contain additional markup just so its individual pieces can be linked to.

Furthermore, unlike HTML anchors, XPointers don't point to just a single point in a document. They can point to entire elements, to possibly discontiguous sets of elements, or to the range of text between two points. Thus, you can use an XPointer to select a particular part of a document, perhaps so it can be copied or loaded into a program.

Here are a few examples of XPointers:

xpointer(id("ebnf"))
xpointer(descendant::language[position()=2])
ebnf
xpointer(/child::spec/child::body/child::*/child::language[2])
xpointer(/spec/body/*/language[2])
/1/14/2
xpointer(id("ebnf"))xpointer(id("EBNF"))


Each of these selects a particular element in a document. The first finds the element with the ID ebnf. The second finds the second language element in the document. The third is a shorthand form of finding the element with the ID ebnf. The fourth and fifth both specify the second language child element of any child element of the body child elements of the spec child of the root node. The sixth finds the second child element of the fourteenth child element of the root element. The final URI also points to the element with the ID ebnf. However, if no such element is present, it then finds the element with the ID EBNF.

The document is not specified by the XPointer; rather, the URI that precedes the XPointer specifies the document. This URI may be contained in an XLink linking element, in an XInclude include element, or in something else. The XLinks and URIs you saw in the previous chapter did not contain XPointers, but it isn't hard to add XPointers to them. Most of the time you simply append the XPointer to the URI separated by a #, just as you do with named anchors in HTML. For example, the above list of XPointers could be suffixed to the URL of a document, giving URIs that end as follows:

#xpointer(id("ebnf"))
#xpointer(descendant::language[position()=2])
#ebnf
#xpointer(/child::spec/child::body/child::*/child::language[2])
#xpointer(/spec/body/*/language[2])
#/1/14/2
#xpointer(id("ebnf"))xpointer(id("EBNF"))

In fact, these URIs are just six different ways of pointing to the same element of the same document. Normally such URIs are values of the xlink:href attribute of a linking element. For example:

<SPECIFICATION xmlns:xlink=""
 xlink:actuate="onRequest" xlink:show="replace">
  Extensible Markup Language (XML) 1.0
</SPECIFICATION>

XPointers don't have any special exemptions from the rules of URIs. In particular, if the XPointer contains characters that are not allowed in URLs (for example, Ω or ^), then these characters must be encoded in UTF-8, and the bytes of the UTF-8 encoding must be hex-escaped using a percent sign. For example, the capital Greek letter Omega is Unicode character 3A9 in hexadecimal. When encoded in UTF-8, this character is the two bytes 206 and 169. In hexadecimal, that's CE and A9. Therefore, the XPointer xpointer(id("Ω")) would be encoded in a URL as xpointer(id("%CE%A9")). The caret is Unicode character 5E in hexadecimal. The equals sign is Unicode character 3D in hexadecimal. The colon is Unicode character 3A in hexadecimal. Because these three characters are part of the ASCII character set, their UTF-8 encodings are simply their values. Therefore xpointer(descendant::*[.='^']) would be encoded in a URL as xpointer(descendant%3A%3A*[.%3D'%5E']). Modern Web browsers allow the square brackets [ and ] in URLs. However, some older browsers do not, so for maximum compatibility you should escape these characters as %5B and %5D respectively. Thus the above XPointer would become xpointer(descendant%3A%3A*%5B.%3D'%5E'%5D).
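The escaping rules above can be tried out with a short Python sketch. It relies on urllib.parse.quote from the standard library, which encodes text as UTF-8 and percent-escapes the resulting bytes. The helper name escape_xpointer and the set of characters left unescaped are assumptions made for this illustration, not part of any XPointer specification.

```python
from urllib.parse import quote

# Characters left literal in this sketch (an assumption for illustration).
# Everything else outside letters, digits, "-", ".", "_" and "~" is
# percent-escaped as UTF-8 bytes.
SAFE = "()\"'*[]"


def escape_xpointer(xpointer: str, escape_brackets: bool = False) -> str:
    """Percent-escape an XPointer for use as a URI fragment identifier."""
    safe = SAFE
    if escape_brackets:
        # For older browsers, also escape [ and ] as %5B and %5D.
        safe = safe.replace("[", "").replace("]", "")
    return quote(xpointer, safe=safe)


omega = escape_xpointer('xpointer(id("Ω"))')
caret = escape_xpointer("xpointer(descendant::*[.='^'])")
caret_strict = escape_xpointer("xpointer(descendant::*[.='^'])",
                               escape_brackets=True)

print(omega)         # xpointer(id("%CE%A9"))
print(caret)         # xpointer(descendant%3A%3A*[.%3D'%5E'])
print(caret_strict)  # xpointer(descendant%3A%3A*%5B.%3D'%5E'%5D)
```

The three printed strings match the escaped forms given in the paragraph above.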

A Concrete Example

To demonstrate the different types of XPointers, it's useful to have a concrete example in mind. Listing 20-1 is a simple, valid document that should be self-explanatory. It contains information about two related families and their members. The root element is FAMILYTREE. A FAMILYTREE can contain PERSON and FAMILY elements. Each PERSON and FAMILY element has a required ID attribute. Persons contain a name, birth date, death date and spouse. Families contain a husband, a wife, and zero or more children. The individual persons are referred to from the family by reference to their IDs.


This XML application is revisited in Chapter 28.

Listing 20-1: A family tree

<?xml version="1.0"?>
<!DOCTYPE FAMILYTREE [
  <!-- PERSON elements -->
  <!ATTLIST PERSON
    ID      ID     #REQUIRED
  >
]>
<FAMILYTREE>
  <PERSON ID="p1">
    <NAME>Domeniquette Celeste Baudean</NAME>
    <BORN>21 Apr 1836</BORN>
    <SPOUSE IDREF="p2"/>
  </PERSON>
  <PERSON ID="p2">
    <NAME>Jean Francois Bellau</NAME>
    <SPOUSE IDREF="p1"/>
  </PERSON>
  <PERSON ID="p3" FATHER="p2" MOTHER="p1">
    <NAME>Elodie Bellau</NAME>
    <BORN>11 Feb 1858</BORN>
    <DIED>12 Apr 1898</DIED>
    <SPOUSE IDREF="p4"/>
  </PERSON>
  <PERSON ID="p4">
    <NAME>John P. Muller</NAME>
    <SPOUSE IDREF="p3"/>
  </PERSON>
  <PERSON ID="p7">
    <NAME>Adolf Eno</NAME>
    <SPOUSE IDREF="p6"/>
  </PERSON>
  <PERSON ID="p6" FATHER="p2" MOTHER="p1">
    <NAME>Maria Bellau</NAME>
    <SPOUSE IDREF="p7"/>
  </PERSON>
  <PERSON ID="p5" FATHER="p2" MOTHER="p1">
    <NAME>Eugene Bellau</NAME>
  </PERSON>
  <PERSON ID="p8" FATHER="p2" MOTHER="p1">
    <NAME>Louise Pauline Bellau</NAME>
    <BORN>29 Oct 1868</BORN>
    <DIED>3 May 1938</DIED>
    <SPOUSE IDREF="p9"/>
  </PERSON>
  <PERSON ID="p9">
    <NAME>Charles Walter Harold</NAME>
    <BORN>about 1861</BORN>
    <DIED>about 1938</DIED>
    <SPOUSE IDREF="p8"/>
  </PERSON>
  <PERSON ID="p10" FATHER="p2" MOTHER="p1">
    <NAME>Victor Joseph Bellau</NAME>
    <SPOUSE IDREF="p11"/>
  </PERSON>
  <PERSON ID="p11">
    <NAME>Ellen Gilmore</NAME>
    <SPOUSE IDREF="p10"/>
  </PERSON>
  <PERSON ID="p12" FATHER="p2" MOTHER="p1">
    <NAME>Honore Bellau</NAME>
  </PERSON>
  <FAMILY ID="f1">
    <HUSBAND IDREF="p2"/>
    <WIFE IDREF="p1"/>
    <CHILD IDREF="p3"/>
    <CHILD IDREF="p5"/>
    <CHILD IDREF="p6"/>
    <CHILD IDREF="p8"/>
    <CHILD IDREF="p10"/>
    <CHILD IDREF="p12"/>
  </FAMILY>
  <FAMILY ID="f2">
    <HUSBAND IDREF="p7"/>
    <WIFE IDREF="p6"/>
  </FAMILY>
</FAMILYTREE>
In the sections that follow, this document is assumed to be available at some URL. The URL isn't real, but the emphasis here is on selecting individual parts of a document rather than a document as a whole.

Location Paths, Steps, and Sets

Many (though not all) XPointers are location paths. These are the same location paths used by XSLT and discussed in Chapter 17. Consequently, much of the syntax should already be familiar to you.

Location paths are built from location steps. Each location step specifies a point in the targeted document, always relative to some other well-known point such as the start of the document or the previous location step. This well-known point is called the context node. In general, a location step has three parts: the axis, the node test, and an optional predicate. These are combined in this form:

axis::node-test[predicate]


For example, in the location step child::PERSON[position()=2], the axis is child, the node test is PERSON, and the predicate is [position()=2]. This location step selects the second PERSON element along the child axis, starting from the context node or, less formally, the second PERSON child element of the context node. Of course, which element this actually is depends on what the context node is. Consequently, this is what's referred to as a relative location step. There are also absolute location steps that do not depend on the context node.

The axis tells you in what direction to search from the context node. For instance, an axis can say to look at things that follow the context node, things that precede the context node, things that are children of the context node, things that are attributes of the context node, and so forth.

The node test tells you which nodes to consider along the axis. The most common node test is simply an element name. However, the node test may also be the asterisk (*) wild card to indicate that any element is to be matched, or one of several functions for selecting comments, text, processing instructions, points, ranges, or nodes of any kind. The group of nodes along the given axis that satisfy the node test form a location set.

The predicate is a boolean expression (exactly like the expressions you learned about in XSLT) that tests each node in that set. If that expression returns false, then the node is removed from the set.

Often, after the entire location step — axis, node test, and predicate — has been evaluated, what's left is a single, unique node. A location set like this with only one node is called a singleton. However, not all location steps produce singletons. In some cases, you may finish with multiple nodes in the final location set. On occasion, there may be no nodes in the location set; in other words, the location set is the empty set.

A single location step is often not enough to identify the node you want. Commonly, location steps are strung together, separated by slashes, to form a location path. Each location step's location set becomes the context node set for the next step in the path. For example, consider this XPointer:

xpointer(/child::FAMILYTREE/child::PERSON[position()=3])


The location path of this XPointer is /child::FAMILYTREE/child::PERSON[position()=3]. It is built from two location steps:

/child::FAMILYTREE
child::PERSON[position()=3]

The first location step is an absolute step that selects all child elements of the root node whose name is FAMILYTREE. When applied to Listing 20-1, there's exactly one such element. The second location step is then applied relative to the FAMILYTREE element returned by the first location step. All of its child nodes are considered. Those that satisfy the node test — that is, elements whose name is PERSON — are returned. There are 12 of these nodes. Each of these 12 nodes is then compared against the predicate to see if its position is equal to 3. This turns out to be true for only one node, Elodie Bellau's PERSON element, so that is the single node this XPointer points to.

It is not always the case, however, that an XPointer points to exactly one node. For instance, consider this XPointer:

xpointer(/child::FAMILYTREE/child::PERSON[position()>3])


This is exactly the same as before except that the equals sign has been changed to a greater than sign. Now when each of the 12 PERSON elements is compared, the predicate returns true for 9 of them. Each of these nine is included in the location set that this XPointer returns. This XPointer points to nine nodes, not to one.
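The two-step evaluation just described can be imitated in a few lines of Python. This is only a sketch: it rebuilds the skeleton of Listing 20-1 (12 PERSON children and 2 FAMILY children, names only) with the standard xml.etree.ElementTree module, and it applies the node test and the predicate by hand rather than through an XPointer processor.

```python
import xml.etree.ElementTree as ET

# Names of the 12 persons in Listing 20-1, in document order.
NAMES = [
    "Domeniquette Celeste Baudean", "Jean Francois Bellau", "Elodie Bellau",
    "John P. Muller", "Adolf Eno", "Maria Bellau", "Eugene Bellau",
    "Louise Pauline Bellau", "Charles Walter Harold", "Victor Joseph Bellau",
    "Ellen Gilmore", "Honore Bellau",
]

# Rebuild only the skeleton of the listing.
root = ET.Element("FAMILYTREE")
for name in NAMES:
    person = ET.SubElement(root, "PERSON")
    ET.SubElement(person, "NAME").text = name
for family_id in ("f1", "f2"):
    ET.SubElement(root, "FAMILY", ID=family_id)

# Step 1, /child::FAMILYTREE, yields the root element itself.
# Step 2, child::PERSON, keeps the children that pass the node test.
persons = [child for child in root if child.tag == "PERSON"]

# The predicate is then tested against each node's 1-based position.
equal_3 = [p for pos, p in enumerate(persons, start=1) if pos == 3]
greater_3 = [p for pos, p in enumerate(persons, start=1) if pos > 3]

print(len(persons))                 # 12
print(equal_3[0].findtext("NAME"))  # Elodie Bellau
print(len(greater_3))               # 9
```

ElementTree's own limited XPath support accepts the abbreviated positional form, so root.findall("PERSON[3]") returns the same single element as equal_3.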

The Root Node

Although Listing 20-1 includes ID attributes for most elements, and although they are convenient, they are not required for linking into the document. You can select any element in the document simply by working your way down from the root node. An initial / indicates the root node.

The root node of the document is not the same as the root element. Rather, it is an abstract node that contains the entire document, including any comments or processing instructions that come before or after the root element, such as xml-stylesheet, and the root element itself. (The XML declaration and the document type declaration do not appear as nodes in the XPath data model.) For example, to select the root node of the XML 1.0 specification, you can use a URI whose fragment identifier is xpointer(/).

For another example, Domeniquette Celeste Baudean is the first person in Listing 20-1. Therefore to point at her name, you can get the first element child of the root node (that is, the root element of the document, FAMILYTREE), then count one PERSON down from the root element, and then count one NAME down from that like this:

xpointer(/child::*/child::PERSON[position()=1]/child::NAME)


This location path says to find the root node, then find all element children of the root node (which in a well-formed XML document will be exactly the root element), then find the first PERSON element that's an immediate child of that element, and then find its NAME child elements.
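The difference between the root node and the root element can be seen in a DOM tree, where the Document object plays roughly the part of the XPath root node. The following sketch uses Python's xml.dom.minidom on a cut-down stand-in for Listing 20-1. The stylesheet name family.css and the comment are made up for the illustration.

```python
from xml.dom import minidom

# A cut-down stand-in for Listing 20-1; the stylesheet name is hypothetical.
SOURCE = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="family.css"?>'
    '<!-- family tree -->'
    '<FAMILYTREE>'
    '<PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME></PERSON>'
    '<PERSON ID="p2"><NAME>Jean Francois Bellau</NAME></PERSON>'
    '</FAMILYTREE>'
)

doc = minidom.parseString(SOURCE)     # doc stands in for the root node
root_children = [node.nodeName for node in doc.childNodes]
root_element = doc.documentElement    # the only element child: /child::*

# Walk /child::*/child::PERSON[position()=1]/child::NAME by hand.
persons = [n for n in root_element.childNodes if n.nodeName == "PERSON"]
names = [n for n in persons[0].childNodes if n.nodeName == "NAME"]

print(root_children)             # ['xml-stylesheet', '#comment', 'FAMILYTREE']
print(names[0].firstChild.data)  # Domeniquette Celeste Baudean
```

The processing instruction and the comment are children of the document node, not of FAMILYTREE, and the XML declaration does not show up as a node at all.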


Axes

XPath defines 13 axes along which an XPointer may search for nodes, all from the same XPath syntax used for XSLT. These depend on context to determine exactly what they point to. For instance, consider this location path:


It begins with the id() function that returns a node set containing the element with the ID type attribute whose value is p6. This provides a context node for the following location step along the relative child axis. Other axes include ancestor, descendant, self, ancestor-or-self, descendant-or-self, attribute, and more. Each serves to select a particular subset of the elements in the document. For instance, the following axis selects from nodes that come after the context node. The preceding axis selects from nodes that come before the context node. Table 20-1 summarizes the 13 axes.

Table 20-1: Location Step Axes

Axis | Selects From:
child | All nodes contained in the context node, but not contained in any other nodes the context node contains
parent | The unique node that contains the context node but that does not contain any other nodes that also contain the context node
self | The context node
ancestor | The parent of the context node, the parent of the parent of the context node, the parent of the parent of the parent of the context node, and so forth, back to the root node
ancestor-or-self | The ancestors of the context node and the context node itself
attribute | The attributes of the context node
descendant | The children of the context node, the children of the children of the context node, and so forth
descendant-or-self | The context node itself and its descendants
following | All nodes that start after the end of the context node, excluding attribute and namespace nodes
following-sibling | All nodes that start after the end of the context node and have the same parent as the context node
namespace | All namespaces defined for the context node
preceding | All nodes that finish before the beginning of the context node, excluding attribute and namespace nodes
preceding-sibling | All nodes that start before the beginning of the context node and have the same parent as the context node

The child axis

The child axis selects from the children of the context node. For example, consider this XPointer:

xpointer(/child::FAMILYTREE/child::PERSON[position()=3]/child::NAME)


Reading from right to left, it selects the NAME child elements of the third PERSON element that's a child of the FAMILYTREE element that's a child of the root of the document. In this example, there's only one such element; but if there is more than one, then all are returned. For instance, consider this XPointer:

xpointer(/child::FAMILYTREE/child::PERSON/child::NAME)

This selects all NAME children of PERSON elements that are children of FAMILYTREE elements that are children of the root. There are a dozen of these in Listing 20-1.

It's important to note that the child axis only selects from the immediate children of the context node. For example, consider this XPointer:

xpointer(/child::NAME)

This points nowhere because there are no NAME elements in the document that are direct, immediate children of the root node. There are a dozen NAME elements that are descendants of the root node. If you'd like to refer to these, you should use the descendant axis instead of child.

As in XSLT, the child axis is implied if no explicit axis name is present. For instance, the above three XPointers would more likely be written in this abbreviated form:

xpointer(/FAMILYTREE/PERSON[position()=3]/NAME)
xpointer(/FAMILYTREE/PERSON/NAME)
xpointer(/NAME)


The descendant axis

The descendant axis searches through all the descendants of the context node, not just the immediate children. For example, /descendant::BORN selects all the BORN elements in the document. /descendant::BORN[position()=3] selects the third BORN element encountered in a depth-first search of the document tree. (Depth first is the order you get if you simply read through the XML document from top to bottom.) In Listing 20-1, that selects Louise Pauline Bellau's birthday, <BORN>29 Oct 1868</BORN>.

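A double slash is an abbreviation for /descendant-or-self::node()/, so it can often stand in for the descendant axis. For example, //NAME selects all NAME elements in the document, and //PERSON/NAME selects all NAME children of PERSON elements. Be careful with positional predicates, though. //BORN[position()=3] is short for /descendant-or-self::node()/child::BORN[position()=3], which selects every BORN element that is the third BORN child of its parent. In Listing 20-1 no PERSON element has three BORN children, so this selects nothing, whereas /descendant::BORN[position()=3] selects the third BORN element in the document.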
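The following sketch contrasts the child and descendant axes using Python's xml.etree.ElementTree, which implements only a small subset of XPath's abbreviated syntax. The context node here is the FAMILYTREE element rather than the root node, and only the first eight PERSON elements of Listing 20-1 are included to keep the example short.

```python
import xml.etree.ElementTree as ET

# The first eight PERSON elements of Listing 20-1, one per line.
SOURCE = """<FAMILYTREE>
<PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME><BORN>21 Apr 1836</BORN><SPOUSE IDREF="p2"/></PERSON>
<PERSON ID="p2"><NAME>Jean Francois Bellau</NAME><SPOUSE IDREF="p1"/></PERSON>
<PERSON ID="p3" FATHER="p2" MOTHER="p1"><NAME>Elodie Bellau</NAME><BORN>11 Feb 1858</BORN><DIED>12 Apr 1898</DIED><SPOUSE IDREF="p4"/></PERSON>
<PERSON ID="p4"><NAME>John P. Muller</NAME><SPOUSE IDREF="p3"/></PERSON>
<PERSON ID="p7"><NAME>Adolf Eno</NAME><SPOUSE IDREF="p6"/></PERSON>
<PERSON ID="p6" FATHER="p2" MOTHER="p1"><NAME>Maria Bellau</NAME><SPOUSE IDREF="p7"/></PERSON>
<PERSON ID="p5" FATHER="p2" MOTHER="p1"><NAME>Eugene Bellau</NAME></PERSON>
<PERSON ID="p8" FATHER="p2" MOTHER="p1"><NAME>Louise Pauline Bellau</NAME><BORN>29 Oct 1868</BORN><DIED>3 May 1938</DIED><SPOUSE IDREF="p9"/></PERSON>
</FAMILYTREE>"""
root = ET.fromstring(SOURCE)

# child axis: only immediate children of FAMILYTREE are considered.
child_names = root.findall("NAME")
# descendant axis: every NAME below FAMILYTREE, at any depth.
descendant_names = root.findall(".//NAME")

# /descendant::BORN[position()=3]: third BORN in document order.
born = list(root.iter("BORN"))
third_born = born[2]

# In ElementTree, as in XPath, a position after // counts among siblings,
# so this asks for BORN elements that are the third BORN child of a parent.
third_born_child = root.findall(".//BORN[3]")

print(len(child_names), len(descendant_names))  # 0 8
print(third_born.text)                          # 29 Oct 1868
print(third_born_child)                         # []
```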

The descendant-or-self axis

The descendant-or-self axis searches through all the descendants of the context node and the context node itself. For example, id("p11")/descendant-or-self::PERSON refers to all PERSON descendants of the element with ID p11 as well as that element itself, because it is of type PERSON. The only abbreviation that involves this axis is the double slash, which is short for /descendant-or-self::node()/.

The parent axis

The parent axis refers to the node that's the immediate parent of the context node. For example, /descendant::HUSBAND[position()=1]/parent::* refers to the parent element of the first HUSBAND element in the document. In Listing 20-1, this is the FAMILY element with ID f1.

The step parent::node() can be abbreviated as two periods (..). For example, /descendant::HUSBAND[position()=1]/.. also selects the FAMILY element with ID f1.

The self axis

The self axis selects the context node. It's sometimes useful when making relative links. For example, /self::node() selects the root node of the document (which is not the same as the root element of the document; that would be selected by /child::* or, in this example, /child::FAMILYTREE). The step self::node() can be abbreviated by a single period. However, this axis is rarely used in XPointers. It's more useful for XSLT select expressions.

The ancestor axis

The ancestor axis selects all nodes that contain the context node, starting with its parent. For example, /descendant::BORN[position()=2]/ancestor::*[position()=1] selects the element that contains the second BORN element. Applied to Listing 20-1, it selects Elodie Bellau's PERSON element. There's no abbreviation for the ancestor axis.

The ancestor-or-self axis

The ancestor-or-self axis selects the context node and all nodes that contain it. For example, id("p1")/ancestor-or-self::* identifies a node set that includes Domeniquette Celeste Baudean's PERSON element, which has ID p1, and its parent, the FAMILYTREE element. The root node is not included because the * node test matches only elements; to include it as well, use id("p1")/ancestor-or-self::node(). There's also no abbreviation for the ancestor-or-self axis.
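ElementTree elements do not keep a pointer to their parent, so the parent, ancestor, and ancestor-or-self axes have to be emulated. The sketch below builds a child-to-parent map over the first three PERSON elements of Listing 20-1. The helper name ancestors is an assumption for this illustration, and the lookup by ID attribute merely stands in for the id() function.

```python
import xml.etree.ElementTree as ET

# The first three PERSON elements of Listing 20-1.
SOURCE = """<FAMILYTREE>
<PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME><BORN>21 Apr 1836</BORN><SPOUSE IDREF="p2"/></PERSON>
<PERSON ID="p2"><NAME>Jean Francois Bellau</NAME><SPOUSE IDREF="p1"/></PERSON>
<PERSON ID="p3" FATHER="p2" MOTHER="p1"><NAME>Elodie Bellau</NAME><BORN>11 Feb 1858</BORN><DIED>12 Apr 1898</DIED><SPOUSE IDREF="p4"/></PERSON>
</FAMILYTREE>"""
root = ET.fromstring(SOURCE)

# Map every element to its parent (the parent axis).
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(element):
    """Return the element ancestors of element, nearest first."""
    found = []
    while element in parent_of:
        element = parent_of[element]
        found.append(element)
    return found


# /descendant::BORN[position()=2]/ancestor::*[position()=1]
second_born = list(root.iter("BORN"))[1]
nearest = ancestors(second_born)[0]

# id("p1")/ancestor-or-self::*  (the attribute lookup stands in for id())
p1 = root.find("PERSON[@ID='p1']")
or_self = [p1] + ancestors(p1)

print(nearest.get("ID"))           # p3
print([e.tag for e in or_self])    # ['PERSON', 'FAMILYTREE']
```

ElementTree has no object for the root node, so the walk stops at FAMILYTREE.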

The preceding axis

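The preceding axis selects all nodes that finish before the context node begins. Ancestors of the context node are therefore excluded, because they have not yet finished when the context node starts. Counting proceeds backward from the start of the context node. For example, consider this XPointer:

xpointer(/descendant::BORN[position()=3]/preceding::*[position()=6])

This says go to the third BORN element from the root, Louise Pauline Bellau's birthday, <BORN>29 Oct 1868</BORN>, and then move back six elements. Moving backward, this passes Louise Pauline Bellau's NAME element, Eugene Bellau's NAME and PERSON elements, and Maria Bellau's SPOUSE and NAME elements, and lands on Maria Bellau's PERSON element. Louise Pauline Bellau's own PERSON element is not counted because it contains the context node. There's no abbreviation for the preceding axis.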

The following axis

The following axis selects all elements that occur after the context node's closing tag. The first time it encounters an element's start tag or empty element tag, it counts that element. For example, consider this XPointer:

xpointer(/descendant::BORN[position()=2]/following::*[position()=5])

This says go to Elodie Bellau's birthday, <BORN>11 Feb 1858</BORN>, and then move forward five elements. This lands on John P. Muller's SPOUSE element, <SPOUSE IDREF="p3" />, after passing through Elodie Bellau's DIED element, Elodie Bellau's SPOUSE element, John P. Muller's PERSON element and John P. Muller's NAME element, in this order. There's no abbreviation for the following axis.
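Neither the preceding nor the following axis is available in ElementTree's XPath subset, but both can be emulated by walking the elements in document order. The sketch below is one way to do so, under the assumption that only element nodes matter. The function names preceding and following are made up for the illustration, and the data is the first eight PERSON elements of Listing 20-1.

```python
import xml.etree.ElementTree as ET

# The first eight PERSON elements of Listing 20-1, one per line.
SOURCE = """<FAMILYTREE>
<PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME><BORN>21 Apr 1836</BORN><SPOUSE IDREF="p2"/></PERSON>
<PERSON ID="p2"><NAME>Jean Francois Bellau</NAME><SPOUSE IDREF="p1"/></PERSON>
<PERSON ID="p3" FATHER="p2" MOTHER="p1"><NAME>Elodie Bellau</NAME><BORN>11 Feb 1858</BORN><DIED>12 Apr 1898</DIED><SPOUSE IDREF="p4"/></PERSON>
<PERSON ID="p4"><NAME>John P. Muller</NAME><SPOUSE IDREF="p3"/></PERSON>
<PERSON ID="p7"><NAME>Adolf Eno</NAME><SPOUSE IDREF="p6"/></PERSON>
<PERSON ID="p6" FATHER="p2" MOTHER="p1"><NAME>Maria Bellau</NAME><SPOUSE IDREF="p7"/></PERSON>
<PERSON ID="p5" FATHER="p2" MOTHER="p1"><NAME>Eugene Bellau</NAME></PERSON>
<PERSON ID="p8" FATHER="p2" MOTHER="p1"><NAME>Louise Pauline Bellau</NAME><BORN>29 Oct 1868</BORN><DIED>3 May 1938</DIED><SPOUSE IDREF="p9"/></PERSON>
</FAMILYTREE>"""
root = ET.fromstring(SOURCE)


def following(root, context):
    """Elements after the context node in document order, minus its descendants."""
    order = list(root.iter())
    inside = set(context.iter())  # the context node and its descendants
    return [e for e in order[order.index(context) + 1:] if e not in inside]


def preceding(root, context):
    """Elements before the context node, nearest first, minus its ancestors."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    skip = set()
    node = context
    while node in parent_of:      # collect the ancestors of the context node
        node = parent_of[node]
        skip.add(node)
    order = list(root.iter())
    before = order[:order.index(context)]
    return [e for e in reversed(before) if e not in skip]


born = list(root.iter("BORN"))
elodie_born, louise_born = born[1], born[2]

fifth_following = following(root, elodie_born)[4]
back = preceding(root, louise_born)

print(fifth_following.tag, fifth_following.get("IDREF"))  # SPOUSE p3
print(back[4].text)                                       # Maria Bellau
print(back[5].tag, back[5].get("ID"))                     # PERSON p6
```

Moving forward five elements from Elodie Bellau's BORN element reaches John P. Muller's SPOUSE element. Moving back from Louise Pauline Bellau's BORN element, the fifth element is Maria Bellau's NAME and the sixth is her PERSON element, because Louise's own PERSON element is an ancestor and is skipped.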

The preceding-sibling axis

The preceding-sibling axis selects elements that precede the context node in the same parent element. For example, /descendant::BORN[position()=2]/preceding-sibling::*[position()=1] selects Elodie Bellau's NAME element, <NAME>Elodie Bellau</NAME>. /descendant::BORN[position()=2]/preceding-sibling::*[position()=2] doesn’t point to anything because there's only one sibling of Elodie Bellau's BORN element before it. There's no abbreviation for the preceding-sibling axis.

The following-sibling axis

The following-sibling axis selects elements that follow the context node in the same parent element. For example, /descendant::BORN[position()=2]/following-sibling::*[position()=1] selects Elodie Bellau's DIED element, <DIED>12 Apr 1898</DIED>. /descendant::BORN[position()=2]/following-sibling::*[position()=3] doesn't point to anything because there are only two sibling elements following Elodie Bellau's BORN element. There's no abbreviation for the following-sibling axis.
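The two sibling axes can be emulated in the same spirit. This sketch, again using ElementTree and the first three PERSON elements of Listing 20-1, splits the children of the context node's parent into those before and those after it. The function name siblings is an assumption for the example.

```python
import xml.etree.ElementTree as ET

# The first three PERSON elements of Listing 20-1.
SOURCE = """<FAMILYTREE>
<PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME><BORN>21 Apr 1836</BORN><SPOUSE IDREF="p2"/></PERSON>
<PERSON ID="p2"><NAME>Jean Francois Bellau</NAME><SPOUSE IDREF="p1"/></PERSON>
<PERSON ID="p3" FATHER="p2" MOTHER="p1"><NAME>Elodie Bellau</NAME><BORN>11 Feb 1858</BORN><DIED>12 Apr 1898</DIED><SPOUSE IDREF="p4"/></PERSON>
</FAMILYTREE>"""
root = ET.fromstring(SOURCE)


def siblings(root, context):
    """Return (preceding siblings nearest first, following siblings in order)."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    family = list(parent_of[context])   # all children of the same parent
    index = family.index(context)
    return list(reversed(family[:index])), family[index + 1:]


# Context node: /descendant::BORN[position()=2], Elodie Bellau's BORN element.
elodie_born = list(root.iter("BORN"))[1]
before, after = siblings(root, elodie_born)

print([e.tag for e in before])   # ['NAME']
print([e.tag for e in after])    # ['DIED', 'SPOUSE']
print(after[0].text)             # 12 Apr 1898
```

Asking for a second preceding sibling or a third following sibling would index past the end of these lists, which corresponds to an XPointer that points to nothing.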

The attribute axis

The attribute axis selects attributes of the context node. For example, the location path /descendant::SPOUSE/attribute::IDREF selects all IDREF attributes of all SPOUSE elements in the document. The attribute axis can be abbreviated by an @ sign. Thus, //SPOUSE/@IDREF also selects all IDREF attributes of all SPOUSE elements in the document. @* is a general abbreviation for an attribute with any name. Thus //SPOUSE/@* indicates all attributes of all SPOUSE elements.

For another example, to find all PERSON elements in the document whose FATHER attribute is Jean Francois Bellau (ID p2), you could write //PERSON[@FATHER="p2"].
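ElementTree's XPath subset understands attribute predicates such as [@FATHER="p2"], although it cannot return attribute nodes themselves; attribute values are read from the matched elements instead. The sketch below uses only the first six PERSON elements of Listing 20-1, so it finds fewer matches than the full listing would.

```python
import xml.etree.ElementTree as ET

# The first six PERSON elements of Listing 20-1, one per line.
SOURCE = """<FAMILYTREE>
<PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME><BORN>21 Apr 1836</BORN><SPOUSE IDREF="p2"/></PERSON>
<PERSON ID="p2"><NAME>Jean Francois Bellau</NAME><SPOUSE IDREF="p1"/></PERSON>
<PERSON ID="p3" FATHER="p2" MOTHER="p1"><NAME>Elodie Bellau</NAME><BORN>11 Feb 1858</BORN><DIED>12 Apr 1898</DIED><SPOUSE IDREF="p4"/></PERSON>
<PERSON ID="p4"><NAME>John P. Muller</NAME><SPOUSE IDREF="p3"/></PERSON>
<PERSON ID="p7"><NAME>Adolf Eno</NAME><SPOUSE IDREF="p6"/></PERSON>
<PERSON ID="p6" FATHER="p2" MOTHER="p1"><NAME>Maria Bellau</NAME><SPOUSE IDREF="p7"/></PERSON>
</FAMILYTREE>"""
root = ET.fromstring(SOURCE)

# //PERSON[@FATHER="p2"]: PERSON elements whose FATHER attribute is p2.
children_of_p2 = root.findall(".//PERSON[@FATHER='p2']")
child_ids = [person.get("ID") for person in children_of_p2]

# //SPOUSE/@IDREF: read the IDREF value from every SPOUSE element.
spouse_refs = [spouse.get("IDREF") for spouse in root.iter("SPOUSE")]

print(child_ids)     # ['p3', 'p6']
print(spouse_refs)   # ['p2', 'p1', 'p4', 'p3', 'p6', 'p7']
```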

The xmlns and xmlns:prefix attributes used to declare namespaces are not attribute nodes. To get information about namespaces, you have to use the namespace axis instead.

The namespace axis

The namespace axis contains the namespaces in scope on the context node. It only applies to element nodes. There is one namespace node for each prefix that is mapped to a URI on that element (whether the prefix is used or not, and whether the xmlns:prefix attribute that created the mapping is on the element itself or one of its ancestors). Furthermore, if the element is in a default, nonprefixed namespace, then there is also a namespace node for the default namespace.

Namespace nodes are very slippery and hard to grab hold of. Although the element is the parent of the namespace node, the namespace node is not the child of the element. A simple walk of the tree or asking for the children of the element will not find the namespaces of the element. Instead, you have to walk the namespace axis explicitly. Along this axis, the node tests node() and * match every namespace node, and a name test matches the namespace node for that prefix.

Fortunately, there's very little reason to point to a namespace node with an XPointer. This axis is more useful for XSLT and not much used in XPointer.

Node Tests

Most of the time the node test part of a location step is simply an element or attribute name like PERSON or @IDREF. However, there are nine other possibilities:

An asterisk stands for any element. For example, id("p1")/child::* selects all the child elements of the element with the ID p1 regardless of their type. This does, however, select only element nodes. It omits comment nodes, text nodes, processing instruction nodes, and attribute nodes. If you want to select absolutely any kind of node, use the node() node test instead.

A prefix followed by an asterisk selects all elements in the namespace to which the prefix is mapped. For example, if the svg prefix is mapped to the SVG namespace URI, then svg:* matches all SVG elements. Similarly, @prefix:* matches all attributes in the specified namespace. For instance, if xlink is mapped to the XLink namespace URI, then @xlink:* matches all XLink attributes in the document such as xlink:type, xlink:show, xlink:actuate, xlink:href, xlink:role, and so forth.

Determining which namespace URIs a prefix is mapped to can be tricky. If the XPointer is used in an XML document, then the normal xmlns:prefix attributes in scope where the XPointer is used determine which namespace URI a prefix maps to. However, XPointers can also be used in non-XML documents. For instance, an XPointer may be included as a URL fragment identifier in a link to an XML document from an HTML page. HTML has no means of associating prefixes with URIs. In this case, you can prefix the xpointer() part with one or more xmlns(prefix=URI) parts that establish a prefix mapping.

For example, suppose you want to point at the MathML math element in some document. You know that this element is in the MathML namespace, but you don't know what prefix is used in the document. Regardless of what prefix the target document uses, you can use the prefix mml as long as you use an xmlns(mml=URI) part to associate it with the right URI. For example,

xmlns(mml= xpointer(//mml:math[1])
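The same idea, that the prefix in the pointer is independent of the prefix in the target document, shows up in most XPath APIs. In the sketch below, ElementTree's findall() takes a prefix-to-URI dictionary that plays the role of the xmlns() part. The target document and its m prefix are made up for the illustration; the URI is the standard MathML namespace name.

```python
import xml.etree.ElementTree as ET

MATHML = "http://www.w3.org/1998/Math/MathML"

# A hypothetical target document that binds the MathML namespace to "m".
SOURCE = f"""<article xmlns:m="{MATHML}">
  <m:math><m:mi>x</m:mi></m:math>
  <m:math><m:mi>y</m:mi></m:math>
</article>"""
root = ET.fromstring(SOURCE)

# The pointer's own mapping, playing the role of an xmlns(mml=URI) part.
pointer_namespaces = {"mml": MATHML}

# Matching is by namespace URI, so "mml" here finds the "m:" elements.
matches = root.findall(".//mml:math", pointer_namespaces)
first = matches[0]

print(len(matches))    # 2
print(first.tag)       # {http://www.w3.org/1998/Math/MathML}math
print(first[0].text)   # x
```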

The text() node test specifically refers to the parsed character data content of an element. It's most commonly used with mixed content. Despite the parentheses, the text() node test does not actually take any arguments. For instance, /descendant::text() refers to all of the text but none of the markup of a document. For another example, consider this CITATION element:

<CITATION CLASS="TURING" ID="C2">
    <AUTHOR>Turing, Alan M.</AUTHOR>
    "<TITLE>On Computable Numbers,
      With an Application to the Entscheidungs-problem</TITLE>"
    <JOURNAL>
      Proceedings of the London Mathematical Society</JOURNAL>,
    <SERIES>Series 2</SERIES>,
</CITATION>

The following location path refers to the quotation mark before the TITLE element.

id("C2")/child::text()[position()=2]


The first text node in this fragment is the white space between <CITATION CLASS="TURING" ID="C2"> and <AUTHOR>. Technically, this location path refers to all text between </AUTHOR> and <TITLE>, including the white space and not just the quotation mark.


XPointers that point to text nodes are tricky. I recommend that you avoid them if possible. Of course, you may not always be able to.
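To see why text nodes are tricky, it helps to list them. The sketch below parses a shortened version of the CITATION element with Python's xml.dom.minidom and collects the text nodes that are direct children of CITATION. The shortened title is an abbreviation made for this example only.

```python
from xml.dom import minidom

# A shortened version of the CITATION element discussed above.
SOURCE = '''<CITATION CLASS="TURING" ID="C2">
    <AUTHOR>Turing, Alan M.</AUTHOR>
    "<TITLE>On Computable Numbers</TITLE>"
</CITATION>'''

doc = minidom.parseString(SOURCE)
doc.normalize()   # merge any adjacent text nodes, as the XPath model does
citation = doc.documentElement

# child::text() relative to the CITATION element.
text_children = [node.data for node in citation.childNodes
                 if node.nodeType == node.TEXT_NODE]

for position, data in enumerate(text_children, start=1):
    print(position, repr(data))
```

The first text node contains nothing but white space. The second contains the line break and indentation after </AUTHOR> as well as the opening quotation mark, so a pointer to it covers more than the quotation mark alone.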

Because character data does not contain any child nodes, child, descendant, descendant-or-self, and attribute relative location steps may not be attached to an XPath that selects a text node. The exception is the point() node test which is discussed later.

The comment() node test specifically refers to comments. For example, this XPointer points to the third comment in the document:

xpointer(/descendant::comment()[position()=3])


Because comments do not contain attributes or elements, you cannot add an additional child, descendant, or attribute relative location step after the first term that selects a comment. Despite the parentheses, the comment() node test does not actually take any arguments.

Finally, the processing-instruction() node test selects any processing instructions that occur along the chosen axis. You can use it without any arguments to select all processing instructions, or with an argument to specify the targets of the particular processing instructions you want to select. For example, /descendant::processing-instruction() selects all processing instructions in the document. However, /descendant::processing-instruction('xml-stylesheet') only finds processing instructions that begin <?xml-stylesheet . /descendant::processing-instruction("php") only finds processing instructions intended for PHP. As with comments, because processing instructions do not contain attributes or elements, you cannot add an additional child, descendant, or attribute relative location step after the first step that selects a processing instruction.
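The comment() and processing-instruction() node tests can be imitated with a document-order walk over a DOM tree. The sketch below uses xml.dom.minidom; the sample document, its comments, and its processing instructions are invented for the illustration, and the helper name walk is an assumption.

```python
from xml.dom import minidom

# A made-up document with three comments and two processing instructions.
SOURCE = (
    '<?xml-stylesheet type="text/css" href="family.css"?>'
    '<!-- first -->'
    '<FAMILYTREE><!-- second --><?php echo "hi"; ?>'
    '<PERSON ID="p1"><!-- third --></PERSON></FAMILYTREE>'
)


def walk(node):
    """Yield node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from walk(child)


doc = minidom.parseString(SOURCE)
nodes = list(walk(doc))

# /descendant::comment()
comments = [n for n in nodes if n.nodeType == n.COMMENT_NODE]
# /descendant::processing-instruction()
instructions = [n for n in nodes
                if n.nodeType == n.PROCESSING_INSTRUCTION_NODE]
# /descendant::processing-instruction('xml-stylesheet')
stylesheets = [n for n in instructions if n.target == "xml-stylesheet"]

print(comments[2].data)                   # " third " without the quotes
print([n.target for n in instructions])   # ['xml-stylesheet', 'php']
print(len(stylesheets))                   # 1
```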

The point() and range() node tests refer to new ways of dividing an XML document that only work in XPointer, not in other standards that use XPath, such as XSLT. They will be discussed below.


Each location step can contain zero or more predicates that further restrict which nodes an XPointer points to. In many cases a predicate is necessary to pick the one node you want from a node set. Predicates use the same syntax you already learned for XSLT. Each predicate contains an expression in square brackets ([]). This allows an XPointer to select nodes according to many different criteria. For example, you can select:

These are just a small sampling of the selections that predicates make possible.

The result of a predicate expression is ultimately converted to a boolean after all calculations are finished. Nonboolean results are converted as follows:

The predicate expression is evaluated for each node in the context node list. Each node for which the expression ultimately evaluates to false is removed from the list. Thus only those nodes that satisfy the predicate remain. I will not repeat the discussion of the operators and functions available for use in expressions here. However, I will show you a few examples of predicates using the expression syntax as it's likely to be used in XPointers.


Expression syntax is covered in Chapter 17.
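The filtering step can be sketched in a few lines of Python. This is an illustration of the XPath 1.0 conversion rules, not code from any XPointer implementation: a numeric result is true when it equals the context position, a node set is true when it is not empty, and a string is true when it is not empty. The names predicate_is_true and apply_predicate are made up for this sketch, and node sets are modeled as plain lists.

```python
def predicate_is_true(value, context_position):
    """Decide whether a predicate result keeps the current node."""
    if isinstance(value, bool):           # test bool first: bool is a subclass of int
        return value
    if isinstance(value, (int, float)):   # number: compare with the context position
        return value == context_position
    if isinstance(value, str):            # string: true if it has any characters
        return len(value) > 0
    return len(value) > 0                 # node set (a list here): true if non-empty

def apply_predicate(nodes, expression):
    """Keep the nodes for which expression(node, position, size) is true."""
    size = len(nodes)
    return [node for position, node in enumerate(nodes, start=1)
            if predicate_is_true(expression(node, position, size), position)]

people = ["p1", "p2", "p3", "p4"]         # stand-ins for four PERSON nodes
first = apply_predicate(people, lambda node, pos, size: 1)               # like [1]
last_two = apply_predicate(people, lambda node, pos, size: pos > size - 2)
```

The numeric case is what makes a bare number in square brackets behave like a position test.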

Probably the most frequently used function in XPointer predicates is position(). This returns the index of the node in the context node list. This enables you to find the first, second, third, or other indexed node. You can compare positions using the relational operators <, >, =, !=, >=, and <=.

For instance, in Listing 20-1 the root FAMILYTREE element has 14 immediate children: 12 PERSON elements and 2 FAMILY elements. In order, they are:


In fact, this test is so common that XPath offers a shorthand notation for it. Instead of writing [position()=X] where X is a number, you can simply enclose the number or an XPath expression that returns the number in the square brackets like this:


Greater numbers, such as /child::FAMILYTREE/child::*[15], don't point to anything; they're just dangling.

To count all elements in the document, not just the immediate children of the root, you can use the descendant axis instead of child. Table 20-2 shows the first four descendant XPointers for the document element FAMILYTREE of Listing 20-1, and what they point to. Note especially that /child::FAMILYTREE/descendant::*[position()=1] points to the entire first PERSON element, including its children, and not just the <PERSON> start tag.
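Counting by position along the child and descendant axes can be tried out with Python's standard xml.etree.ElementTree module. This is a sketch only: the document below is a much shortened, hypothetical stand-in for Listing 20-1, and nth_child and nth_descendant are illustrative helpers rather than real XPointer functions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, much shortened stand-in for Listing 20-1.
SAMPLE = """<FAMILYTREE>
  <PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME></PERSON>
  <PERSON ID="p2"><NAME>Second Person</NAME></PERSON>
  <FAMILY ID="f1"><HUSBAND PERSON="p2"/><WIFE PERSON="p1"/></FAMILY>
</FAMILYTREE>"""

def nth_child(element, n):
    """Emulate child::*[position()=n]; None models a dangling pointer."""
    children = list(element)
    return children[n - 1] if 1 <= n <= len(children) else None

def nth_descendant(element, n):
    """Emulate descendant::*[position()=n], counting in document order."""
    found = list(element.iter())[1:]      # iter() yields the element itself first
    return found[n - 1] if 1 <= n <= len(found) else None

root = ET.fromstring(SAMPLE)
first_child = nth_child(root, 1)            # the whole first PERSON element
dangling = nth_child(root, 15)              # no such child, so None
second_descendant = nth_descendant(root, 2) # the NAME inside the first PERSON
```

The returned objects are whole elements with their children, which mirrors the point that a positional XPointer refers to the entire element and not just its start tag.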

Table 20-2: The First Four Descendants of the Document Element

Points To:

<PERSON ID="p1">
  <NAME>Domeniquette Celeste Baudean</NAME>
  <BORN>11 Feb 1858</BORN>
  <DIED>12 Apr 1898</DIED>

<NAME>Domeniquette Celeste Baudean</NAME>

<BORN>21 Apr 1836</BORN>



Functions that Return Node Sets

XPointers are not limited to location paths. In fact, they can use any expression that returns a node set. In particular, they can use functions that return node sets. There are three of these: id(), here(), and origin().

The last two, here() and origin(), are XPointer extensions to XPath that are not available in XSLT.


The id() function is one of the simplest and most robust means of identifying an element node. It selects the element in the document that has an ID type attribute with a specified value. For example, consider a URI whose fragment identifier is xpointer(id("p12")).

If you look at Listing 20-1, you find this element:

<PERSON ID="p12" FATHER="p2" MOTHER="p1">
  <NAME>Honore Bellau</NAME>

Because ID type attributes are unique, you know there aren't any other elements that match this XPointer. Therefore, xpointer(id("p12")) must refer to Honore Bellau's PERSON element. Note that the XPointer points to the entire element to which it refers, including all its children, not just the start tag.

Since ID pointers are so common and so useful, there's also a shortcut for this. If all you want to do is point to a particular element with a particular ID, you can skip all the xpointer(id("")) frou-frou and just use the bare ID after the #, as in #p12.

You can only do this if all you want is the particular element with the particular ID. You cannot add additional relative location steps to a URI that uses this shortcut to select children of the element with ID p12 or the third attribute of the element with ID p12. If you want to do that, you have to use the full xpointer(id("p12")) syntax.

The disadvantage of the id() function is that it requires assistance from the targeted document. If the element you want to point to does not have an ID type attribute, you're out of luck. If other elements in the document have ID type attributes, you may be able to point to one of them and use a relative location step to point to the one you really want. Nonetheless, ID type attributes work best when you control both the targeted document and the linking document, so that you can ensure that the IDs match the links even as the documents evolve and change over time.

If the document does not have a DTD, then it cannot have any ID type attributes, although it may have attributes named ID. In this case, you can't point at anything using the id() function.

One possibility is to first use an id()-based XPointer, but back it up with an XPointer that looks for the attribute with the specific name anywhere in the document, ID in this example. Simply append the second XPointer to the first like this:


XPointers are evaluated from left to right. The first match found is returned, so the backup is only used if an ID type attribute with the value p12 can't be found.
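The left-to-right fallback can be sketched with Python's standard xml.etree.ElementTree. ElementTree does not read DTDs, so this sketch assumes the caller supplies the names of the attributes that are declared as ID type; the function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def by_declared_id(root, value, id_attributes):
    """Emulate id(): match only attributes the caller says are ID type."""
    for elem in root.iter():
        if any(elem.get(name) == value for name in id_attributes):
            return elem
    return None

def by_attribute_name(root, value, name="ID"):
    """Backup: any element whose attribute called `name` has this value."""
    return next((e for e in root.iter() if e.get(name) == value), None)

def resolve(root, value, id_attributes=()):
    """Try each pointer part from left to right; the first match wins."""
    attempts = (lambda: by_declared_id(root, value, id_attributes),
                lambda: by_attribute_name(root, value))
    for attempt in attempts:
        match = attempt()
        if match is not None:             # Elements without children are falsy
            return match
    return None

# Hypothetical document with no DTD, so no attribute is ID type.
root = ET.fromstring(
    '<FAMILYTREE><PERSON ID="p12"><NAME>Honore Bellau</NAME></PERSON></FAMILYTREE>')
found = resolve(root, "p12")              # id() part fails, backup succeeds
missing = resolve(root, "p99")
```

With id_attributes=("ID",) the first attempt would succeed and the backup would never run, which is the behavior the left-to-right rule describes.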


The second node set returning function is here(). However, it's only useful when used in conjunction with one or more relative location steps. In intradocument links, that is, links from one point in a document to another point in the same document, it's often necessary to refer to "the next element after this one," or "the parent element of this element." The here() function refers to the node that contains the XPointer so that such references are possible.

Consider Listing 20-2, a simple slide show. In this example, here()/../following::SLIDE[1] refers to the next slide in the show. here()/../preceding::SLIDE[1] refers to the previous slide in the show. Presumably, this would be used in conjunction with a style sheet that showed one slide at a time.

Listing 20-2: A slide show

<?xml version="1.0"?>
<SLIDESHOW xmlns:xlink="">
    <H1>Welcome to the slide show!</H1>
    <BUTTON xlink:type="simple"
    <H1>This is the second slide</H1>
    <BUTTON xlink:type="simple"
    <BUTTON xlink:type="simple"
    <H1>This is the third slide</H1>
    <BUTTON xlink:type="simple"
    <BUTTON xlink:type="simple"
    <H1>This is the last slide</H1>
    <BUTTON xlink:type="simple"

Generally, the here() function is only used in fully relative URIs in XLinks. If any URI part is included, it must be the same as the URI of the current document.
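To see what here()/../following::SLIDE[1] and here()/../preceding::SLIDE[1] would select, the navigation can be imitated in Python. This sketch makes several assumptions: the slide show below is a hypothetical document with invented content, slides are not nested, here() is taken to be the BUTTON element itself, and neighbor_slide is an illustrative helper rather than part of any XPointer library.

```python
import xml.etree.ElementTree as ET

# Hypothetical slide show; element names follow the chapter, content is invented.
SAMPLE = """<SLIDESHOW>
  <SLIDE><H1>One</H1><BUTTON>Next</BUTTON></SLIDE>
  <SLIDE><H1>Two</H1><BUTTON>Previous</BUTTON><BUTTON>Next</BUTTON></SLIDE>
  <SLIDE><H1>Three</H1><BUTTON>Previous</BUTTON></SLIDE>
</SLIDESHOW>"""

def neighbor_slide(root, button, offset):
    """Find the SLIDE holding `button` (here()/..), then step by `offset`.

    offset=+1 imitates following::SLIDE[1]; offset=-1 imitates
    preceding::SLIDE[1]. Returns None when there is no such slide.
    """
    slides = list(root.iter("SLIDE"))     # document order
    for i, slide in enumerate(slides):
        if any(child is button for child in slide):
            j = i + offset
            return slides[j] if 0 <= j < len(slides) else None
    return None

root = ET.fromstring(SAMPLE)
buttons = root.findall(".//BUTTON")
after_first = neighbor_slide(root, buttons[0], +1)   # slide "Two"
before_second = neighbor_slide(root, buttons[1], -1) # slide "One"
past_the_end = neighbor_slide(root, buttons[3], +1)  # no slide after the last
```

The same relative pointer works from every button because it is resolved against the place where the pointer itself occurs.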


The origin() function is much the same as here(); that is, it refers to the source of a link. However, origin() is used in out-of-line links where the link is not actually present in the source document. It points to the element in the source document from which the user activated the link.


Selecting a particular element or node is almost always good enough for pointing into well-formed XML documents. However, on occasion you may need to point into XML data in which large chunks of non-XML text are embedded via CDATA sections, comments, processing instructions, or some other means. In these cases, you may need to refer to particular ranges of text in the document that don't map onto any particular markup element. Or, you may need to point into non-XML substructure in the text content of particular elements; for example the month in a BORN element that looks like this:

<BORN>11 Feb 1858</BORN>

An XPath expression can identify an element node, an attribute node, a text node, a comment node, or a processing instruction node. However, it can't indicate the first two characters of the BORN element (the date) or the substring of text between the first space and the last space in the BORN element (the month).

XPointer generalizes XPath to allow identifiers like this. An XPointer can address points in the document and ranges between points. These may not correspond to any one node. For instance, the place between the X and the P in the word XPointer at the beginning of this paragraph is a point. The place between the t and the h in the word this at the end of the first sentence of this paragraph is another point. The text fragment "Pointer generalizes XPath to allow identifiers like t" between those two points is a range.

Every point is either between two nodes or between two characters in the parsed character data of a document. To make sense of this, you have to remember that parsed character data is part of a text node. For instance, consider this very simple but well-formed XML document:


There are exactly 3 nodes and 14 distinct points in this document. The nodes are the root node, which contains the GREETING element node, which contains a text node. In order the points are:

Points allow XPointers to indicate arbitrary positions in the parsed character data of a document. They do not, however, enable pointing at a position in the middle of a tag. In essence, what points add is the ability to break up the text content into smaller nodes, one for each character.

A point is selected by using the string-range() function to select a range, then using the start-point() or end-point() function to extract the first or last point from the range. For example, this XPointer selects the point immediately before the D in Domeniquette Celeste Baudean's NAME element:

xpointer(start-point(string-range(id('p1')/NAME,"Domeniquette")))

This XPointer selects the point after the last e in Domeniquette:

xpointer(end-point(string-range(id('p1')/NAME,"Domeniquette")))

You can also take the start-point() or end-point() of an element, text, comment, processing instruction, or root node to get the first or last point in that node.


Some applications need to specify a range across a document rather than a particular point in the document. For instance, the selection a user makes with a mouse is not necessarily going to match up with any one element or node. It may start in the middle of one paragraph, extend across a heading and a picture, and then into the middle of another paragraph two pages down.

Any such contiguous area of a document can be described with a range. A range begins at one point and continues until another point. The start and end points are each identified by a location path. If the starting path points to a node set rather than a point, then range-to() will return multiple ranges, one starting from the first point of each node in the set.

To specify a range, you append /range-to(end-point) to a location path specifying the start point of the range. The parentheses contain a location path specifying the end point of the range. For example, suppose you want to select everything between the first <PERSON> start tag and the last </PERSON> end tag in Listing 20-1. This XPointer accomplishes that:


Range functions

XPointer includes several functions specifically for working with ranges. Most of these operate on location sets. A location set is just a node set that can also contain points and ranges, as well as nodes.

The range(location-set) function returns a location set containing one range for each location in the argument. The range is the minimum range necessary to cover the entire location. In essence, this function converts locations to ranges.

The range-inside(location-set) function returns a location set containing the interiors of each of the locations in the input. That is, if one of the locations is an element, then the location returned is the content of the element (but not including the start and end tags). However, if the input location is a range or point, then the interior of the location is just the same as the range or point.

The start-point(location-set) function returns a location set that contains the first point of each location in the input location set. For example, start-point(//PERSON[1]) returns the point immediately after the first <PERSON> start tag in the document. start-point(//PERSON) returns the set of points immediately after each <PERSON> start tag.

The end-point(location-set) function acts the same as start-point() except that it returns the last point of each location in its input.
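One way to picture these functions is to model a point as a (container, index) pair over a DOM tree. The sketch below uses Python's standard xml.dom.minidom and is an assumption-laden illustration, not an implementation of the XPointer draft: the function names mirror the XPointer ones, the end point is taken to be the last point inside a node, and the sample document is hypothetical.

```python
from xml.dom import minidom

def start_point(node):
    """First point inside a node, as a (container, index) pair."""
    return (node, 0)

def end_point(node):
    """Last point inside a node: index is the text length or the child count."""
    if node.nodeType in (node.TEXT_NODE, node.COMMENT_NODE):
        return (node, len(node.data))
    return (node, len(node.childNodes))

def range_inside(node):
    """Range covering a node's content but not the node itself."""
    return (start_point(node), end_point(node))

def covering_range(node):
    """Smallest range covering the whole node, expressed in its parent."""
    parent = node.parentNode
    index = next(i for i, sibling in enumerate(parent.childNodes)
                 if sibling is node)
    return ((parent, index), (parent, index + 1))

# Hypothetical document written without white space between the tags.
doc = minidom.parseString(
    '<FAMILYTREE><PERSON ID="p1"><NAME>A</NAME><BORN>B</BORN></PERSON>'
    '<PERSON ID="p2"/></FAMILYTREE>')
first_person = doc.getElementsByTagName("PERSON")[0]
inside = range_inside(first_person)       # from before NAME to after BORN
covering = covering_range(first_person)   # child 0 to child 1 of FAMILYTREE
```

The difference between inside and covering is the difference between range-inside() and range(): the first stops short of the element's own tags, the second includes them.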

String ranges

XPointer provides some very basic string-matching capabilities through the string-range() function. This function takes as an argument a location set to search and a substring to search for. It returns a location set containing one range for each nonoverlapping matching substring. You can also provide optional index and length arguments indicating which character of the match the range should start with and how many characters the range should cover. The basic syntax is:

string-range(location-set, substring, index, length)

The first argument is an XPath expression that returns a location set specifying which part of the document to search for a matching string. The second substring argument is the actual string to search for. By default, the range returned starts before the first matched character and encompasses all the matched characters. However, the index argument can give a positive number to start after the beginning of the match. For instance, setting it to 2 indicates that the range starts with the second matched character. The length argument can specify how many characters to include in the range.

A string range points to an occurrence of a specified string, or a substring of a given string in the text (not markup) of the document. For example, this XPointer finds all occurrences of the string Harold:


You can change the first argument to specify what nodes you want to look in. For example, this XPointer finds all occurrences of the string Harold in NAME elements:


String ranges may have predicates. For example, this XPointer finds only the first occurrence of the string Harold in the document:


This targets the position immediately preceding the word Harold in Charles Walter Harold's NAME element. This is not the same as pointing at the entire NAME element as an element-based selector would do.

A third numeric argument targets a particular position in the string. For example, this targets the point between the l and d in the first occurrence of the string Harold because d is the sixth letter:


An optional fourth argument specifies the number of characters to select. For example, this URI selects the old from the first occurrence of the entire string Harold:


If the first string argument in the node test is the empty string, then relevant positions in the context node's text contents are selected. For example, the following XPointer targets the first six characters of the document's parsed character data:


For another example, let's suppose that you want to find the year of birth for all people born in the nineteenth century. The following will accomplish that:

xpointer(string-range(//BORN, " 18", 2, 4))

This says to look in all BORN elements for the string " 18". (The initial space is important to avoid accidentally matching someone born in 1918 or on the 18th day of the month.) When it's found, move one character ahead (to skip the space) and return a range covering the next four characters.

When matching strings, case is considered. Markup characters are ignored.
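The index and length arithmetic is easy to get wrong, so here is a small Python model of string-range() over a single string. It is a sketch only: it searches plain text rather than a location set, it does not model how markup is skipped, and the names string_range and selected_text are illustrative. The sample dates other than 11 Feb 1858 and 12 Apr 1898 are invented.

```python
def string_range(text, substring, index=1, length=None):
    """Return (start, end) offsets, one pair per nonoverlapping match.

    index is 1-based and counts from the first matched character;
    length defaults to the length of the matched string.
    """
    if length is None:
        length = len(substring)
    step = max(1, len(substring))         # always advance, even for ""
    ranges = []
    position = text.find(substring)       # str.find is case-sensitive
    while position != -1:
        start = position + index - 1
        ranges.append((start, start + length))
        position = text.find(substring, position + step)
    return ranges

def selected_text(text, ranges):
    """Return the characters each range covers."""
    return [text[start:end] for start, end in ranges]

births = ["11 Feb 1858", "18 Mar 1918", "12 Apr 1898"]
years = [selected_text(b, string_range(b, " 18", 2, 4)) for b in births]

name = "Charles Walter Harold"
old = selected_text(name, string_range(name, "Harold", 4, 3))
wrong_case = string_range(name, "harold")
```

The second date shows why the leading space in " 18" matters: neither the 18th of the month nor the year 1918 is matched.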

Child Sequences

The two most common ways to identify an element in an XML document are by ID and by location. Identifying an element by ID is accomplished through the id() function. Identifying an element by location is generally accomplished by counting children down from the root. For example, the following XPointers both point to John P. Muller's PERSON element:

xpointer(id("p4"))

xpointer(/child::*[position()=1]/child::*[position()=4])

A child sequence is a shortcut for XPointers like the second example above, that is, an XPointer that consists of nothing but a series of child relative location steps counting down from the root node, each of which selects a particular child by position only. The shortcut is to use only the position number and the slashes that separate individual elements from each other, like this:

/1/4 is a child sequence that selects the fourth child element of the first child element of the root. This syntax can be extended for any depth of child elements. For example, these two URIs point to John P. Muller's NAME and SPOUSE elements respectively:

Child sequences may include an initial ID. In that case, the counting begins from the element with that ID rather than from the root. For example, John P. Muller's PERSON element has an ID attribute with the value p4. Consequently, p4/1 points to his NAME element and p4/2 points to his SPOUSE element.

Each child sequence always points to a single element. You cannot use child sequences with any other relative location steps. You cannot use them to select elements of a particular type. You cannot use them to select attributes or strings. You can only use them to select a single element by its relative location in the tree.
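Resolving a child sequence is a simple walk down the tree, which the following Python sketch illustrates with the standard xml.etree.ElementTree module. The function resolve_child_sequence is a hypothetical helper, the sample document is a shortened stand-in for Listing 20-1 with invented names for the second and third people, and any attribute named ID is treated as an ID because ElementTree does not read DTDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for Listing 20-1.
SAMPLE = """<FAMILYTREE>
  <PERSON ID="p1"><NAME>Domeniquette Celeste Baudean</NAME></PERSON>
  <PERSON ID="p2"><NAME>Second Person</NAME></PERSON>
  <PERSON ID="p3"><NAME>Third Person</NAME></PERSON>
  <PERSON ID="p4"><NAME>John P. Muller</NAME><SPOUSE PERSON="p5"/></PERSON>
</FAMILYTREE>"""

def resolve_child_sequence(root, sequence, id_attribute="ID"):
    """Resolve sequences such as "/1/4" or "p4/1"; None if nothing matches."""
    head, *steps = sequence.split("/")
    if head:
        # Initial ID: start counting from the element that carries it.
        current = next((e for e in root.iter()
                        if e.get(id_attribute) == head), None)
    else:
        # Leading slash: the root node's only element child is the document element.
        if not steps or steps[0] != "1":
            return None
        current, steps = root, steps[1:]
    for step in steps:
        if current is None:
            return None
        children = list(current)          # child elements only
        n = int(step)
        current = children[n - 1] if 1 <= n <= len(children) else None
    return current

root = ET.fromstring(SAMPLE)
person = resolve_child_sequence(root, "/1/4")
name = resolve_child_sequence(root, "/1/4/1")
spouse = resolve_child_sequence(root, "p4/2")
dangling = resolve_child_sequence(root, "/1/9")
```

Each step can only move to one child by number, which is why a child sequence always ends on a single element or on nothing at all.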


In this chapter, you learned about XPointers. In particular you learned that:

The next chapter explores the Resource Description Framework, RDF, an XML application for encoding metadata.


Copyright 2001, 2002 Elliotte Rusty Harold
Last Modified December 31, 2002