Much of the code in this book has involved navigating the tree structure of an XML document to find particular nodes. For example, the XML-RPC servlet read a client request looking for int elements. Such code can become quite involved and fragile if you aren’t very careful. As the code walks down the tree hierarchy, loading one child after the other, a single misplaced or misnamed element may cause the program to fail. If an element isn’t where it’s expected to be, the chain of method calls that gives directions to the desired elements will be broken. What’s needed is a way to specify which nodes a program needs without explicitly specifying how the program navigates to those nodes.

Queries

XPath can be thought of as a query language like SQL. However, rather than extracting information from a database, it extracts information from an XML document. An example should help make this more concrete. Consider a simple weather report document.
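As a rough sketch of what such a query looks like from Java, the following class uses the javax.xml.xpath API bundled with current JDKs. That choice is an assumption on my part; it is not one of the engines discussed later in this chapter. The weather document, including every element and attribute name in it, is invented purely for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class WeatherQuery {

    // Hypothetical weather report; the vocabulary is invented for this sketch.
    static final String WEATHER =
        "<weather time='2002-06-06T15:35:00-05:00'>"
      + "<report latitude='41.2' longitude='-71.6'>"
      + "<locality>Block Island</locality>"
      + "<temperature units='C'>16</temperature>"
      + "<humidity>88</humidity>"
      + "</report></weather>";

    /** Evaluates an XPath expression against the sample and returns its string-value. */
    public static String query(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(WEATHER)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // The query says which node is wanted, not how to walk to it.
        System.out.println(query("/weather/report/temperature"));
        System.out.println(query("/weather/report/@latitude"));
    }
}
```

Notice that the program never loops over children or checks node types itself. The expression carries all the navigation.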

The XPath Data Model

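An XPath query operates on a namespace well-formed XML document after it has been parsed into a tree structure. The particular tree model XPath uses divides each XML document into seven kinds of nodes: the root node, element nodes, attribute nodes, text nodes, comment nodes, processing instruction nodes, and namespace nodes.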

Location Paths

Although there are many different kinds of XPath expressions, the one that’s of primary use in Java programs is the location path. A location path selects a set of nodes from an XML document. Each location path is composed of one or more location steps. Each location step has an axis, a node test and, optionally, one or more predicates. Furthermore, each location step is evaluated with respect to a particular context node. A double colon (::) separates the axis from the node test, and each predicate is enclosed in square brackets.
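To make the anatomy of a location step concrete, here is a minimal sketch that evaluates unabbreviated steps of the form axis::node-test[predicate] against a context node. It assumes the JDK's javax.xml.xpath API rather than any engine covered in this chapter, and the order document is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class LocationStepDemo {

    // Hypothetical document; the context node is its root element, order.
    static final String ORDER =
        "<order><item sku='a1'>pen</item><item sku='b2'>ink</item><note>rush</note></order>";

    /** Evaluates the expression with the order element as the context node. */
    public static String evaluate(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(ORDER)));
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // axis = child, node test = item, no predicate
        System.out.println(evaluate("child::item"));
        // axis = child, node test = item, one predicate on the attribute axis
        System.out.println(evaluate("child::item[attribute::sku='b2']"));
    }
}
```

When a step selects more than one node and a string is requested, the string-value of the first node in document order is returned, which is why the first call yields only one item.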

Axes

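There are thirteen axes along which a location step can move. Each selects a different subset of the nodes in the document, depending on the context node. These are self, child, descendant, descendant-or-self, parent, ancestor, ancestor-or-self, preceding, preceding-sibling, following, following-sibling, attribute, and namespace.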
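The effect of the context node on each axis is easiest to see by experiment. The sketch below, which again assumes the JDK's javax.xml.xpath API and a made-up six-element document, evaluates a step from the c element and reports the names of the nodes it selects.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AxisDemo {

    // Hypothetical tree: a contains b, c, and f; c contains d and e.
    static final String TREE = "<a><b/><c><d/><e/></c><f/></a>";

    /** Evaluates one location step from the c element and lists the selected node names. */
    public static List<String> select(String step) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TREE)));
            Node context = doc.getElementsByTagName("c").item(0);
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(step, context, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(hits.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + step, e);
        }
    }

    public static void main(String[] args) {
        String[] steps = {"self::*", "child::*", "parent::*", "descendant-or-self::*",
                          "preceding-sibling::*", "following-sibling::*", "attribute::*"};
        for (String step : steps) {
            System.out.println(step + " selects " + select(step));
        }
    }
}
```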

Node Tests

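The axis chooses the direction to move from the context node. The node test determines what kinds of nodes will be selected along that axis. The node tests are a name (such as Quote or stk:Quote), prefix:* for any element or attribute in the namespace the prefix is bound to, * for any node of the axis’s principal node type, node(), text(), comment(), processing-instruction(), and processing-instruction('target').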
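Here is a small sketch that counts how many children of one element each node test matches. It assumes the JDK's javax.xml.xpath API, and the mixed-content paragraph is hypothetical; it was chosen only because it contains one of each interesting child node kind.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NodeTestDemo {

    // Hypothetical mixed content: text, comment, element, processing instruction, text.
    static final String PARAGRAPH =
        "<p>Hello<!--note--><b>bold</b><?target data?> world</p>";

    /** Counts the children of the p element that match the given node test. */
    public static int count(String nodeTest) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(PARAGRAPH)));
            Double n = (Double) XPathFactory.newInstance().newXPath().evaluate(
                "count(child::" + nodeTest + ")", doc.getDocumentElement(),
                XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate node test: " + nodeTest, e);
        }
    }

    public static void main(String[] args) {
        String[] tests = {"node()", "text()", "comment()", "processing-instruction()",
                          "processing-instruction('target')", "*", "b"};
        for (String test : tests) {
            System.out.println("child::" + test + " matches " + count(test));
        }
    }
}
```

On the child axis the principal node type is element, so * matches only the b element even though p has five children.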

Predicates

Each location step can have zero or more predicates that further filter the node-set. A predicate is an XPath expression in square brackets that is evaluated for each node selected by the location step. If the predicate is true, then the node is kept in the node-set. If the predicate is false, then the node is removed from the node-set. For example, given the same SOAP request document, suppose the context node is now the SOAP-ENV:Body element and that the stk prefix is mapped to the http://namespaces.cafeconleche.org/xmljava/ch2/ namespace URI. This location step returns a node-set containing all the Quote elements whose price is less than ten:
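A predicate of this kind can be tried out with the sketch below. It assumes the JDK's javax.xml.xpath API, and the request document is a stand-in of my own: the symbol attribute, the three quotes, and their prices are invented, and only the stk namespace URI comes from the text above. The step child::stk:Quote[child::stk:Price < 10] is one plausible way to write the filter, not necessarily the only one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PredicateDemo {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String STK_NS = "http://namespaces.cafeconleche.org/xmljava/ch2/";

    // Hypothetical request; the quotes and the symbol attribute are invented.
    static final String REQUEST =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV='" + SOAP_NS + "' xmlns:stk='" + STK_NS + "'>"
      + "<SOAP-ENV:Body>"
      + "<stk:Quote symbol='AAA'><stk:Price>7.25</stk:Price></stk:Quote>"
      + "<stk:Quote symbol='BBB'><stk:Price>42.00</stk:Price></stk:Quote>"
      + "<stk:Quote symbol='CCC'><stk:Price>9.99</stk:Price></stk:Quote>"
      + "</SOAP-ENV:Body></SOAP-ENV:Envelope>";

    /** Evaluates a step from the Body element and returns the symbols of the quotes kept. */
    public static List<String> symbolsMatching(String step) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed names to match
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(REQUEST)));
            Node body = doc.getElementsByTagNameNS(SOAP_NS, "Body").item(0);

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "stk".equals(prefix) ? STK_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }          // unused here
                public Iterator<String> getPrefixes(String uri) { return null; } // unused here
            });

            NodeList quotes = (NodeList) xpath.evaluate(step, body, XPathConstants.NODESET);
            List<String> symbols = new ArrayList<>();
            for (int i = 0; i < quotes.getLength(); i++) {
                symbols.add(((Element) quotes.item(i)).getAttribute("symbol"));
            }
            return symbols;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + step, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(symbolsMatching("child::stk:Quote[child::stk:Price < 10]"));
    }
}
```

Predicates can be stacked, in which case each one filters the node-set left by the one before it.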

Compound Location Paths

The forward slash (/) combines location steps into a location path. The node-set selected by the first step becomes the context node-set for the second step. The node-set identified by the second step becomes the context node-set for the third step, and so on.

Absolute Location Paths

So far all the location paths have been relative to a specified context node. To date, I’ve just identified that context node in prose. When we begin discussing XPath APIs, you’ll see that most methods for evaluating an XPath expression have a context node argument. However, not all location paths require context nodes. In particular, a location path that begins with a forward slash (/) is an absolute path that starts at the root node of the document (not the root element but the root node).
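The difference between relative and absolute paths shows up as soon as the same expression is evaluated from two different context nodes. The following sketch assumes the JDK's javax.xml.xpath API and a hypothetical library document. The leading comment in that document is there only to show that the root node is not the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AbsolutePathDemo {

    // Hypothetical document: a comment, then a library with two shelves.
    static final String LIBRARY =
        "<!--catalog--><library>"
      + "<shelf><book>A</book></shelf>"
      + "<shelf><book>B</book><book>C</book></shelf>"
      + "</library>";

    /** Counts the nodes the path selects, using the shelf at shelfIndex as context node. */
    public static int count(String path, int shelfIndex) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(LIBRARY)));
            Node context = doc.getElementsByTagName("shelf").item(shelfIndex);
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate("count(" + path + ")", context, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + path, e);
        }
    }

    public static void main(String[] args) {
        String relative = "child::book";
        String absolute = "/child::library/child::shelf/child::book";
        for (int shelf = 0; shelf < 2; shelf++) {
            System.out.println("shelf " + shelf + ": relative=" + count(relative, shelf)
                + " absolute=" + count(absolute, shelf));
        }
        // The root node has two children here: the comment and the root element.
        System.out.println(count("/child::node()", 0) + " vs " + count("/child::*", 0));
    }
}
```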

Abbreviated Location Paths

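XPath location paths can use a handful of abbreviations: the child axis is the default and can be omitted, @ stands for attribute::, // stands for /descendant-or-self::node()/, . stands for self::node(), and .. stands for parent::node(). The semantics are the same. The syntax is just a little easier to type.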
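One way to convince yourself that an abbreviation really is just shorthand is to evaluate both spellings and check that they select exactly the same nodes. This sketch does that with the JDK's javax.xml.xpath API (my assumption, not an engine from this chapter) over a hypothetical two-section document, using the first sec element as the context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AbbreviationDemo {

    // Hypothetical document; the context node is the first sec element.
    static final String DOC =
        "<doc><sec id='1'><p>x</p></sec><sec id='2'><p>y</p><p>z</p></sec></doc>";

    /** Returns true if both paths select the same non-empty sequence of nodes. */
    public static boolean sameNodes(String abbreviated, String unabbreviated) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(DOC)));
            Node context = doc.getElementsByTagName("sec").item(0);
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList a = (NodeList) xpath.evaluate(abbreviated, context, XPathConstants.NODESET);
            NodeList b = (NodeList) xpath.evaluate(unabbreviated, context, XPathConstants.NODESET);
            if (a.getLength() == 0 || a.getLength() != b.getLength()) {
                return false;
            }
            for (int i = 0; i < a.getLength(); i++) {
                if (!a.item(i).isSameNode(b.item(i))) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot compare paths", e);
        }
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"p", "child::p"},
            {"@id", "attribute::id"},
            {".", "self::node()"},
            {"..", "parent::node()"},
            {"//p", "/descendant-or-self::node()/child::p"},
        };
        for (String[] pair : pairs) {
            System.out.println(pair[0] + " == " + pair[1] + ": " + sameNodes(pair[0], pair[1]));
        }
    }
}
```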

Combining Location Paths

Occasionally it’s useful to select a node-set that’s built from multiple, more or less unrelated parts of an XML document. For example, you might want to select all the Price elements and all the Quote elements in a document. //stk:Price selects all the prices. //stk:Quote selects all the quotes. You can use the vertical bar, |, to combine these two node-sets into one.
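A quick sketch of the union, written as //stk:Price | //stk:Quote, follows. It assumes the JDK's javax.xml.xpath API and a hypothetical two-quote document of my own; only the namespace URI is taken from earlier in the chapter.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class UnionDemo {

    static final String STK_NS = "http://namespaces.cafeconleche.org/xmljava/ch2/";

    // Hypothetical document with two quotes, each holding one price.
    static final String QUOTES =
        "<stk:Quotes xmlns:stk='" + STK_NS + "'>"
      + "<stk:Quote><stk:Price>7.25</stk:Price></stk:Quote>"
      + "<stk:Quote><stk:Price>42.00</stk:Price></stk:Quote>"
      + "</stk:Quotes>";

    /** Counts the nodes in the node-set the expression returns. */
    public static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(QUOTES)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "stk".equals(prefix) ? STK_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }          // unused here
                public Iterator<String> getPrefixes(String uri) { return null; } // unused here
            });
            Double n = (Double) xpath.evaluate(
                "count(" + expression + ")", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//stk:Price"));
        System.out.println(count("//stk:Quote"));
        System.out.println(count("//stk:Price | //stk:Quote"));
    }
}
```

Because the result is a set, a node selected by both operands appears in it only once.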

Expressions

Not all XPath expressions are location paths. In fact, you’ve already seen several that weren’t. The content of the square brackets in a location step predicate is a more generic form of XPath expression. Each XPath 1.0 expression returns one of these four types: a node-set, a boolean, a number, or a string.
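In the JDK's javax.xml.xpath API, which I am assuming here in place of the engines this chapter covers, the caller names the desired return type with a constant from XPathConstants. The sketch below wraps each of the four types in its own helper; the order document and its qty attribute are hypothetical.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FourTypesDemo {

    // Hypothetical document used by all four helpers.
    static final String ORDER =
        "<order><item qty='2'>pen</item><item qty='3'>ink</item></order>";

    private static Object evaluate(String expression, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(ORDER)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc, type);
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + expression, e);
        }
    }

    /** Node-set result, reported as the number of nodes selected. */
    public static int nodeCount(String expression) {
        return ((NodeList) evaluate(expression, XPathConstants.NODESET)).getLength();
    }

    /** Boolean result. */
    public static boolean asBoolean(String expression) {
        return (Boolean) evaluate(expression, XPathConstants.BOOLEAN);
    }

    /** Number result; XPath numbers are always doubles. */
    public static double asNumber(String expression) {
        return (Double) evaluate(expression, XPathConstants.NUMBER);
    }

    /** String result. */
    public static String asString(String expression) {
        return (String) evaluate(expression, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(nodeCount("//item"));
        System.out.println(asBoolean("count(//item) > 1"));
        System.out.println(asNumber("sum(//item/@qty)"));
        System.out.println(asString("//item[1]"));
    }
}
```

Asking for a type other than the expression's natural one triggers the standard XPath conversions, so a node-set requested as a boolean is true exactly when it is non-empty.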

Literals

XPath defines literal forms for strings and numbers. Numbers have more or less the same form as double literals in Java. That is, they look like 72.5, -72.5, .5321, and so forth. XPath only uses floating point arithmetic, so integers like 42, -23, and 0 are also number literals. However, XPath does not recognize scientific notation such as 5.5E-10 or 6.022E23.

Operators

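XPath provides the following operators for basic floating point arithmetic: + for addition, - for subtraction, * for multiplication, div for division, and mod for taking the remainder.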
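Since none of these operators needs a document, they are easy to experiment with. The following sketch, which assumes the JDK's javax.xml.xpath API, evaluates arithmetic expressions against an empty DOM document and returns the result as a Java double.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class ArithmeticDemo {

    /** Evaluates a context-free XPath arithmetic expression. */
    public static double calc(String expression) {
        try {
            // An empty document serves as a placeholder context.
            Document empty = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            return (Double) XPathFactory.newInstance().newXPath()
                .evaluate(expression, empty, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {"2 + 3 * 4", "7 - 2", "7 div 2", "7 mod 2",
                                "-7 mod 2", "1 div 0", "0 div 0"};
        for (String expression : expressions) {
            System.out.println(expression + " = " + calc(expression));
        }
    }
}
```

Division is spelled div because the forward slash is already taken by location paths. All arithmetic is IEEE 754 double arithmetic, so dividing by zero produces an infinity or NaN rather than an exception.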

Functions

XPath defines a number of useful functions that operate on and return the four fundamental XPath data types. Some of these take variable numbers of arguments. In the list below, optional arguments are suffixed with a question mark. A function that doesn’t have any arguments normally operates on the context node instead. For the most part these functions are weakly typed. You can pass any of the four types in the place of an argument that is declared to be of type boolean, number, or string. XPath will convert it and use it. The exceptions are those functions that are declared to take node-sets as arguments. XPath cannot convert arguments of other types to node-sets.
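The weak typing is easiest to appreciate by deliberately passing the "wrong" type. This sketch assumes the JDK's javax.xml.xpath API and a hypothetical one-element greeting document, which also serves as the context node for functions called with no arguments.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class FunctionDemo {

    // Hypothetical document; its root element is the context node.
    static final String GREETING = "<greeting>  Hello,   XPath  </greeting>";

    /** Evaluates the expression and converts the result to a string. */
    public static String call(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(GREETING)));
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "string-length(12345)",        // number passed where a string is expected
            "concat('total: ', 1 + 1)",    // number converted to a string
            "boolean('false')",            // any non-empty string converts to true
            "number('abc')",               // unparseable strings convert to NaN
            "normalize-space()",           // no argument: uses the context node
            "name()",                      // no argument: uses the context node
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + call(expression));
        }
    }
}
```

The boolean('false') case is a classic trap: the conversion looks only at whether the string is empty, not at what it says.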

XPath Engines

There are several good open source XPath engines for Java, most distributed as part of XSLT processors. These include:

XPath with Saxon

The Saxon 6.5 API is rather convoluted and involves over 200 different classes in 18 different packages. Fortunately you can ignore most of these for basic XPath searching. The most common sequence of steps to search a document is:

XPath with Xalan

The Xalan-J XSLT processor from the Apache XML Project also includes an XPath API that’s useful for navigation in DOM programs. Under the hood, the basic design is strikingly similar to Saxon’s for two independently developed programs. However, Xalan does have one class that Saxon doesn’t, org.apache.xpath.XPathAPI, which significantly simplifies life for developers. This class provides static methods that handle many simple use cases without lots of preliminary configuration.

DOM Level 3 XPath

The Saxon API only works with Saxon. The Xalan API only works with Xalan. Both only work with Java. The W3C DOM Working Group is attempting to define a standard, cross-engine XPath API that can be used with many different XPath engines (though as of Summer 2002 this effort is just beginning and is not yet supported by any implementations). DOM Level 3 includes an optional XPath module in the org.w3c.dom.xpath package. The feature string "XPath" with the version "3.0" tests for the presence of this module. For example,

Namespace Bindings

Because the SOAP request document uses namespace qualified elements, however, we’ll first have to provide some namespace bindings that can be used when evaluating the XPath expression. The XPathNSResolver interface provides the namespace bindings. Although you can implement this in any convenient class, an instance is normally created by passing a Node with all necessary bindings to the createNSResolver() method of the XPathEvaluator interface. For example, this code uses JAXP to build a very simple document whose document element binds the prefix SOAP to the URI http://schemas.xmlsoap.org/soap/envelope/ and the prefix f to the URI http://namespaces.cafeconleche.org/xmljava/ch3/. Then that document element is passed to the XPathEvaluator’s createNSResolver() method to create an XPathNSResolver object that has the same namespace bindings as the synthetic node we created.
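For comparison, the same idea can be sketched with the JDK's javax.xml.xpath API, where the role of XPathNSResolver is played by javax.xml.namespace.NamespaceContext. This is a swapped-in analogue and not the DOM Level 3 interface itself. The bindings helper and the request document are hypothetical; the two prefixes and URIs are the ones named above. Note that the document spells the SOAP prefix differently and uses a default namespace, because only the URIs have to agree.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceBindingDemo {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String F_NS = "http://namespaces.cafeconleche.org/xmljava/ch3/";

    // Hypothetical request; the element names are assumptions for this sketch.
    static final String REQUEST =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV='" + SOAP_NS + "'><SOAP-ENV:Body>"
      + "<calculateFibonacci xmlns='" + F_NS + "'>10</calculateFibonacci>"
      + "</SOAP-ENV:Body></SOAP-ENV:Envelope>";

    /** Hypothetical helper: builds a NamespaceContext from a prefix-to-URI map. */
    static NamespaceContext bindings(final Map<String, String> prefixToUri) {
        return new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }
            public String getPrefix(String uri) {
                for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
                    if (entry.getValue().equals(uri)) return entry.getKey();
                }
                return null;
            }
            public Iterator<String> getPrefixes(String uri) {
                String prefix = getPrefix(uri);
                List<String> found = (prefix == null)
                    ? Collections.<String>emptyList() : Collections.singletonList(prefix);
                return found.iterator();
            }
        };
    }

    /** Evaluates the expression with the SOAP and f prefixes bound. */
    public static String evaluate(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(REQUEST)));
            Map<String, String> prefixes = new HashMap<>();
            prefixes.put("SOAP", SOAP_NS);
            prefixes.put("f", F_NS);
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(bindings(prefixes));
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("/SOAP:Envelope/SOAP:Body/f:calculateFibonacci"));
    }
}
```

An unprefixed name in an XPath 1.0 expression always means "in no namespace", so leaving the f prefix off the last step selects nothing even though the element in the document has no prefix either.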

Snapshots

Iterators like this one are only good for a single pass. You cannot reuse them or back up in them. Furthermore, if the Document object over which the iterator is traversing changes before you're finished with the iterator (e.g. a node in the iterator is deleted from the Document object) then iterateNext() throws a DOMException with the code INVALID_STATE_ERR.

Compiled Expressions

An XPath engine that implements the DOM XPath API may need to compile the expression into some internal form rather than simply keeping it as a generic String. The XPathExpression interface represents such a compiled expression.
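The compile-once, evaluate-many pattern can be sketched with the JDK's javax.xml.xpath API, which has its own XPathExpression interface. To be clear, that is javax.xml.xpath.XPathExpression and not the org.w3c.dom.xpath interface described here; I am using it only as a readily available analogue. The quotes document is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CompiledExpressionDemo {

    // Hypothetical document with two quotes.
    static final String QUOTES =
        "<quotes><quote><price>7.25</price></quote><quote><price>42</price></quote></quotes>";

    /** Compiles one expression and reuses it with each quote as the context node. */
    public static List<String> prices() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(QUOTES)));
            // Parse and compile the expression a single time.
            XPathExpression price = XPathFactory.newInstance().newXPath()
                .compile("child::price");
            NodeList quotes = doc.getElementsByTagName("quote");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < quotes.getLength(); i++) {
                // Only the context node changes between evaluations.
                result.add(price.evaluate(quotes.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate compiled expression", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prices());
    }
}
```

Whether compiling ahead of time saves anything measurable depends on the engine and on how often the expression is reused.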

Jaxen

The Jaxen Java XPath Engine is an open source cross-API (DOM, JDOM, dom4j, and ElectricXML) XPath library for Java. Whereas DOM3 XPath attempts to be a cross-implementation, cross-language XPath API for the Document Object Model alone, Jaxen attempts to be a cross-model API for its own XPath engine. It is a Java class library that can operate on various XML object models using a standard engine rather than an API that can be offered by many different engines for one model. It allows you to pass DOM, JDOM, dom4j, and ElectricXML objects directly to XPath functions like position() and translate(). Furthermore, whereas DOM 3 XPath offers fairly rudimentary interfaces for evaluating XPath expressions in a particular document against a context node, Jaxen is a more complete object model for XPath expressions.

Summary

XPath is a straightforward declarative language for selecting particular subsets of nodes from an XML document. Its data model is not quite the same as DOM’s, but that’s normally not a major problem. In fact, in some cases, such as taking the string-value of an element, the XPath data model is likely to be a lot closer to what you want than the DOM data model.