Jaxen

The Jaxen Java XPath Engine is an open source, cross-API (DOM, JDOM, dom4j, and ElectricXML) XPath library for Java. Whereas DOM 3 XPath attempts to be a cross-implementation, cross-language XPath API for the Document Object Model alone, Jaxen attempts to be a cross-model API for its own XPath engine. It is a Java class library that applies one standard engine to various XML object models, rather than an API that many different engines can offer for one model. It allows you to pass DOM, JDOM, dom4j, and ElectricXML objects directly to XPath functions like position() and translate(). Furthermore, whereas DOM 3 XPath offers fairly rudimentary interfaces for evaluating XPath expressions in a particular document against a context node, Jaxen is a more complete object model for XPath expressions.

Jaxen’s class library is quite large, on a par with Saxon’s or Xalan’s. It includes classes representing each of XPath’s functions, iterators, node tests, and axes. Fortunately, you don’t need to know all that to perform simple searches. Indeed, Jaxen’s basic API is probably the simplest of the XPath APIs discussed in this chapter. The main interface you need is XPath. There are four implementations of this interface in four different packages, one each for DOM, JDOM, dom4j, and ElectricXML: org.jaxen.dom.DOMXPath, org.jaxen.jdom.JDOMXPath, org.jaxen.dom4j.Dom4jXPath, and org.jaxen.exml.ElectricXPath. I’ll demonstrate this API with the DOM implementation, but the patterns are much the same for the other three APIs.

The following steps search an XML document with Jaxen:

  1. Construct an XPath object by passing a String containing an XPath expression to the model-specific constructor:

    public DOMXPath(String expression)
        throws JaxenException;

    public JDOMXPath(String expression)
        throws JaxenException;

    public Dom4jXPath(String expression)
        throws JaxenException;

    public ElectricXPath(String expression)
        throws JaxenException;

  2. Set the namespace bindings by calling addNamespace() for each namespace binding the XPath expression uses:

    public void addNamespace(String prefix, String uri)
        throws JaxenException;

    (You can skip this step if the XPath expression doesn’t use any prefixed names.)

  3. Invoke one of the following methods to evaluate the expression, depending on what type of result you expect or want:

    public Object evaluate(Object context)
        throws JaxenException;

    public List selectNodes(Object context)
        throws JaxenException;

    public Object selectSingleNode(Object context)
        throws JaxenException;

    public String stringValueOf(Object context)
        throws JaxenException;

    public boolean booleanValueOf(Object context)
        throws JaxenException;

    public Number numberValueOf(Object context)
        throws JaxenException;

For an example, let’s rewrite the Fibonacci SOAP client one last time, this time using Jaxen:

  public static void readResponse(InputStream in) 
   throws IOException, SAXException, TransformerException,
   ParserConfigurationException, JaxenException {

    DocumentBuilderFactory factory 
     = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);   
    DocumentBuilder builder = factory.newDocumentBuilder();    
    
    InputSource data = new InputSource(in);
    Node doc = builder.parse(data);
    
    // There are different XPath classes in different packages
    // for the different APIs Jaxen supports
    XPath expression = new org.jaxen.dom.DOMXPath(
     "/SOAP:Envelope/SOAP:Body/f:Fibonacci_Numbers/f:fibonacci");
    expression.addNamespace("f", 
     "http://namespaces.cafeconleche.org/xmljava/ch3/");
    expression.addNamespace("SOAP", 
     "http://schemas.xmlsoap.org/soap/envelope/");
    Navigator navigator = expression.getNavigator();

    List results = expression.selectNodes(doc);
    Iterator iterator = results.iterator();
    while (iterator.hasNext()) {
      Node result = (Node) iterator.next();
      String value = StringFunction.evaluate(result, navigator);
      System.out.println(value);
    }

  }

As usual, JAXP first reads the document from the InputStream. Next, a new Jaxen DOMXPath object is constructed from the String form of the XPath expression. If I were using Jaxen on top of JDOM, I would have constructed an org.jaxen.jdom.JDOMXPath object here instead; on top of dom4j, an org.jaxen.dom4j.Dom4jXPath object.

After the expression is created, I immediately bind all the namespace prefixes it uses by invoking the addNamespace() method. This location path uses two different namespace prefixes, so I call addNamespace() twice. Then I get the expression’s Navigator by invoking getNavigator(). In more advanced programs, you can use the Navigator class directly to move around the tree. Here, however, I just need this to pass as an argument to another method in a few lines. Finally, I pass the context node to the selectNodes() method to get a list of all the nodes in that document that satisfy the location path. In this case, the context node is the document itself because the location path is an absolute path. However, in other programs you might well pass an element or some other kind of node instead.
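The prefixes bound in this step only have to map to the same namespace URIs the document uses; they do not have to match the prefixes that appear in the document itself. The following is a minimal sketch of that rule. It deliberately uses the JDK’s javax.xml.xpath package rather than Jaxen so that it runs without extra libraries, with setNamespaceContext() standing in for Jaxen’s addNamespace(). The class name, the countMatches() helper, and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrefixBindingSketch {

  // Hypothetical response. Note that it uses the prefix "env" and a
  // default namespace, not the "SOAP" and "f" prefixes of the XPath.
  static final String XML =
      "<env:Envelope xmlns:env='http://schemas.xmlsoap.org/soap/envelope/'>"
    + "<env:Body>"
    + "<Fibonacci_Numbers xmlns='http://namespaces.cafeconleche.org/xmljava/ch3/'>"
    + "<fibonacci index='1'>1</fibonacci>"
    + "<fibonacci index='2'>1</fibonacci>"
    + "<fibonacci index='3'>2</fibonacci>"
    + "</Fibonacci_Numbers></env:Body></env:Envelope>";

  // Counts the nodes the location path selects under the given
  // prefix-to-URI bindings.
  static int countMatches(final Map<String, String> bindings) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    Document doc = factory.newDocumentBuilder()
        .parse(new InputSource(new StringReader(XML)));

    XPath xpath = XPathFactory.newInstance().newXPath();
    xpath.setNamespaceContext(new NamespaceContext() {
      public String getNamespaceURI(String prefix) {
        String uri = bindings.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
      }
      // Reverse lookups are not needed for this sketch.
      public String getPrefix(String uri) { return null; }
      public Iterator<String> getPrefixes(String uri) { return null; }
    });

    NodeList nodes = (NodeList) xpath.evaluate(
        "/SOAP:Envelope/SOAP:Body/f:Fibonacci_Numbers/f:fibonacci",
        doc, XPathConstants.NODESET);
    return nodes.getLength();
  }
}
```

When "SOAP" and "f" are bound to the two URIs used in the document, countMatches() returns 3. When "f" is bound to any other URI, it returns 0, even though the element names still look the same.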

Jaxen’s selectNodes() method returns a standard java.util.List, which can be iterated through in the usual way. Because this XPath expression operated on a DOM document and returned a node-set, every item in the list is a DOM Node object of some kind. This location path selects only elements, so here they are all DOM Element objects. If the Jaxen XPath were operating on a JDOM document, the list would contain JDOM objects; if it were operating on a dom4j document, the list would contain dom4j objects, and so on. In most cases you’ll want to cast each item in the list to a more specific type before continuing.

As the program iterates through the list, it deals with each node independently. I could use the DOM methods discussed in Chapters 9 through 13 to work with these nodes. However, what I really want is the XPath string-value of each node. This is provided by the XPath string() function, which Jaxen represents as the org.jaxen.function.StringFunction class. The evaluate() method in this class applies the XPath string function to a specified object (here a DOM Node) in the context of a particular Jaxen Navigator. It returns the XPath string-value of the object.

Note

This is a little more convoluted than it perhaps needs to be because there’s no XPath 1.0 way to return a list of strings. For example, string(node-set) returns the string-value of the first node in the set rather than a list of the string-values of each node in the set. That’s why I have to move from XPath to DOM (where I can work with lists and sets) and back to XPath again rather than working with a single XPath expression that returns the final result. XPath is not Turing complete. Some of the logic is going to have to be implemented in Java.
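The behavior described in this note can be seen in a small, self-contained sketch. It uses the JDK’s javax.xml.xpath package instead of Jaxen so that it runs without extra libraries; the XPath 1.0 rule it demonstrates is the same for any conforming engine. The class name, method names, and sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstNodeOnlySketch {

  // Hypothetical sample data
  static final String XML = "<numbers><n>3</n><n>5</n><n>8</n></numbers>";

  static Document parse() throws Exception {
    return DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(XML)));
  }

  // string(node-set) yields only the string-value of the first node
  // in document order.
  static String stringOfNodeSet() throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    return xpath.evaluate("string(/numbers/n)", parse());
  }

  // To get every value, select the node-set with XPath and then
  // loop over it in Java.
  static List<String> stringValues() throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodes = (NodeList) xpath.evaluate(
        "/numbers/n", parse(), XPathConstants.NODESET);
    List<String> values = new ArrayList<String>();
    for (int i = 0; i < nodes.getLength(); i++) {
      // For an element, getTextContent() matches the XPath string-value
      values.add(nodes.item(i).getTextContent());
    }
    return values;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(stringOfNodeSet());  // first value only
    System.out.println(stringValues());     // all three values
  }
}
```

Here stringOfNodeSet() returns just "3", while stringValues() returns all three values. The loop plays the same role as the while loop in the Jaxen example.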

In fact, Jaxen’s org.jaxen.function package provides Java representations of most of the functions in XPath 1.0: BooleanFunction, CeilingFunction, ConcatFunction, etc. Each of these classes has a static evaluate() method that invokes the function and returns the result. The argument lists and return types of this method change from function to function as appropriate for each function. In a few cases where the XPath function has a variable length argument list, the Jaxen function class uses overloaded evaluate() methods instead. These classes and their corresponding evaluate() methods are:

BooleanFunction
    public static Boolean evaluate(Object o, Navigator navigator);
CeilingFunction
    public static Double evaluate(Object o, Navigator navigator);
ConcatFunction
    public static String evaluate(List list, Navigator navigator);
ContainsFunction
    public static Boolean evaluate(Object string, Object match, Navigator navigator);
CountFunction
    public static Number evaluate(Object nodeSet);
FalseFunction
    public static Boolean evaluate();
FloorFunction
    public static Double evaluate(Object o, Navigator navigator);
IdFunction
    public static List evaluate(List contextNodes, Object arg, Navigator navigator);
LastFunction
    public static Double evaluate(Context context);
LocalNameFunction
    public static String evaluate(List nodeSet, Navigator navigator);
NameFunction
    public static String evaluate(List nodeSet, Navigator navigator);
NamespaceUriFunction
    public static String evaluate(List nodeSet, Navigator navigator);
NormalizeSpaceFunction
    public static String evaluate(Object string, Navigator navigator);
NotFunction
    public static Boolean evaluate(Object object, Navigator navigator);
NumberFunction
    public static Double evaluate(Object object, Navigator navigator);
PositionFunction
    public static Double evaluate(Context context);
RoundFunction
    public static Double evaluate(Object object, Navigator navigator);
StartsWithFunction
    public static Boolean evaluate(Object string, Object match, Navigator navigator);
StringFunction
    public static String evaluate(Object object, Navigator navigator);
StringLengthFunction
    public static Number evaluate(Object object, Navigator navigator);
SubstringAfterFunction
    public static String evaluate(Object string, Object match, Navigator navigator);
SubstringBeforeFunction
    public static String evaluate(Object string, Object match, Navigator navigator);
SubstringFunction
    public static String evaluate(Object string, Object start, Navigator navigator);
    public static String evaluate(Object string, Object start, Object length, Navigator navigator);
SumFunction
    public static Double evaluate(Object nodeSet, Navigator navigator);
TranslateFunction
    public static String evaluate(Object original, Object from, Object to, Navigator navigator);
TrueFunction
    public static Boolean evaluate();
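
The return types in this list follow the XPath 1.0 result types: string functions produce a String, numeric functions a number, and tests a Boolean. As a rough cross-check that needs no Jaxen jar, the sketch below evaluates a few of the same XPath functions through the JDK’s javax.xml.xpath package and returns the Java type each one produces. It does not call Jaxen’s function classes; the class name and the eval() helper are hypothetical.

```java
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class FunctionResultTypesSketch {

  // Evaluates an XPath expression against an empty document and
  // converts the result to the requested XPath type.
  static Object eval(String expression, QName returnType) throws Exception {
    Document empty = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    return XPathFactory.newInstance().newXPath()
        .evaluate(expression, empty, returnType);
  }

  public static void main(String[] args) throws Exception {
    // translate() maps characters one for one and yields a string
    System.out.println(
        eval("translate('bar','abc','ABC')", XPathConstants.STRING));
    // substring() counts characters from 1, not from 0
    System.out.println(
        eval("substring('12345', 2, 3)", XPathConstants.STRING));
    // string-length() yields a number, returned as a Double
    System.out.println(
        eval("string-length('Jaxen')", XPathConstants.NUMBER));
    // starts-with() yields a Boolean
    System.out.println(
        eval("starts-with('Jaxen','Jax')", XPathConstants.BOOLEAN));
  }
}
```

The translate() call returns the string "BAr" and the substring() call returns "234", which is a reminder that XPath string positions start at 1 whereas Java’s start at 0.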

Copyright 2001, 2002 Elliotte Rusty Harold (elharo@metalab.unc.edu). Last Modified June 04, 2002