The XMLFilterImpl Class

SAX includes an adapter class you can subclass to build these sorts of two-way filters, org.xml.sax.helpers.XMLFilterImpl. Its general design is similar to what I’ve shown above. However, it implements all the relevant interfaces in one class:

public class XMLFilterImpl implements XMLFilter, 
 EntityResolver, DTDHandler, ContentHandler, ErrorHandler

When the various setter methods like setContentHandler() and setErrorHandler() in this class are invoked, the handler is stored in a private field. For example, here’s the setContentHandler() method:

    public void setContentHandler (ContentHandler handler)
    {
        contentHandler = handler;
    }

When the parse() method is called, it first invokes the private setupParse() method, which installs the XMLFilterImpl object itself as each of the parent parser’s handlers:

    private void setupParse ()
    {
        if (parent == null) {
          throw new NullPointerException("No parent for filter");
        }
        parent.setEntityResolver(this);
        parent.setDTDHandler(this);
        parent.setContentHandler(this);
        parent.setErrorHandler(this);
    }
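
The effect of setupParse() can be observed from the outside. The following is only a sketch, not part of SAX itself: the class name SetupParseDemo and its method are made up for illustration, and it assumes the parent parser is obtained through JAXP and returns from getContentHandler() whatever handler was last installed on it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; the name is an assumption for this sketch
class SetupParseDemo {

  // Returns true if, after parsing, the parent reports the filter as
  // its ContentHandler while the filter still holds the client's handler
  static boolean filterInstallsItself() throws Exception {
    XMLReader parent =
     SAXParserFactory.newInstance().newSAXParser().getXMLReader();

    XMLFilterImpl filter = new XMLFilterImpl();
    filter.setParent(parent);

    // The client's handler is stored in the filter, not in the parent
    DefaultHandler client = new DefaultHandler();
    filter.setContentHandler(client);

    // parse() runs setupParse() before delegating to the parent
    filter.parse(new InputSource(new StringReader("<root/>")));

    return parent.getContentHandler() == filter
     && filter.getContentHandler() == client;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(filterInstallsItself());
  }

}
```

The client only ever talks to the filter. The parent parser, in turn, only ever talks to the filter, which is what lets the filter sit in the middle of the event stream.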

When the parent parser calls back to the ContentHandler methods, the XMLFilterImpl forwards each call to the original ContentHandler object stored in the contentHandler field, provided one has been set. For example, here’s the startElement() method:

    public void startElement (String uri, String localName, String qName,
      Attributes atts)
      throws SAXException
    {
      if (contentHandler != null) {
        contentHandler.startElement(uri, localName, qName, atts);
      }
    }

The other callback methods are similar. Thus by default, XMLFilterImpl doesn’t filter anything, much like the earlier TransparentFilter example. However, you can subclass this class and override those methods where you want to change the data passed back. You pass your changed data along by invoking the usual callback methods in this class. Since you have normally overridden the relevant method in your subclass, you generally need to use super to reach the XMLFilterImpl version that forwards the call to the client’s handler.
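
As a minimal sketch of this override-and-forward pattern, the following filter changes character data rather than attributes. The names UpperCaseTextDemo, UpperCaseTextFilter, and filteredText() are hypothetical, and the parent parser is assumed to come from JAXP. The filter overrides characters(), upper-cases the text, and then hands the changed text to super.characters(), which forwards it to the client’s ContentHandler.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; all names here are assumptions
class UpperCaseTextDemo {

  // Overrides one callback and leaves the rest to XMLFilterImpl
  static class UpperCaseTextFilter extends XMLFilterImpl {

    @Override
    public void characters(char[] ch, int start, int length)
     throws SAXException {
      char[] upper = new String(ch, start, length)
       .toUpperCase(Locale.ROOT).toCharArray();
      // super.characters() forwards to the client's ContentHandler
      super.characters(upper, 0, upper.length);
    }

  }

  // Parses xml through the filter and returns the text the client saw
  static String filteredText(String xml) throws Exception {
    XMLReader parent =
     SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    UpperCaseTextFilter filter = new UpperCaseTextFilter();
    filter.setParent(parent);

    final StringBuilder text = new StringBuilder();
    filter.setContentHandler(new DefaultHandler() {
      @Override
      public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
      }
    });

    filter.parse(new InputSource(new StringReader(xml)));
    return text.toString();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(filteredText("<greeting>hello</greeting>"));
  }

}
```

Calling this.characters() instead of super.characters() inside the override would recurse forever, which is why super is needed here.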

For example, the startElement() method in Example 8.12 adds an id attribute to every element that doesn’t already have one, and then passes that modified element on to the underlying content handler to do whatever it needs to do.

Example 8.12. A subclass of XMLFilterImpl

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.util.*;


public class IDFilter extends XMLFilterImpl {

  public void startElement(String namespaceURI, String localName,
   String qualifiedName, Attributes atts) throws SAXException {    

    boolean hasID = false;
    for (int i = 0; i < atts.getLength(); i++) {
      if (atts.getQName(i).equalsIgnoreCase("id") ||
       atts.getType(i).equals("ID")) {
         hasID = true;
         ids.add(atts.getValue(i));
         break; 
      }
    } 

    if (!hasID) {
      AttributesImpl newAttributes = new AttributesImpl(atts);
      String idValue = makeID();
      newAttributes.addAttribute("", "id", "id", "ID", idValue);
      atts = newAttributes;
    }
    super.startElement(namespaceURI, localName, qualifiedName, 
     atts);

  }
  
  // need to track which IDs we've already used, including IDs
  // that were included in the document
  int id = 1;
  private Set ids; // requires Java 1.2
  
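  public void startDocument() throws SAXException {
    // reinitialize id list for each document
    ids = new HashSet();
    id = 1;
    // forward the event so the client's handler still sees it
    super.startDocument();
  }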
  
  // Generate an ID that hasn't been used yet
  private String makeID() {
    
    while (ids.contains("_" + id)) id++;
    ids.add("_" + id);
    return "_" + id;
    
  }
  
}

You’ll notice that this code is much shorter and simpler than the programs that implemented XMLFilter directly. XMLFilterImpl contains a lot of code you can reuse without much thought. When subclassing it, you only need to override the methods that do the actual filtering; the remaining methods can be left to the superclass. In fact, XMLFilterImpl is so much easier to use than the bare XMLFilter interface that almost all real-world filters are based on it. A few books even ignore the existence of the XMLFilter interface completely. I covered the interface here mostly because I spent a lot of time confused by it before I realized how much more XMLFilterImpl does. It is not just an implementation of the XMLFilter interface.

Since XMLFilterImpl is still an XMLReader, the client application uses it like any other XMLReader: it sets handlers, features, and properties, and then parses documents. The only difference is that the client application must first pass an actual parser object to the setParent() method, because the filter does no parsing itself; it delegates that work to its parent.
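
Here is a rough sketch of the client side, under a few assumptions: the class FilterClientDemo and its methods are invented for illustration, the parent parser comes from JAXP, and a plain XMLFilterImpl stands in for a real filter such as IDFilter. It also shows what happens if the client forgets to set a parent: parse() fails with the NullPointerException thrown by setupParse().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; all names here are assumptions
class FilterClientDemo {

  // Creates a filter and gives it a real parser as its parent
  static XMLFilterImpl filterWithParent() throws Exception {
    XMLReader parser =
     SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    XMLFilterImpl filter = new XMLFilterImpl();
    filter.setParent(parser);
    return filter;
  }

  // Uses the filter exactly like any other XMLReader and
  // counts the start-tags the client's handler receives
  static int countElements(XMLFilterImpl filter, String xml)
   throws Exception {
    final int[] count = {0};
    filter.setContentHandler(new DefaultHandler() {
      @Override
      public void startElement(String uri, String localName,
       String qName, Attributes atts) {
        count[0]++;
      }
    });
    filter.parse(new InputSource(new StringReader(xml)));
    return count[0];
  }

  // Returns true if parsing through a parentless filter fails
  static boolean failsWithoutParent(String xml) throws Exception {
    try {
      countElements(new XMLFilterImpl(), xml);
      return false;
    }
    catch (NullPointerException expected) {
      return true;
    }
  }

  public static void main(String[] args) throws Exception {
    String xml = "<a><b/><b/></a>";
    System.out.println(countElements(filterWithParent(), xml));
    System.out.println(failsWithoutParent(xml));
  }

}
```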

Here’s the beginning of the output from running FilterTester with IDFilter on the RDDL specification, after the usual adjustments for line length. The initial doc processing instruction is an artifact of the XMLWriter class.

% java -Dorg.xml.sax.driver=gnu.xml.aelfred2.XmlReader
  FilterTester http://www.rddl.org/ IDFilter
<?doc type="doctype" role="title" {Resource Directory Description Language 1.0 } ?>
<html xml:lang="en" xml:base="http://www.rddl.org/" 
 version="-//XML-DEV//DTD XHTML RDDL 1.0//EN" id="_1" 
 xmlns="http://www.w3.org/1999/xhtml">
<head profile="" id="_2">
      <title id="_3">
      XML Resource Directory Description Language (RDDL)</title>
<link href="xrd.css" type="text/css" rel="stylesheet" 
 id="_4"></link>
</head>
<body id="_5">
<h1 id="_6">Resource Directory Description Language (RDDL)</h1>
<div class="head" id="_7">
<p id="_8">This Version: 
<a href="http://www.openhealth.org/RDDL/20010305" id="_9"> 
March 5, 2001</a></p>
…

Copyright 2001, 2002 Elliotte Rusty Harold (elharo@metalab.unc.edu). Last Modified December 02, 2001