Summary

SAX filters sit between a client application and a parser (an XMLReader) and intercept calls from the client application to the parser. A SAX filter is an instance of a class that implements the org.xml.sax.XMLFilter interface, a subinterface of XMLReader. Thus a SAX filter is also a parser, albeit one that receives its data from another XML parser rather than by directly reading an XML document.
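
As a minimal sketch of that arrangement, the following code places a do-nothing filter between a JAXP-supplied parser and a client handler. The class name FilterChainDemo and the method elementNames are invented for illustration; the client talks only to the filter, exactly as it would to any other XMLReader.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; the name is not from the text.
class FilterChainDemo {

    // Parses the XML text through a pass-through filter and returns the
    // local names of the elements the client handler saw, in document order.
    static String elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is always reported
        XMLReader parser = factory.newSAXParser().getXMLReader();

        // The filter is itself an XMLReader; its parent is the real parser.
        XMLFilter filter = new XMLFilterImpl();
        filter.setParent(parser);

        StringBuilder names = new StringBuilder();
        // The client registers its handler with the filter, not the parser.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.append(localName).append(' ');
            }
        });

        // The client calls parse() on the filter, which forwards to its parent.
        filter.parse(new InputSource(new StringReader(xml)));
        return names.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c/></a>"));
    }
}
```

Because the filter changes nothing here, the client sees the same events it would have received from the parser directly.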

The easiest way to write a SAX filter is to subclass org.xml.sax.helpers.XMLFilterImpl, which implements several interfaces including XMLFilter, ContentHandler, DTDHandler, and ErrorHandler. XMLFilterImpl intercepts calls in both directions, from the client to the parser and from the parser to the client. By default it passes all calls along unchanged. By overriding the methods it inherits from the various handler interfaces, you can change the data a client application receives from the parser. Each overridden method gives the filter the opportunity to log, modify, block, supplement, or replace the corresponding call.
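
One way to picture the "block" case is the sketch below: a subclass of XMLFilterImpl that drops every element with a given local name, together with everything inside it, and passes all other events through by calling the superclass methods. The names ElementBlocker, ElementBlockerDemo, and filteredMarkup are hypothetical, and the client handler only reassembles a simplified rendering of the events it receives (no attributes, no escaping).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical driver: runs a document through ElementBlocker and returns
// a simplified rendering of the events the client received.
class ElementBlockerDemo {

    static String filteredMarkup(String xml, String blockedName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();
        XMLReader filter = new ElementBlocker(parser, blockedName);

        StringBuilder out = new StringBuilder();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                out.append('<').append(localName).append('>');
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(localName).append('>');
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filteredMarkup(
            "<doc><keep>yes</keep><secret>no<keep>hidden</keep></secret></doc>",
            "secret"));
    }
}

// Hypothetical filter: blocks one element type and all of its content.
class ElementBlocker extends XMLFilterImpl {
    private final String blockedName;
    private int skipDepth = 0; // greater than zero while inside a blocked element

    ElementBlocker(XMLReader parent, String blockedName) {
        super(parent);
        this.blockedName = blockedName;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (skipDepth > 0 || localName.equals(blockedName)) {
            skipDepth++;   // swallow this start-tag
            return;
        }
        super.startElement(uri, localName, qName, atts); // pass along unchanged
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (skipDepth > 0) {
            skipDepth--;   // swallow the matching end-tag
            return;
        }
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }
}
```

The filter swallows start-tags and end-tags in matched pairs, so the events that remain still nest properly as long as the blocked element is not the root.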

Document well-formedness is normally verified against the original text document by the parser. Most client applications assume the data they receive through SAX is well-formed. For instance, they assume there is only a single root element. However, it’s possible for a filter to violate these assumptions. For instance, the root element start-tag and end-tag could be filtered out while leaving the contents intact. Characters that are illegal in XML, such as null and vertical tab, could be passed to the characters() method. Unless you know exactly how your handlers will respond to such malformed event streams, make sure that your filters maintain well-formedness and, if necessary, validity.
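
To make the hazard concrete, here is a deliberately bad filter, sketched under the assumption that the client simply counts how many elements it sees at the top level. RootStripper, RootStripperDemo, and topLevelElements are invented names. The filter removes the root element's start-tag and end-tag but leaves its children in place, so a document that the parser accepted as well-formed reaches the client with more than one root.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical driver: counts the top-level elements the client sees,
// with or without the misbehaving filter in the chain.
class RootStripperDemo {

    // Client handler that counts elements appearing at depth zero.
    static class TopLevelCounter extends DefaultHandler {
        int depth = 0;
        int topLevel = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if (depth++ == 0) {
                topLevel++;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            depth--;
        }
    }

    static int topLevelElements(String xml, boolean stripRoot) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        if (stripRoot) {
            reader = new RootStripper(reader);
        }
        TopLevelCounter counter = new TopLevelCounter();
        reader.setContentHandler(counter);
        reader.parse(new InputSource(new StringReader(xml)));
        return counter.topLevel;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><a/><b/></root>";
        System.out.println(topLevelElements(xml, false)); // 1
        System.out.println(topLevelElements(xml, true));  // 2
    }
}

// Hypothetical filter that breaks well-formedness: it drops the root
// element's tags and forwards everything nested inside them.
class RootStripper extends XMLFilterImpl {
    private int depth = 0;

    RootStripper(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (depth++ > 0) { // forward everything except the outermost start-tag
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (--depth > 0) { // forward everything except the outermost end-tag
            super.endElement(uri, localName, qName);
        }
    }
}
```

A client that builds a tree or writes the events back out as text would produce a broken result from this stream, which is why a filter like this is only safe in front of handlers known to tolerate it.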


Copyright 2001, 2002 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified November 30, 2001