XML News from Thursday, October 5, 2006

Stefano Mazzocchi has released Gadget, an open source (BSD license) XML inspector based on XPath that analyzes XML documents too large to fit into RAM. According to Mazzocchi, "I was given the task of transforming a few hundred Mb of XML into RDF and I found out (the hard way!) that with that amount of data things start to break down: you need radically different approaches since you can't simply open your 100Mb XML document in your browser to take a look at it. Before writing Gadget I used a collection of 12-stages-long grep+sed+sort+uniq pipelines to understand what I had in that big XML pile, but that started to become a little cumbersome so I wrote this." Gadget is written in Java.
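
To give a feel for inspecting XML that does not fit in RAM, here is a minimal sketch that streams a document with the standard StAX API and counts how often each element path occurs. It is not Gadget's code, and nothing here is taken from Gadget's implementation. The class name `PathCounter` and the method `countPaths` are hypothetical names chosen for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration, not part of Gadget.
public class PathCounter {

    // Counts occurrences of each element path (for example /catalog/book).
    // The document is read as a stream of events, so memory use grows with
    // nesting depth and the number of distinct paths, not with document size.
    public static Map<String, Integer> countPaths(Reader in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        Deque<String> stack = new ArrayDeque<>();        // paths of the currently open elements
        Map<String, Integer> counts = new TreeMap<>();   // sorted by path for readable output
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String parent = stack.isEmpty() ? "" : stack.peek();
                    String path = parent + "/" + reader.getLocalName();
                    stack.push(path);
                    counts.merge(path, 1, Integer::sum);
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    stack.pop();
                }
            }
        } finally {
            reader.close();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // A tiny in-memory sample; a real run would pass a FileReader or similar.
        String xml = "<catalog><book><title>A</title></book>"
                   + "<book><title>B</title></book></catalog>";
        countPaths(new StringReader(xml))
            .forEach((path, n) -> System.out.println(n + "\t" + path));
    }
}
```

A summary like this plays roughly the same role as the grep+sed+sort+uniq pipelines Mazzocchi mentions: it shows which structures appear in the pile, and how often, without ever building the whole tree in memory.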


The W3C HTML Working Group has published XHTML-Print as a final W3C Recommendation. According to the abstract, "XHTML-Print is member of the family of XHTML languages defined by the Modularization of XHTML [XHTMLMOD]. It is designed to be appropriate for printing from mobile devices to low-cost printers that might not have a full-page buffer and that generally print from top-to-bottom and left-to-right with the paper in a portrait orientation. XHTML-Print is also targeted at printing in environments where it is not feasible or desirable to install a printer-specific driver and where some variability in the formatting of the output is acceptable." In essence, XHTML-Print is a subset of XHTML restricted to the features appropriate for printing. For instance, frames are not supported because "Frames depend on a screen interface and therefore are not applicable to printers."