2001 XML News

Monday, December 31, 2001

Sun's posted the Java XML Pack Winter 01 Release. This bundles together the:

  • Java API for XML Messaging (JAXM) v1.0.1 EA1
  • Java API for XML Processing (JAXP) v1.2 EA1
  • Java API for XML Registries (JAXR) v1.0 EA
  • Java API for XML-based RPC (JAX-RPC) v1.0 EA1

Sunday, December 30, 2001

I've tried the latest Mozilla 0.9.7 on both Windows and the Mac. The Windows build fixes the one bug I had noticed in 0.9.6 (a minor problem involving search pop-up menus in the location bar). It also lets me configure cookie preferences to exclude them from sites with bad privacy policies, though perhaps that was in earlier versions and I just noticed it now. Windows Mozilla definitely feels ready for prime time.

I wish I could say the same about the Mac version though. It's noticeably slower than IE5.0 for the Mac, and it crashed my systems within minutes of launch. Furthermore, it still has some annoying problems with AppleScript that prevent me from using it as my default browser. In particular, it cannot tell which window is in front. get the front window returns the first window opened, not the window that is currently in front.

Saturday, December 29, 2001

The Mozilla Project has posted version 0.9.7 of the Mozilla web browser for the usual batch of platforms. New features include basic S/MIME support, favicon support, and a Document Inspector that provides live editing of the DOM of any web document or XUL application.


Antenna House has released version 2.0 of their XSL Formatter, a Windows GUI program for viewing and printing XSL formatting objects documents. The major new features are compatibility with the final XSL 1.0 recommendation and the ability to generate PDF documents from XSL-FO documents.

XSL Formatter costs roughly $2000 for your first copy, significantly less for each additional copy. I have to say that this strikes me as vastly overpriced. For $80, the cost of an additional license, I would have bought a copy to play around with. For $400, I would have downloaded the evaluation version, checked it out, and if it seemed to work on my documents I would have bought a copy. But for $2000, I'll continue to use FOP and PassiveTeX.

Friday, December 28, 2001

I've returned to New York and my backups. Cafe au Lait and Cafe con Leche should be fully functional again.

Thursday, December 27, 2001

Norm Walsh has published version 2.0.3 of DocBook: The Definitive Guide. DocBook is an XML application for technical documentation used in the Linux Documentation Project, my own Processing XML with Java, and of course the DocBook book itself. This is a minor update.

Tuesday, December 25, 2001

Merry Christmas, Bon Noel, Feliz Navidad, Happy Holidays, a Festive Kwanzaa, Joyeuse Fetes, Season's Greetings, Happy Hanukkah, a joyous Winter Solstice and all that. I have a small present for everyone today. Chapter 9 of Processing XML with Java has been posted. This chapter begins the coverage of the Document Object Model (DOM) by discussing its underlying data model and the Node and NodeList interfaces. I hope you like it. :-)

Monday, December 24, 2001

As usual, many W3C working groups have pushed out a lot of new and revised working drafts in the week before Christmas. I don't have time to analyze them all now, but if you're interested the latest are:

I may say more about some of these when I return to New York.

Saturday, December 22, 2001

Yes, I know a runaway cron job replaced the December news with news from the same date last December. Possibly a crontab file I'd deleted got accidentally restored in a backup. Much more likely, I stupidly forgot to delete the old crontab file in the first place. Mea culpa. Mea maxima culpa.

I do have a copy of the old December news that got overwritten. Unfortunately, the old news is sitting in my Brooklyn apartment two thousand miles away from my parents' house in New Orleans where I'm spending the holidays, and for once I did not make complete copies of my web sites on my laptop. I tried, but Aladdin's StuffIt Deluxe crapped out when it was trying to make the necessary zip files, so I didn't get that done. I probably (no, make that definitely) should have figured out what the problem was and brought a backup with me; but I stupidly figured I could just download it off the web and didn't want to waste the extra time.

I do have a recent copy of the site on DLT tape with me in case my apartment building goes up in flames while I'm away, but unless I can find someone friendly in New Orleans with Retrospect and a DLT drive, that isn't any immediate help. There are mirror sites, but they haven't been updating since IBiblio switched to Linux and broke the mirror logins months ago. :-(

The short version is that news from the last week or two is temporarily lost in the ether. It will come back shortly before New Year's. I will still be posting new news here during the meantime. Thanks to everyone who wrote in to warn me of the problem.

Friday, December 21, 2001

The W3C XSLT Working Group has posted the first public working drafts of XSLT 2.0 and XPath 2.0. The big new feature in these releases is schema-type awareness, but there's lots more here too including improved grouping, an XHTML output method, integer and single precision arithmetic, number formatting that does not depend on Java, and way too much to talk about right before I fly to New Orleans tomorrow for the holidays. <BlatantPlug>If you want to know more about this come hear me talk about it on Monday, March 11, at XMLOne London.</BlatantPlug> by which point I should have had enough time to digest the spec and decide what I think of it.

Michael Kay, the editor of the XSLT 2.0 spec, has posted the first release of Saxon 7.0, an experimental and incomplete implementation of these XSLT 2.0 and XPath 2.0 working drafts. Production use should continue to rely on XSLT 1.0/XPath 1.0/Saxon 6.5.

Thursday, December 20, 2001

Microsoft's released Internet Explorer 5.1 for MacOS 8 and 9. This release makes numerous bug fixes and improvements in CSS support. Particularly notable is the ability to apply CSS style sheets to XML documents (a feature the Windows version of IE has had for several years.) Other new features include the ability to drag and drop arbitrary images and hyperlinks to the button bar.

Wednesday, December 19, 2001

The W3C has published four new and revised working drafts about SOAP 1.2 including for the first time a SOAP primer:

The SOAP 1.2 namespace is now http://www.w3.org/2001/12/ and will almost certainly change again before the final release. The MIME type has changed from text/xml to application/soap+xml to finally application/soap (which I really don't understand the logic behind. It is still XML after all.) The SOAP schemas are now compliant with the final release of the W3C XML Schema Language. Finally, the spec now denies that SOAP is an acronym for "Simple Object Access Protocol" or anything else.


According to several people TypeOf is a reasonable equivalent to instanceof in Visual Basic. Thanks to Philip Nelson, Robert A. Casola, and Rob Smith for helping out with this.

The revised question is what languages don't have a reasonable equivalent to instanceof? So far the only ones I've found are JavaScript 1.3 and earlier and some older C++ compilers pre-RTTI. Are there any other languages out there to which the DOM IDL could theoretically be compiled, but which do not have some variant of instanceof?

Tuesday, December 18, 2001

Brendon McKenna showed me where to look for an equivalent of instanceof in Perl. Short answer: it doesn't have one. Long answer, the ref() function lets you check a variable's immediate class through string comparison, but not the variable's various superclasses. Update: The isa() function will correctly compare against superclasses even if ref() won't.
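
The difference between checking an object's immediate class and checking against its superclasses can be sketched in Python (not Perl or Java). The Node and Element classes and both helper functions below are hypothetical names invented for the illustration; they are not part of any DOM binding.

```python
class Node:
    """Hypothetical base class standing in for a DOM-style Node."""


class Element(Node):
    """Hypothetical subclass standing in for a DOM-style Element."""


def immediate_class_name(obj):
    # Roughly analogous to comparing the string returned by Perl's ref():
    # only the object's exact class is visible, not its ancestors.
    return type(obj).__name__


def is_instance_of(obj, cls):
    # Roughly analogous to Java's instanceof or Perl's isa():
    # superclasses count as a match.
    return isinstance(obj, cls)


element = Element()
# The string comparison misses the superclass relationship...
print(immediate_class_name(element) == "Node")
# ...while the instanceof-style test sees it.
print(is_instance_of(element, Node))
```

A string comparison against the immediate class name is the weaker test; code that needs to treat every subclass of a type uniformly wants the second form.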

Now I need to ask the same question for Visual Basic. Does it have any reasonable equivalent of Java's instanceof operator or any means of determining at runtime the type of an unknown object? Please send any answers to elharo@ibiblio.org. Thanks!



The XML Apache Project has posted the fourth beta of Xerces-J 2.0. They claim this release is "near production quality." Beta 4 fixes a number of bugs, introduces more changes to the Xerces Native Interface, provides a partial experimental DOM Level 3 implementation, and includes full XML Schema support including Post Schema Validation Infoset information.

Monday, December 17, 2001

For the next chapter of Processing XML with Java, I'm trying to find out if Perl has any reasonable equivalent of Java's instanceof operator or any means of determining at runtime the type of an unknown object. Please send any answers to elharo@ibiblio.org. Thanks!


X-Hive Corporation has released version 2.0 of its X-Hive/DB native XML database for Windows and Unix. New features in version 2.0 include version management, BLOB support, and full text searching. X-Hive/DB 2.0 sells for 495 Euros (or ten times that if you want support).

Sunday, December 16, 2001

A number of XML specifications define application-specific document object models that extend the standard DOM. These include WML, SVG, HTML, SMIL, and MathML. For the next chapter of Processing XML with Java, I'm looking for parsers that can build these application-specific DOMs. So far I've found that the Docuverse DOM SDK can build an HTML DOM and with some effort Xerces can be made to build HTML and WML DOMs. Does anybody know of any others? I'm particularly interested in a parser that could build a MathML or SVG DOM. Please send any suggestions to elharo@ibiblio.org. Thanks!

Saturday, December 15, 2001

Simon St. Laurent's published the first alpha version of Markup Object Events (MOE), a Mozilla-licensed Java class library that supports markup processing using both events and object trees. According to St. Laurent, "MOE programs can work purely with events, purely with trees, or with combinations of both, effectively providing a "middle way" between SAX and DOM." MOE permits all nodes to have:

  • A three-part namespace-aware name (prefix, local name, URI - QName available)
  • Unordered content (a set) - think attributes
  • Ordered content (a list) - think child elements, text, etc.
  • Annotations (a map) - any other information you need, largely unconstrained

My initial impression is that this goes a long way toward reinventing DOM, in a slightly more concrete and Java-centric fashion, with perhaps the one major change that nodes do not have to belong to a document, but can live independently. However, MOE's CoreComponentI really feels to me exactly like DOM's Node interface, even to the point of using integer constants to represent node types.
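
For readers who have not seen the DOM convention of integer node-type constants, here is a minimal sketch using Python's standard xml.dom.minidom rather than MOE or a Java DOM. The describe_children function is a hypothetical helper written for this illustration; it says nothing about how MOE's CoreComponentI is actually laid out.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def describe_children(xml_text):
    """Classify each child of the root element by its integer nodeType."""
    doc = parseString(xml_text)
    kinds = []
    for child in doc.documentElement.childNodes:
        # nodeType is a plain integer; the Node class only names the values.
        if child.nodeType == Node.ELEMENT_NODE:
            kinds.append(("element", child.tagName))
        elif child.nodeType == Node.TEXT_NODE:
            kinds.append(("text", child.data))
        elif child.nodeType == Node.COMMENT_NODE:
            kinds.append(("comment", child.data))
        else:
            kinds.append(("other", child.nodeType))
    return kinds


print(describe_children("<p>Hello <b>world</b><!--note--></p>"))
```

Dispatching on an integer like this is the alternative to asking the language itself what class a node belongs to.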

Friday, December 14, 2001

VMTools 0.4 is an open source Java library for comparing and representing the differences between XML documents. It's similar to diff and patch for ASCII text, but is aware of significant and insignificant XML structure.
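
To make "insignificant XML structure" concrete, the sketch below compares two documents after canonicalizing them with Python's standard library (Python 3.8 or later). This is not VMTools' algorithm, which I have not examined. The same_xml helper is hypothetical, and treating surrounding whitespace as insignificant is an assumption that will not suit every document type.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text):
    # Canonical XML sorts attributes and normalizes quoting and empty-element
    # syntax. strip_text=True additionally drops leading and trailing
    # whitespace in text content, which is an assumption about what counts
    # as insignificant.
    return ET.canonicalize(xml_text, strip_text=True)


def same_xml(left, right):
    """Return True if two documents differ only in insignificant ways."""
    return canonical(left) == canonical(right)


first = '<a x="1" y="2"><b/></a>'
second = "<a  y='2' x='1'>\n  <b></b>\n</a>"
print(same_xml(first, second))
```

A line-oriented diff would flag every line of these two documents as changed even though they carry the same information.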

Thursday, December 13, 2001

The W3C XML Core Working Group has just published a very poorly thought-out initial working draft of XML 1.1. This basically just describes the changes that will be necessary to satisfy the very controversial XML Blueberry requirements. There's one big change since the Blueberry requirements were published: C0 controls except for null (e.g. bell, vertical tab, formfeed, etc.) will now be allowed in XML documents. For the life of me I can't figure out what this is supposed to gain anybody.
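
As a point of reference for the C0 control change, this small sketch shows how an XML 1.0 parser treats those characters today. It uses the expat-based parser in Python's standard library; is_well_formed is a hypothetical helper, and the check says nothing about how an XML 1.1 parser would eventually behave.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the XML 1.0 parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Tab is one of the few C0 characters XML 1.0 allows.
print(is_well_formed("<a>tab\there</a>"))
# A literal form feed or bell is rejected.
print(is_well_formed("<a>\x0c</a>"))
print(is_well_formed("<a>\x07</a>"))
# In XML 1.0 even a character reference to form feed is an error.
print(is_well_formed("<a>&#12;</a>"))
```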

The restrictions on name characters will be loosened considerably in XML 1.1, so that all Unicode characters which are not specifically forbidden as name characters can now be used in element, attribute, entity and notation names. This will allow native script markup in Ge'ez, Amharic, Tigre, Burmese, Cambodian, and a few other languages, as well as filling a few very minor holes in Chinese and Japanese support. However, the proposal goes way beyond this. Many undefined code points are now allowed in names, as are some very weird characters like ©, ±, ⁷ (&#x2077;, superscript 7), the musical symbol for a six-string fretboard, and the zero-width space.

With perhaps one exception, the working group ignored or rejected every single proposal that was made to reduce the massive damage this will do to interoperability. To summarize:

  • The version number will now be 1.1 instead of 1.0. Even though most of the changes in this draft have little to no effect on the vast majority of users, I expect to be hearing from publishers and readers within 24 hours about when I'll be releasing The XML 1.1 Bible, XML 1.1 in a Nutshell, etc. These titles aren't actually necessary, but I'm sure myself and other authors will waste a few acres of trees on them nonetheless.

  • Many documents will be published with a 1.1 version even though they do not need to be. Existing systems that attempt to process these documents will fail. Consequently, many organizations will be faced with large and unnecessary expenditures to upgrade software in order to support features most users don't actually need.

  • IBM will now be allowed to use non-ASCII characters as white space. This will make their lazy programmers happy because they still don't have to conform to the standards the rest of world has been happily using for over two decades. However, the rest of us will now have to deal with IBM generated XML that cannot be edited in a simple text editor and cannot be processed with our existing tools.


The W3C HyperText Coordination Group has published a note about the requirements for a Component Extension (CX) API. To quote from the note:

From the early days of the World Wide Web, Web Agents had been extended to support more types of contents. The recent developments of XML and the possibility to mix multiple XML Namespaces in the document reiterated the need to extend implementations and relaying on add-on softwares to accomplish tasks not supported by default in the implementation. In other words, we have several XML languages to represent different parts of Web pages (XHTML, SVG, MathML, XForms, etc.), we now need a well defined mechanism that allow different specialized tools to work together and handled these compound documents.

This W3C Note contains a non-exhaustive list of requirements to work on a Component Extension API. The goal of this API is to extend the ability of a Web application. Note that the Web application can be either on the server side or on a client side, and does not automatically implies interaction with a user or having a Web browser.

Wednesday, December 12, 2001

Altova has released XML Spy 4.2 Suite, a $399 payware XML editor for Windows. Upgrades from 4.0 are free. XML Spy 4.2 Suite consists of three products: XML Spy 4.2 Document Editor, XML Spy 4.2 XSLT Designer, and XML Spy 4.2 Integrated Development Environment. Version 4.2 adds SOAP support through WSDL, Oracle XML Schema Extensions, MSXML4 Support, various XSLT Designer editing enhancements, extended Document Editor APIs, and various other improvements. Version 4.0 was unusable. I'm going to test this version and see if it's any better.

Update: 4.2 is a definite improvement on 4.0. It no longer hangs for several minutes when trying to open my test documents (Docbook documents for various chapters of Processing XML with Java) so it's no longer completely non-functional. Thus now I have a real chance to actually evaluate the product and its design.

After opening my test documents, it was far from obvious how to work with them. I could browse them in a tree view fairly easily, and it wasn't too hard to add a new element declared in the DTD or a new text node. However, I could not figure out how to do something as simple as delete a word from the middle of a sentence in an existing text node. There may be a way to do this, but it wasn't obvious at first glance. It certainly wasn't as simple as selecting the word I wanted to delete and pressing the delete key.

I also found the tree view to be a very unnatural way to edit the sorts of document I write. I much prefer writing in a basic text editor like UltraEdit or BBEdit, even if it means I have to type the tags myself. Christian Gross showed me a more word processor like view of the document at Software Development in Boston last August, but it's not the default view and I wasn't able to figure out how to turn it on. I suppose I should probably read the manual.

Update 2: I found the text view that lets me edit the document more naturally by typing words in a row interspersed with tags. This is moderately useful. I loved the code-completion pop-ups for tag names. Unfortunately, depending on how you're typing in the tags, this has an annoying tendency to delete the following word if there's no white space between the tag and the following word. :-( The more word processor like document view apparently requires me to create a custom configuration file in the XSLT designer. I'm not quite sure how to do that since the XSLT designer doesn't have the customary File/New menu item, and the XSLT designer can't seem to open any of my files (DTDs, XSLT stylesheets, or instance documents). I should probably read the manual again.

Update 3: OK the online help showed me what I needed. I didn't have to resort to the manual (I'm assuming there is a manual somewhere. I haven't actually looked for it yet. :-)) Apparently, the XSLT designer should work by opening a DTD or a W3C XML Schema Language schema; and indeed I was able to get it to work with simple examples from the XML Bible. However, it simply could not handle the much more complicated real-world DocBook DTD. Consequently, this means I can't use XMLSpy to edit my books just yet. :-(


Unicode 3.2 beta data files have been posted. Version 3.2 adds 1016 new characters, new properties, additional conformance clauses, and various textual clarifications. New scripts include Tagalog, Hanunoo, Buhid, Tagbanwa, a large collection of mathematical symbols, and small sets of other letters and symbols.

Tuesday, December 11, 2001

The W3C HTML Activity has posted a new working draft of XForms 1.0. This describes the "next generation of web forms". Today's HTML forms don't distinguish between the purpose and the presentation of a form. XForms, by contrast, have separate sections that describe what the form does and how the form looks, thus making XForms more suitable for a broad range of media from web browsers to cell phones. The XForms themselves and the data collected in an XForm and returned to a server are all written in XML. An XForms Submit Protocol defines how XForms send and receive data, including the ability to suspend and resume the completion of a form. The changes in this release are too numerous to list here, but are nicely summarized in an appendix.

Monday, December 10, 2001

The W3C/IETF URI working group has published a new working draft of Internationalized Resource Identifiers (IRI). An IRI uses Unicode to locate resources in a syntax very similar to that of URIs, and a hex-encoded mapping from IRIs to URIs is defined. However, an IRI is not a kind of URI and could not be directly used in XML documents in places where URIs are now required (e.g. system identifiers).
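
The general idea of a hex-encoded mapping can be sketched as follows: encode each non-ASCII character as UTF-8 and percent-escape the resulting bytes. This is an illustration only. The iri_to_uri function and the example.org address are made up, the draft's exact rules may differ, and non-ASCII host names are deliberately not handled.

```python
from urllib.parse import quote

# Characters that already have a meaning in a URI pass through untouched,
# including "%" so that existing escapes are not encoded a second time.
_URI_SAFE = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri):
    """Percent-escape the UTF-8 bytes of every character outside the URI set."""
    return quote(iri, safe=_URI_SAFE, encoding="utf-8")


print(iri_to_uri("http://example.org/r\u00e9sum\u00e9.xml"))
```

A string that is already a plain ASCII URI passes through this function unchanged, which is what lets the same code accept both forms.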


WH2FO is an open source Java application that processes an HTML document saved from Word 2000, and transforms it into an XML document and an XSLT stylesheet. The XSLT stylesheet can then transform the XML document into an XSL-Formatting Objects document. You can also apply a stylesheet that converts the XML back into HTML discarding all the extra markup added by Word.


PXDB is a simple Python application that takes XML files and places them into a SQL database. There are APIs for querying / altering / deleting / ordering / linking. PXDB is like a middle ground between SQL database and XML repository. One can view PXDB as a strange object database on top of SQL storage, with some elements of XML (CPath). This is a very rough alpha right now.

Sunday, December 9, 2001

Netscape's released version 6.2.1 of their namesake web browser that supports XML, CSS, and XSLT. This is the first release with full native support for MacOS X. Versions are also available for Windows, classic MacOS, and various Unixes. Otherwise, it's not clear what if anything changed. It still seems based on the Mozilla 0.9.4 code base.

Saturday, December 8, 2001

The W3C XML Schema test collection has been updated with some new tests contributed by Sun. Updated test output from both XSV and MSV/Crimson has also been posted.


The W3C DOM and HTML working groups have posted a last call working draft of Document Object Model (DOM) Level 2 HTML Specification Version 1.0. Comments are due by January 7.

Friday, December 7, 2001

The Apache XML Project has released Xerces-C++ 1.6.0, an open source XML parser that supports SAX2 and DOM2, written in reasonably portable C++. The big news for this release is full support for the W3C XML Schema Language (modulo bugs). Also included is a FreeBSD port and many bug fixes, memory leak fixes and performance improvements.


The Apache XML Project has released version 1.1 of Batik, an open source SVG browser and class library written in Java. This release improves performance and conformance. Newly implemented features include xml:base, improved text support, improved conditional processing support and complete CSS support. Scripting and SMIL animation are not yet supported.

Thursday, December 6, 2001

Apparently the xml-dev mailing list was hacked recently, and the subscriber list was deleted. It's not clear when the list will be back to normal. Even though the problem has allegedly been fixed, I for one am still not receiving any mail from the list, even though the subscription manager tells me I'm subscribed. Update: I've just started receiving mail from the list. If you still aren't getting anything, try unsubscribing and resubscribing.


Design Science, Inc. has released the WebEQ Developers Suite, a $495 payware Java toolkit for building Web pages that include dynamic math via MathML. The WebEQ Developers Suite includes five components: the Editor for authoring "presentation" and "content" MathML; the Publisher for processing HTML pages with MathML and WebTeX; the Input Control which functions as a graphical equation editor in a Web page; the Equation Server which works behind the scenes to facilitate batch processing and processing via scripts on a server; and the Viewer Control that displays MathML in any browser.


Yann Dirson's posted the first public beta of sgml2x, a bash script for formatting SGML and XML documents using DSSSL stylesheets. sgml2x requires jade and jadetex, and you'll need docbook (sgml or xml), and docbook-dsssl to make immediate use of it. It was developed as a bash script, and should run on any platform where bash runs.


The NetBeans XML team has posted the first preview release of the NetBeans TAX library. As near as I can tell, this is a Java bean that "allows structure manipulation of whole XML and DTD documents" inside the NetBeans open source Integrated Development Environment (IDE). TAX "includes event, traversal and I/O support." TAX requires a specially patched version of Xerces 2 from NetBeans.

Wednesday, December 5, 2001

Amelia A. Lewis noticed that the xml-dev mailing list appears to have been down since the weekend, and I verified that with a recent message of my own. If any of the maintainers are reading this, could you please look into the problem and let us know what's happening? Thanks.


IBM's alphaWorks has released ToXgene, a "template-based generator for complex, semantically-correlated collections of XML documents. The data generation process in ToXgene is based on a conceptual description of the data to be generated (the templates). This tool is intended for cases in which the structure of the data to be generated is known, the data is required to conform to that structure, and multiple collections of documents, with varying structures, sizes and complexities, can easily be generated." In other words, it automatically produces lots of sample documents for an XML application.


XPathTester is an open source Java2 based tool for visualizing the results of XPath queries. "Type in a query, hit enter, see the resulting value displayed or nodeset highlighted in the tree."
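
The "type in a query, see the result" workflow looks roughly like this in code. The sketch uses the limited XPath subset in Python's xml.etree.ElementTree, not XPathTester itself. The sample document and the select_text helper are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for the queries below.
SAMPLE = """<library>
  <book year="2001"><title>First</title></book>
  <book year="1999"><title>Second</title></book>
</library>"""


def select_text(xml_text, path):
    """Evaluate a path against the root element and return matching text."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall(path)]


# Every title anywhere below the root.
print(select_text(SAMPLE, ".//title"))
# Only titles of books with a particular attribute value.
print(select_text(SAMPLE, "./book[@year='2001']/title"))
```

ElementTree implements only part of XPath 1.0, so queries that use functions or axes beyond the basics need a fuller XPath engine.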


The W3C XSL Working Group has decided to open the XSL-editors mailing list to the general public "in order to exchange views on the XSLT language and its development. In order to keep traffic to a reasonable level, it is strongly discouraged to ask general questions about the use of XSLT, which pertain to xsl-list. The moderator reserves the right to unsubscribe people who will post off-topic messages." To subscribe send e-mail to xsl-editors-request@w3c.org from the address you wish to subscribe.

Tuesday, December 4, 2001

I've posted the first draft of SAX Filters, Chapter 8 of Processing XML with Java here on Cafe con Leche. This chapter covers the XMLFilter interface and the XMLFilterImpl and AttributesImpl classes. This chapter demonstrates how to write filters that modify the stream of events that flows between an XML parser and a client application.

This is a longish chapter, and some of the examples are a bit on the large side. I'm curious to know whether you think they make sense, or whether they're too long to follow. On the flip side, in several examples I've limited myself to only one class of several or even a single method, rather than including the entire set of classes needed to do something useful. I need to know if these are still comprehensible. As usual all comments are appreciated.

This is the last tutorial chapter on SAX. There'll be one more reference chapter later on. However, right now the first eight chapters form a very solid introductory text about processing XML with SAX2. If anybody notices any important topics in that domain that haven't been covered yet, I'd appreciate hearing about it. The next chapter will begin the coverage of DOM.
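
For a sense of what a SAX filter does, here is a minimal sketch in Python rather than the Java used in the chapter, and it is not taken from the chapter. Python's XMLFilterBase plays roughly the role of Java's XMLFilterImpl. The UppercaseNames and NameCollector classes and the filtered_names function are hypothetical names made up for this example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class UppercaseNames(XMLFilterBase):
    """Filter that rewrites element names before passing events downstream."""

    def startElement(self, name, attrs):
        super().startElement(name.upper(), attrs)

    def endElement(self, name):
        super().endElement(name.upper())


class NameCollector(ContentHandler):
    """Client application: records the element names it is told about."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def filtered_names(xml_bytes):
    parser = xml.sax.make_parser()
    # The filter sits between the real parser and the client handler.
    event_filter = UppercaseNames(parser)
    collector = NameCollector()
    event_filter.setContentHandler(collector)
    event_filter.parse(io.BytesIO(xml_bytes))
    return collector.names


print(filtered_names(b"<doc><item/><item/></doc>"))
```

The client never sees the original lower-case names; it only sees the events the filter chooses to forward.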


IBM's alphaWorks has released the Reengineering Tool Kit for Java, a tool for converting Java source code into XML documents. The XML documents aren't compilable, but they should be much easier to integrate into documentation. In fact, I may take a stab at using this to generate some of the reference appendices for Processing XML with Java. Java 1.3 or later is required. The license is unclear, but the download is free. Update: Anjan Bacchu reports that the license is non-commercial, evaluation only and will expire in 90 days.

Monday, December 3, 2001

Cafe con Leche is now available at http://www.cafeconleche.org/. This is exactly the same site as http://www.ibiblio.org/xml/, just at a different, hopefully easier to remember URL. Both URLs will be live for the foreseeable future.


IBM's Martin Presler-Marshall has published The Platform for Privacy Preferences 1.0 Deployment Guide as an unendorsed W3C Note.

Sunday, December 2, 2001

David Brownell's posted the third beta of SAX2, release 2. Release 2 is a planned bug fix release that corrects some errors in implementation classes like AttributesImpl and cleans up the JavaDoc considerably. In particular, many methods that had unspecified behaviors are now specified. However, it does not change the API at all.


The first beta of DocBook 4.2 has been posted. DocBook is an XML application for technical documentation such as computer books and manuals. I'm writing my next book, Processing XML with Java, in DocBook from start to finish. "This is a backwards-compatible release that incorporates a large number of bug fixes and feature requests."


The DocBook Technical Committee has also posted version 0.1 of the XML Character Entities, a set of DTD modules that provide the entity references for numerous useful non-ASCII characters. These were previously bundled with DocBook, but should now be more useful for other applications as well. These references are one of the standard parts of SGML that got cut from XML. They include:

  • iso-amsa.ent: ISO 8879:1986//ENTITIES Added Math Symbols: Arrow Relations//EN//XML
  • iso-amsb.ent: ISO 8879:1986//ENTITIES Added Math Symbols: Binary Operators//EN//XML
  • iso-amsc.ent: ISO 8879:1986//ENTITIES Added Math Symbols: Delimiters//EN//XML
  • iso-amsn.ent: ISO 8879:1986//ENTITIES Added Math Symbols: Negated Relations//EN//XML
  • iso-amso.ent: ISO 8879:1986//ENTITIES Added Math Symbols: Ordinary//EN//XML
  • iso-amsr.ent: ISO 8879:1986//ENTITIES Added Math Symbols: Relations//EN//XML
  • iso-box.ent: ISO 8879:1986//ENTITIES Box and Line Drawing//EN//XML
  • iso-cyr1.ent: ISO 8879:1986//ENTITIES Russian Cyrillic//EN//XML
  • iso-cyr2.ent: ISO 8879:1986//ENTITIES Non-Russian Cyrillic//EN//XML
  • iso-dia.ent: ISO 8879:1986//ENTITIES Diacritical Marks//EN//XML
  • iso-grk1.ent: ISO 8879:1986//ENTITIES Greek Letters//EN//XML
  • iso-grk2.ent: ISO 8879:1986//ENTITIES Monotoniko Greek//EN//XML
  • iso-grk3.ent: ISO 8879:1986//ENTITIES Greek Symbols//EN//XML
  • iso-grk4.ent: ISO 8879:1986//ENTITIES Alternative Greek Symbols//EN//XML
  • iso-lat1.ent: ISO 8879:1986//ENTITIES Added Latin 1//EN//XML
  • iso-lat2.ent: ISO 8879:1986//ENTITIES Added Latin 2//EN//XML
  • iso-num.ent: ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML
  • iso-pub.ent: ISO 8879:1986//ENTITIES Publishing//EN//XML
  • iso-tech.ent: ISO 8879:1986//ENTITIES General Technical//EN//XML

Adobe's posted a Linux beta of their SVG Viewer browser plug-in.

Saturday, December 1, 2001

David Megginson's posted the first beta of the NewsML Toolkit 1.1, an open source Java library for reading and processing NewsML documents. NewsML 1.0 is a news-industry packaging and metadata standard for exchanging multi-part news and information in multiple media using XML. Version 1.1 "is an extensive rewrite of version 1.0 released last spring: it includes full XPath support, high-level support for formal names and basis-for-choice selection, and an extensible conformance test library. Over 300 unit tests are included." The NewsML Toolkit is published under the LGPL.

Friday, November 30, 2001

The XML Apache Project has released Cocoon 2.0, an open source server side XML framework that supports XSLT and XInclude. Version 2.0 "is a complete rewrite of the first generation that removes all of those design constraints that emerged during almost three years of worldwide use."


Alex Selkirk's updated his XML Schema Toolkit to version 0.12. This is a closed source tool which can convert XML schemas to Visual C++ code.

Thursday, November 29, 2001

Opera Software has released the Opera 6.0 web browser for Windows. Among many other features Opera boasts native support for XML and pretty good support for CSS Level 2. The most important new feature in this release is much better support for Unicode. A Linux beta is also available. Opera is $39.95 payware or free-beer adware (Your choice).


Mark Szpakowski clued me in that the bottom of the Tasks menu in Mozilla lists active windows. Thanks! I hadn't thought to look there.

Wednesday, November 28, 2001

I've installed Mozilla 0.9.6 on both my Mac and my Windows box. My initial impressions are positive. This is an incremental improvement, but not a major step forward. I haven't done speed tests myself, but most people I've heard from say that this release is a lot faster than Netscape 6.2 and somewhat faster than earlier releases of Mozilla.

Stylesheet wise, the CSS support still needs some work. In particular, CSS tables don't seem to work with XML documents. XSLT support does seem much improved. In particular, for the first time I was able to load both XML documents and XSLT style sheets from the local file system.

AppleScript support is somewhat improved. Opening URLs in new windows works again. However, Mozilla still can't figure out which window is in front, so I'm continuing to use Internet Explorer on my Mac. The lack of a Window menu is also a major shortcoming since I tend to start my day by automatically opening a dozen or so windows on different news sites, and like to navigate between them.

Tuesday, November 27, 2001

Mozilla 0.9.6 has been released for Windows, Linux, Solaris, OpenVMS, and MacOS. New features include print preview, support for .BMP and .ICO images on all platforms, and a "Search for" item on the context menu.

As has been the case for the last several releases, Mozilla includes full support for XML and CSS (and of course HTML). XSLT support is there too, but it's been quite buggy through 0.9.5, and probably still is. Don't be too surprised if you can't get this working. If you want to try, make sure you load both the XML documents and the XSLT stylesheets from a web server, not the local file system, and that the server assigns both the XML documents and the XSLT stylesheets the MIME media type text/xml.

If you want something smaller than Mozilla, check out Galeon 1.0, a recently released Mozilla-based Web browser for Gnome/Linux that throws away the mail reader, news reader, chat client, and other extraneous fluff.


The Gnu Classpath Extension Project has posted the first beta of GNU JAXP 1.0. This includes the Ælfred XML parser and supports SAX2, DOM2, and the Java API for XML Processing 1.1 (JAXP). The SAX support fixes a number of bugs that are present in most other parsers. GNU JAXP is licensed under the terms of the GNU General Public License, with the "library exception" which permits its use as a library in conjunction with non-Free software.


Sebastian Rahtz has updated PassiveTeX, his XSL-Formatting Objects to TeX converter, to support the XSL 1.0 recommendation from the W3C. The changes from the proposed recommendation to the final recommendation of XSL 1.0 are fairly minor and easy to fix (essentially one key attribute was renamed), but do affect almost everyone who uses XSL-FO.

Monday, November 26, 2001

Wizen Software has released PowerXML Pro 2.1, a payware "dynamic data-integration platform supporting information exchange...between XML and non-XML data sources. Sources include files, databases, applications, web-sites, mainframes, and web-services. PowerXML is based on industry standard XML, XSLT, and XPath technologies." This is a standalone desktop program for visually building and executing dynamic data pipelines. Pipelines can be used to combine, filter, transform, and map data between various data sources and formats. The idea's interesting, but their web site just showed white pages on my Mac. Frankly, I'd be very hesitant to trust XML solutions to a company that hasn't even mastered HTML yet.

Pricing is not published, a fairly standard scam which different companies practice for different purposes. Some vendors like to leave pricing unspecified so they can see how much they can milk out of customers, and not leave anything on the table. Other vendors do have standard pricing, but like to make you invest a lot of time listening to their sales pitch before they'll tell you what their product actually costs. That way they figure you'll be more invested in the product and less likely to rule it out immediately because it's overpriced. It's not clear which technique is in effect here. Windows NT/2000/XP, Linux, and Solaris are supported.

Sunday, November 25, 2001

From the better late than never department, I noticed that the W3C XML encryption working group last month published last call working drafts of XML Encryption Syntax and Processing and Decryption Transform for XML Signature.

Saturday, November 24, 2001

Republica Corp. has submitted a note to the W3C on the DEL Data Extraction Language. According to the submission, "DEL is an XML format for describing data conversion processes from other data formats to XML. A DEL script specifies how to locate and extract fragments from input data and where to insert them in the resulting XML format. The DEL processor executing the DEL script can use the extracted data to either create a new XML document or modify an existing XML document by creating new elements and attributes at locations specified with XPath expressions." It's not clear if the W3C will do anything with this, but it looks interesting.

Friday, November 23, 2001

The W3C Voice Browser Activity has published the first public working draft of Semantic Interpretation for Speech Recognition. According to the abstract, "This document defines the process of Semantic Interpretation for Speech Recognition and the syntax and semantics of semantic interpretation tags that can be added to speech recognition grammars to compute information to return to an application on the basis of rules and tokens that were matched by the speech recognizer. In particular, it defines the syntax and semantics of the contents of Tags in the Speech Recognition Grammar Specification."

Thursday, November 22, 2001

The W3C/IETF XML Digital Signature Working Group has posted a first and last call working draft of Exclusive XML Canonicalization 1.0. Comments are due by December 11. According to the abstract, "Canonical XML [XML-C14N] recommends a standard means of serializing XML that, when applied to a subdocument, includes its namespace and some other XML context. However, for many applications, it is desirable to have a method which, to the extent practical, excludes such context from a canonicalized subdocument. In particular, where a digital signature over an XML subdocument is needed which will not break when that subdocument is removed from its original document and/or inserted into a different context. The Exclusive XML Canonicalization method described herein provides such a method."

Wednesday, November 21, 2001

Late Night Software's released XMLTools 2.3.2, an expat-based XML parser for AppleScript. This version deliberately breaks compatibility with other XML parsers by allowing extra white space before the XML declaration. I recommend you do not upgrade.
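To see why this matters: the XML declaration, if present, has to be the very first thing in the document, so a conforming parser should reject a document that puts white space in front of it. Here's a minimal sketch that checks this with whatever SAX parser JAXP finds on your system. The class name XmlDeclCheck and the method isWellFormed() are my own inventions for illustration; they are not part of XMLTools or any other library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named here only for illustration.
public class XmlDeclCheck {

    // Returns true if the installed SAX parser accepts the text as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal errors, so malformed input lands here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String strict = "<?xml version=\"1.0\"?><doc/>";
        String padded = "  " + strict; // white space before the declaration
        System.out.println("declaration first:     " + isWellFormed(strict));
        System.out.println("white space before it: " + isWellFormed(padded));
    }
}
```

A document that only a lenient parser accepts will fail as soon as it is handed to a strict one, which is the interoperability problem in a nutshell.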

Late Night Software's also released version 1.0d8 of an XML-RPC library for Classic MacOS systems.

Tuesday, November 20, 2001

I installed Xerces-J 1.4.4 yesterday, and I'm not thrilled with it to say the least. This release did not fix several major, well-known bugs with SAX support. XMLReaderFactory.createXMLReader() still fails to correctly load the Xerces SAXParser class, and the AttributesImpl class is out of date and has at least one nasty bug in addAttribute() that the current SAX distribution has fixed. (I haven't reported this one yet because I can't quite put my finger on the line of code that's in error; but I have proved to my satisfaction that swapping the current SAX AttributesImpl class for the Xerces AttributesImpl class does fix the problem.) Worse yet, the DOM code reintroduces a bug in cloneNode() (inability to clone documents) that had been fixed in Xerces-J 1.4.0. Among other things, this breaks my DOMXIncluder. I've downgraded to Xerces-J 1.4.3 for the moment.
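One way to live with a no-argument createXMLReader() that can't find its parser is to name the driver class explicitly and fall back to something else if that fails. What follows is only a sketch of that idea, not a fix for Xerces itself. The ReaderLocator class and its locate() method are hypothetical names of my own; org.apache.xerces.parsers.SAXParser is the Xerces driver class, and the last-resort fallback through JAXP is my assumption about what's available on your classpath.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical helper class, named here only for illustration.
@SuppressWarnings("deprecation") // XMLReaderFactory is deprecated in recent JDKs
public class ReaderLocator {

    // Tries each named SAX driver in order, then falls back to JAXP's default parser.
    public static XMLReader locate(String... driverClassNames) {
        for (String name : driverClassNames) {
            try {
                return XMLReaderFactory.createXMLReader(name);
            } catch (SAXException e) {
                // Driver missing or not instantiable; try the next candidate.
            }
        }
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (SAXException | ParserConfigurationException e) {
            throw new IllegalStateException("No SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = locate("org.apache.xerces.parsers.SAXParser");
        System.out.println("Using " + reader.getClass().getName());
    }
}
```

The cost of this approach is that the parser's class name ends up hard-coded in your program, which is exactly what XMLReaderFactory was supposed to let you avoid.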

Monday, November 19, 2001

The NetBeans XML team has posted the second alpha release of the NetBeans XML modules family. NetBeans is a modular Integrated Development Environment (IDE) written in Java. Features include:

  • A text editor with syntax coloring and code completion,
  • A tree editor with customizable filtering views,
  • A well-formedness and validity checker

NetBeans 3.3 beta 3 or later is required.

Sunday, November 18, 2001

The W3C RDF Core Working Group has published a new working draft RDF Test Cases which describes "a set of machine-processable test cases corresponding to technical issues addressed by the WG."

Saturday, November 17, 2001

Sun's released the Java XML Pack, an all-in-one bundle of various Java technologies for XML. This release (Fall 2001) includes the Java API for XML Processing (JAXP), the Crimson XML parser 1.1.3, the Xalan XSLT processor, and the Java API for XML Messaging (JAXM).

Sun's also posted the final release of the JAXM 1.0 Specification. JAXM is a Java API for SOAP clients and servers.

Finally, Sun's posted the second public review draft of the Java API for XML Registries (JAXR) specification. JAXR "provides a uniform and standard Java API for accessing different kinds of XML Registries. An XML registry is an enabling infrastructure for building, deploying, and discovering web services."

Friday, November 16, 2001

The Xerces-J team of the XML Apache Project has released version 1.4.4 of Xerces-J, an open source, schema validating XML parser written in Java. Xerces-J supports SAX2, DOM2, and JAXP. This is a bug fix release. All users should upgrade.


The W3C has published a candidate recommendation of Selectors (formerly CSS Selectors). "This document is one of the 'modules' of the upcoming CSS3 specification. It not only describes the selectors that already exist in CSS1 and CSS2, but also proposes new selectors for CSS3 as well as for other languages that may need them. The CSS Working Group doesn't expect that all implementations of CSS3 will have to implement all selectors. Instead, there will probably be a small number of variants of CSS3, so-called 'profiles'. For example, it may be that only a profile for non-interactive user agents will include all of the proposed selectors." There are a lot of new features here relative to CSS2. The most important for XML is namespace-based selectors. Much of this material is discussed in Chapter 17 of the XML Bible, Gold Edition. The Candidate Recommendation phase ends in May 2002.


Waterloo Maple has released version 7.0 of the Maple symbolic mathematics package. New features include TCP/IP, MathML 2.0, and "substantial increases in the depth and breadth of solution algorithms in key areas such as differential equations and numerical computation. It continues to lead with greater integration of key technology from the Numerical Algorithms Group (NAG) and other respected sources". Maple is $1695 payware and available on Windows and Linux.


Allin Cottrell has published dbtexmath, a set of Perl and DSSSL files that enables "literal pass-through of TeX math to jadetex, in the context of DocBook"; that is, it lets you use TeX math rather than MathML in the SGML/XML source file. This includes a utility to auto-generate PNG images for use in HTML.

Thursday, November 15, 2001

Michael Kay's released version 6.5 of his open source SAXON XSLT processor written in Java. This is my current XSLT processor of choice. This release fixes a few bugs, allows you to run SAXON in a secure mode where Java extension functions are disabled, and requires setting version="1.1" to enable XSLT 1.1 features such as xsl:document, xsl:script, and the ability to refer to a result tree fragment as if it were a node-set.

Wednesday, November 14, 2001

The XML:DB API is an attempt to bring an ODBC or JDBC style access API to native XML databases. The API is intended to be language independent, but most early work has been done in Java. The reference implementation is basically a very simple file system based native XML database and is an easy way to get familiar with the API. Full source code is available under an Apache style license. Included with the reference implementation are some Java driver development tools, including a set of base driver classes and an early release of a test suite. Java implementations of the XML:DB API exist for dbXML and eXist and are under development for Ozone.


Opera Software ASA has posted the first beta of version 6.0 of their namesake Opera web browser for Windows. Version 6.0 focuses on prettifying the user interface including a single document interface. It also offers an option to turn off pop-up windows. Opera has supported direct display of XML and CSS since version 4.0. Opera is $39 payware or free-beer adware (your choice). Upgrades from version 5.0 are free.


IBM's updated the XML Toolkit for z/OS and OS/390 with improved XML parsing capabilities including namespace, JAXP, and schema support. The LotusXSL XSL Transformations (XSLT) processor has also been added to this release.

Tuesday, November 13, 2001

XYZFind has released version 2 of the XYZFind Server native XML database. This latest release includes significant improvements to the XML indexing and retrieval functionality first introduced earlier this year. Performance, scalability, and query capabilities are enhanced with this current version of the server, which retains the ability to adapt to new or modified existing XML document types without dependence on explicit schema or DTD information. Version 2.0 can run as a Windows Service, improves indexing and query performance, and extends the web-based administrative console. XML document round tripping is significantly improved, and keyword search, Boolean operators, wildcard queries, numeric range queries and stop word support are all available. Interfaces include XML over HTTP, a Java API and SOAP 1.1. XYZFind Server currently ships on Linux, Solaris, and Windows NT/2000/XP. XYZFind is very close-mouthed about what this actually costs, which I normally interpret as meaning "the price goes up or down depending on how much the salespeople think they can talk you into paying."


SportsML, the Sports Markup Language, is an XML application designed by the International Press Telecommunications Council (IPTC) for exchanging sports data among news publishers. Data provided in SportsML includes scores, schedules, standings, and statistics for a wide variety of sports.


Henry S. Thompson has released version 1.4 of the XSV W3C XML Schema Language validator. This release "introduces a cheap-but-effective approximation to enforcement of the constraints on derivation by restriction for complex type definitions. This works by enforcing the subset invariant, and rejects any content model for a type definition derived by restriction which allows anything _not_ allowed by the base type definition's content model. This is slightly weaker than the REC: i.e. everything it rules out is ruled out by the REC, but a few things ruled out by the REC will not be caught." XSV is published under the GPL.


I've posted the initial batch of web pages for the XML Bible, Gold Edition. The sample chapters are pretty much the same ones from the second edition. The main new feature of interest here is the examples, which cover several new topics including:

  • DTD modularization
  • XHTML 1.1
  • SMIL 2.0
  • XML Base
  • Canonical XML
  • RDDL

I was a bit bleary-eyed when I was putting these together last night, so there are doubtless a few broken links and missing examples. Please let me know of any you find so I can fix them. Thanks!

Monday, November 12, 2001

I've posted The XMLReader Interface, Chapter 7 of Processing XML with Java, here on Cafe con Leche. This chapter covers:

  • Locating XML parsers with XMLReaderFactory
  • The different kinds of SAXException
  • Reading a document from an InputSource
  • Using EntityResolver to substitute DTD modules and other entities
  • Reading DTDs with DeclHandler
  • Accessing unparsed entities and notations through DTDHandler
  • Lexical events reported by the LexicalHandler
  • Configuring the parser with features and properties
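To give a flavor of the material, here's a rough sketch (not taken from the chapter) that touches three of these topics at once: setting a feature, installing a LexicalHandler through a property, and using an EntityResolver to substitute a local DTD for a remote one. The feature and property URIs are the standard SAX2 ones. The class name ReaderConfigDemo, the example.dtd system ID, and the one-line replacement DTD are all made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class, named here only for illustration.
public class ReaderConfigDemo {

    // Parses the text and returns a simple log of the events that were seen.
    public static List<String> parse(String xml) {
        final List<String> events = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // A feature is a boolean switch identified by a URI.
            reader.setFeature("http://xml.org/sax/features/namespace-prefixes", false);

            // DefaultHandler2 implements ContentHandler and LexicalHandler (among others).
            DefaultHandler2 handler = new DefaultHandler2() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    events.add("element:" + localName);
                }
                @Override
                public void comment(char[] ch, int start, int length) {
                    events.add("comment:" + new String(ch, start, length).trim());
                }
            };
            reader.setContentHandler(handler);

            // A property is an object-valued setting; this one installs the LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);

            // Substitute an in-memory DTD for the (made-up) remote one.
            reader.setEntityResolver((publicId, systemId) -> {
                if (systemId != null && systemId.endsWith("example.dtd")) {
                    events.add("resolved:" + systemId);
                    return new InputSource(new StringReader("<!ELEMENT doc (#PCDATA)>"));
                }
                return null; // let the parser resolve anything else itself
            });

            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc SYSTEM \"http://example.com/example.dtd\">"
                   + "<doc><!-- note -->text</doc>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

Because the resolver hands back its own InputSource, the parser never has to go to the network for the DTD, which is the usual reason for installing one.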

I've still got half the book to go, but I'd venture that this book is definitely starting to be seriously useful, and is a solid introduction to XML parsing for Java developers. In particular, it could certainly be used as a text for an XML course that focused on basic XML and SAX. One more chapter on SAX filters remains to be written. After that, it's on to DOM.

Sunday, November 11, 2001

The W3C HTML Working Group has posted the last call working draft of XML Events. The "module defined in this specification provides XML languages with the ability to uniformly integrate event listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces [DOM2]. The result is to provide an interoperable way of associating behaviors with document-level markup." Most of the changes since the last draft are fairly minor. Comments are due by November 30.

Saturday, November 10, 2001

As readers of any of my XML books know, I have about three standard examples of processing instructions: php, cocoon, and robots. Thus I was a little disturbed to discover while working on an update yesterday that the original proposal for the robots processing instruction by Walter Underwood (http://homepages.go.com/~wunder0/robots-pi.html) has vanished from the net. Furthermore, Mr. Underwood's old e-mail at wunder@infoseek.com is bouncing. Walter, if you're out there could you please drop me an e-mail at elharo@ibiblio.org with your current address? Or if anybody knows where I can reach him or where the robots page might have moved to, could you please let me know? I'd like to ask some questions about the current status and location of this proposal. Thanks.


xlinkit 4.3 has been released. xlinkit is a technology for expressing constraints between multiple heterogeneous, distributed documents and data sources, based on XML. xlinkit evaluates constraints and produces XLink hyperlinks between inconsistent elements. Version 4.3 features a plug-in architecture for defining new operators dynamically in Javascript to match even more heterogeneous documents. Simple equality can become fuzzy equality or a regular expression match, and entire subtree-equality can be checked in XML documents. xlinkit is patented and not open source, vendor claims to the contrary notwithstanding.

Friday, November 9, 2001

IDEALX S.A.S. is developing getox, a GPL'd XML source code editor for Gnome/Linux. Feedback and contributions are solicited.


The Institute of Medical Informatics, University of Giessen has released XSBROWSER 2.0, an open source Java program that tries to create human readable document models from a given DTD or W3C XML Schema. XSBROWSER exploits human readable material contained in DTD comments and XML Schema documentation. New features since version 1.0 include:

  • A simple XML editor
  • Faster performance
  • More comprehensive documentation
  • DTD comments are mapped onto XML Schema documentation.
  • Documentation of XML markup/values is sorted in ascending order.
  • More examples

The W3C Voice Browser Activity has published the first public working draft of Voice Extensible Markup Language (VoiceXML) Version 2.0. "VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. Its major goal is to bring the advantages of web-based development and content delivery to interactive voice response applications." The voice space is infected by patents from numerous parties. According to the draft,

This document seeks Member and public comment on both the technical design and the patent licensing issues arising out of the disclosure and licensing statements that have been made. Our decision to publish this first public working draft has been made to secure early comments from the community, but does not imply that all questions of patent licensing have been resolved or clarified. They must be resolved or work on this document in W3C will stop. As things stand at the time of publication of this specification, implementations conforming to this specification may require royalty bearing licenses for essential IPR. Further information can be found in the patent disclosures page. The patent policy for W3C as a whole is under wide discussion. A set of commitments by all participants in the Voice Browser Activity to royalty free is a possibility for the future but has NOT been made at time of publication.

I've updated the XML Conferences list with the conferences I know about from now through next June. I'm planning to speak at XMLOne in London in March and Software Development in San Jose in April. I'm talking to a few of the other shows, but nothing's definite yet.

Thursday, November 8, 2001

Evan Lenz has announced TransQuery, a "small, flexible set of XSLT conventions and processing model constraints that enable the use of XSLT as a query language over multiple XML documents. It is an interoperability specification for XML databases, allowing them to use a standard XML query language today--the W3C-recommended XSLT." It includes XSLT solutions to 76 of the 78 use cases in the W3C XML Query Use Cases document.


The SVG Team at Adobe has released the Adobe SVG Viewer 3.0, a free-beer browser plug-in for Netscape and Internet Explorer on Windows and the Mac that supports the SVG 1.0 specification. New features in this release include:

  • Support for the following elements: color-profile, marker, title, and view
  • Full support for the switch element; support for the general requiredFeatures and systemLanguage attributes; the image element now supports links to SVG files.
  • Support for the following CSS properties: color-interpolation-filters, color-profile, marker, marker-end, marker-mid, marker-start; support for the @media CSS rule, and the media attribute for style elements. The values all, screen, and print are supported.
  • DOM support for the SVGMatrix class and the getCTM(), getElementsByTagNameNS(), and getBBox() methods
  • Contextual menus can now be modified by the developer for more customized applications
  • An internal script engine allows self-contained JavaScript scripts to run inside SVG files embedded in hosts that don't support a bridge between plug-ins and the host script engine, including Internet Explorer on the Mac.
  • Native support for MacOS X 10.1 and Windows XP
  • Significant performance improvements
  • Can save as compressed SVG (.svgz).

RenderX has released version 2.7 of XEP, a payware XSL-FO to PDF converter. This release upgrades XSL-FO support to the final 1.0 syntax. (Previous versions supported the candidate recommendation syntax.) A stylesheet to convert XSL CR documents to XSL 1.0 is included. Other improvements include:

  • Orphans and widows
  • The start-indent and end-indent now conform to the spec
  • Space resolution rule treatment is more consistent
  • PDF compression
  • PDF security/encryption
  • Type1 fonts in PFB format
  • TrueType font subsetting
  • A new logging/monitoring interface, based on SAX
  • Documentation reformatted in DocBook
  • Various speedups and bugfixes
Tuesday, November 13, 2001

XYZFind has released version 2 of the XYZFind Server native XML database. This latest release includes significant improvements to the XML indexing and retrieval functionality first introduced earlier this year. Performance, scalability, and query capabilities are enhanced with this current version of the server, which retains the ability to adapt to new or modified existing XML document types without dependence on explicit schema or DTD information. Version 2.0 can run as a Windows Service, improves indexing and query performance, and extends the web-based administrative console. XML document round tripping is significantly improved, and keyword search, Boolean operators, wildcard queries, numeric range queries and stop word support are all available. Interfaces include XML over HTTP, a Java API and SOAP 1.1. XYZFind Server currently ships on Linux, Solaris, and Windows NT/2000/XP. XYZFind is very close-mouthed about what this actually costs, which I normally interpret as meaning "the price goes up or down depending on how much the salespeople think they can talk you into paying."


SportsML, the Sports Markup Language, is an XXML application designed by the International Press Telecommunications Council, (IPTC) for exchanging sports data among news publishers. Data provided in SportsML includes scores, schedules, standings, and statistics for a wide variety of sports.


Henry S. Thompson has released version 1.4 of the XSV W3C XML Schema Language validator. This release "introduces a cheap-but-effective approximation to enforcement of the constraints on derivation by restriction for complex type definitions. This works by enforcing the subset invariant, and rejects any content model for a type definition derived by restriction which allows anything _not_ allowed by the base type definition's content model. This is slightly weaker than the REC: i.e. everything it rules out is ruled out by the REC, but a few things ruled out by the REC will not be caught." XSV is published under the GPL.


I've posted the initial batch of web pages for the XML Bible, Gold Edition. The smaple chapters are pretty much the same ones from the second edition. The main new feature of interest here is the examples, which cover several new topics including:

  • DTD modularization
  • XHTML 1.1
  • SMIL 2.0
  • XML Base
  • Canonical XML
  • RDDL

I was a bit bleary-eyed when I was putting these together last night, so there are doubtless a few broken links and missing examples. Please let me know of any you find so I can fix them. Thanks!

Monday, November 12, 2001

I've posted The XMLReader Interface, Chapter 7 of Processing XML with Java, here on Cafe con Leche. This chapter covers:

  • Locating XML parsers with XMLReaderFactory
  • The different kinds of SAXException
  • Reading a document from an InputSource
  • Using EntityResolver to substitute DTD modules and other entities
  • Reading DTDs with DeclHandler
  • Accessing unparsed entities and notations through DTDHandler
  • Lexical events reported by the LexicalHandler
  • Configuring the parser with features and properties

I've still got half the book to go, but I'd venture that this book is definitely starting to be seriously useful, and is a solid introduction to XML parsing for Java developers. In particular, It could certainly be used as a text for an XML course that focused on basic XML and SAX. One more chapter on SAX filters remains to be written. After that, it's on to DOM.

Sunday, November 11, 2001

The W3C HTML Working Group has posted the last call working draft of XML Events, The "module defined in this specification provides XML languages with the ability to uniformly integrate event listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces [DOM2]. The result is to provide an interoperable way of associating behaviors with document-level markup." Most of the changes since the last draft are fairly minor. Comments are due by November 30.

Saturday, November 10, 2001

As readers of any of my XML books know, I have about three standard examples of processing instructions: php, cocoon, and robots. Thus I was a little disturbed to discover while working on an update yesterday that the original proposal for the robots processing instruction by Walter Underwood (http://homepages.go.com/~wunder0/robots-pi.html) has vanished from the net. Furthermore, Mr. Underwood's old e-mail at wunder@infoseek.com is bouncing. Walter, if you're out there could you please drop me an e-mail at elharo@ibiblio.org with your current address? Or if anybody knows where I can reach him or where the robots page might have moved to, could you please let me know? I'd like to ask some questions about the current status and location of this proposal. Thanks.


xlinkit 4.3 has been released. xlinkit is a technology for expressing constraints between multiple heterogeneous, distributed documents and data sources, based on XML. xlinkit evaluates constraints and produces XLink hyperlinks between inconsistent elements. Version 4.3 features a plug-in architecture for defining new operators dynamically in Javascript to match even more heterogeneous documents. Simple equality can become fuzzy equality or a regular expression match, and entire subtree-equality can be checked in XML documents. xlinkit is patented and not open source, vendor claims to the contrary notwithstanding.

Friday, November 9, 2001

IDEALX S.A.S. is developing getox, GPL'd XML source code editor for Gnome/Linux. Feedback and contributions are solicited.


The Insitute of Medical Informatics, University of Giessen has released XSBROWSER 2.0, an open source Java program that tries to create human readable document models from a given DTD or W3C XML Schema. XSBROWSER exploits human readable material contained in DTD comments and XML Schema documentation. New features since version 1.0 include:

  • A simple XML editor
  • Faster performance
  • More comprehensive documentation
  • DTD comments are mapped onto XML Schema documentation.
  • Documentation of XML markup/values is sorted in ascending order.
  • More examples

The W3C Voice Browser Activity has published the first public working draft of Voice Extensible Markup Language (VoiceXML) Version 2.0. "VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. Its major goal is to bring the advantages of web-based development and content delivery to interactive voice response applications." The voice space is infected by patents from numerous parties. According to the draft,

This document seeks Member and public comment on both the technical design and the patent licensing issues arising out of the disclosure and licensing statements that have been made. Our decision to publish this first public working draft has been made to secure early comments from the community, but does not imply that all questions of patent licensing have been resolved or clarified. They must be resolved or work on this document in W3C will stop. As things stand at the time of publication of this specification, implementations conforming to this specification may require royalty bearing licenses for essential IPR. Further information can be found in the patent disclosures page. The patent policy for W3C as a whole is under wide discussion. A set of commitments by all participants in the Voice Browser Activity to royalty free is a possibility for the future but has NOT been made at time of publication.

I've updated the XML Conferences list with the conferences I know about from now through next June. I'm planning to speak to XMLOne in London in March and Software Development in San Jose in April. I'm talking to a few of the other shows, but nothing's definite yet.

Thursday, November 8, 2001

Evan Lenz has announced TransQuery, a "small, flexible set of XSLT conventions and processing model constraints that enable the use of XSLT as a query language over multiple XML documents. It is an interoperability specification for XML databases, allowing them to use a standard XML query language today--the W3C-recommended XSLT." It includes XSLT solutions to 76 of the 78 use cases in the W3C XML Query Use Cases document.


The SVG Team at Adobe has releeased the Adobe SVG Viewer 3.0, a free-beer browser plug-in for Netscape and Internet Explorer on Windows and the Mac that supports the SVG 1.0 specification. New features in this release include:

  • Support for the following elements: color-profile, marker, title, and view
  • full support for the switch element; support for the general requiredFeatures and systemLanguage attributes; the image element now supports links to SVG files.
  • Support for the following CSS properties: color-interpolation-filters, color-profile, marker, marker-end, marker-mid, marker-start; support for the @media CSS rule, and the media attribute for style elements. The values all, screen, and print are supported.
  • DOM support for the SVGMatrix class and the getCTM(), getElementsByTagNameNS(), and getBBox() methods
  • Contextual menus can now be modified by the developer for more customized applications
  • An internal script engine allows self-contained JavaScript scripts to run inside SVG files embedded in hosts that don't support a bridge between plug-ins and the host script engine, including Internet Explorer on the Mac.
  • Native support for MacOS X 10.1 and Windows XP
  • Significant performance improvements
  • Can save as compressed SVG (.svgz).

RenderX has released version 2.7 of XEP, a payware XSL-FO to PDF converter. This release upgrades XSL-FO support to the final 1.0 syntax. (Previous versions support the candidate recommendation syntax.) A stylesheet to convert XSL CR documents to XSL 1.0 is included. Other improvements include:

  • Orphans and widows
  • The start-indent and end-indent now conform to the spec
  • Space resolution rule treatment is more consistent
  • PDF compression
  • PDF security/encryption
  • Type1 fonts in PFB format
  • TrueType font subsetting
  • A new logging/monitoring interface, based on SAX
  • Documentation reformatted in DocBook
  • Various speedups and bugfixes
Wednesday, November 7, 2001

The Apache Cocoon team has posted the second release candidate of Apache Cocoon 2, a "complete rewrite of the Cocoon XML publishing framework that is supposed to remove all those design constraints that emerged from the Cocoon 1 experience."


Paul Prescod and ActiveState have launched the XSLT Cookbook. The idea is to get people to contribute "recipes" that other people could then take and use in their programs. In the case of the XSLT Cookbook, we are of course talking about XSLT snippets to be used in stylesheets and transformations.


XSmiles 0.45, an open source XML browser written in Java, has been posted. Some support is included for XML Parsing, XSLT Transforms, SMIL 2.0 Basic, XForms, XSL Formatting Objects, SVG, XML Events, ECMAScript, skins, and SIP Videoconferencing. This is still very much a work in progress, but is interesting nonetheless. The very different approach to XML browsing compared to converted HTML browsers like Mozilla is very suggestive.


Peter Flynn's XML FAQ has been translated into Amharic. The FAQ is presented in different document formats including HTML, PDF, and PostScript. You will need an Ethiopic Unicode font such as Jiret to read the HTML (UTF-8) format.


The NetBeans XML team has posted the first alpha of the XML modules for NetBeans, an open source integrated development environment (IDE). NetBeans 3.3 Beta 3 or later is required. Features include an XML text editor with syntax coloring and code completion, an XML tree editor, and validation.


SSAX is a purely functional, semi-validating SAX/DOM/SXML parser/library written entirely in a pure-functional subset of Scheme. SSAX consists of a DOM/SXML parser, a SAX parser, and a supporting library of lexing and parsing procedures. A SSAX parser is a full-featured, algorithmically optimal, pure-functional parser, which can act as a stream processor. The complete source code as well as benchmarks and examples have been placed in the public domain.

Tuesday, November 6, 2001

The dbXML project has posted what they hope is the final beta release of the dbXML Core XML database, an open source, native XML database designed to manage large collections of small XML documents. The server supports XPath queries and provides an implementation of the XML:DB XML Database API for development of client applications. The source code has been released under an Apache style open source license. Changes in this release are minor and revolve around bug fixes for enhanced stability and scalability. Interoperation with other software, in particular servlet engines such as Tomcat 4.0, has also been improved.


Adobe has released version 10 of Adobe Illustrator, a $399 payware drawing program that exports and, new in version 10, imports Scalable Vector Graphics pictures. Other new features include native support for MacOS X, vector and raster-based slicing, symbols for repeating graphics, drawing tools for lines, arcs, grids and polar grids, a flare tool for reflections and lens effects, a magic wand tool and much more. Upgrades are $149.

Monday, November 5, 2001

Version 1.0 of the XSLT Standard Library, an open source (LGPL) collection of commonly-used templates written purely in XSLT, has been released. Currently the library includes templates for manipulating various kinds of text content including dates, strings, and URIs.


The W3C DOM Activity has released three new and updated Document Object Model working drafts:

Document Object Model (DOM) Level 3 XPath Specification
This specification defines how one can extract node sets from a DOM Document using XPath 1.0 expressions. Like most everything else in DOM, the design is confusing and excessively complicated with multiple levels of indirection. The key interface is XPathEvaluator, which should give you the rough flavor of this API:
package org.w3c.dom.xpath;

public interface XPathEvaluator {
    public XPathExpression createExpression(String expression, 
                                            XPathNSResolver resolver)
                                            throws XPathException, DOMException;

    public XPathResult createResult();

    public XPathNSResolver createNSResolver(Node nodeResolver);

    public XPathResult evaluate(String expression, 
                                Node contextNode, 
                                XPathNSResolver resolver, 
                                short type, 
                                XPathResult result)
                                throws XPathException, DOMException;

}
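
For comparison, here is a minimal sketch of the same basic task, extracting a node set from a DOM Document with an XPath 1.0 expression. Note that it is written against the javax.xml.xpath package from the standard Java library, which is a different API from the draft XPathEvaluator interface above, not an implementation of it. The sample document and the XPathNodeSetSketch and select names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathNodeSetSketch {

    // Hypothetical sample document, invented for this example.
    static final String XML =
        "<news><item>Jing</item><item>DTDinst</item><note>aside</note></news>";

    // Evaluates an XPath 1.0 expression against the sample document and
    // returns the text content of each node in the resulting node set.
    static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Ask for a node set; javax.xml.xpath hands it back as a DOM NodeList.
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            // Wrap checked parser and XPath exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/news/item"));
    }
}
```
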
Document Object Model (DOM) Level 3 Abstract Schemas and Load and Save Specification

The first part of this working draft defines how schemas in a variety of schema languages including DTDs and the W3C XML Schema Language can be represented in DOM. This is a fairly major update. Among other changes, it adds datatype constants representing all of the W3C XML Schema Language predefined primitive data types. Unfortunately the API does not feel rich enough to support user-defined data types. I'm also concerned that the API only makes this information available from the schema model and not from the document model. That is, given an element declaration you can ask what type it has; but you can't ask the same question of an element instance.

The second part of this working draft defines how a program locates a parser, creates new Document objects, and serializes documents onto streams.
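
To give a rough flavor of those three steps (locating an implementation, creating a Document, and serializing it), here is a small sketch. It assumes the Load and Save interfaces as they appear in the standard Java library (DOMImplementationRegistry, DOMImplementationLS, LSSerializer), whose names and shapes may well differ from this working draft. The LoadSaveSketch class and its method are hypothetical.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class LoadSaveSketch {

    // Builds a one-element document and serializes it to a string.
    static String buildAndSerialize(String rootName, String text) {
        try {
            // Step 1: locate a DOM implementation that supports Load and Save.
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementation impl = registry.getDOMImplementation("LS");

            // Step 2: create a new Document object with the given root element.
            Document doc = impl.createDocument(null, rootName, null);
            doc.getDocumentElement().setTextContent(text);

            // Step 3: serialize the document.
            LSSerializer serializer = ((DOMImplementationLS) impl).createLSSerializer();
            return serializer.writeToString(doc);
        } catch (Exception e) {
            // Wrap the checked bootstrap exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildAndSerialize("greeting", "hello"));
    }
}
```
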

Document Object Model (DOM) Level 2 HTML Specification
This specification defines an IDL-based platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content and structure of HTML 4.0 and XHTML 1.0 documents. DOM Level 2 HTML builds on the DOM Level 2 Core.

James Clark's Jing is a validator for RELAX NG implemented in Java on top of SAX2. The latest release adds support for pluggable datatype libraries using the vendor-independent RELAX NG datatype library interface from the relaxng project at SourceForge.


James Clark's DTDinst is a Java program that converts XML DTDs to XML instance format, either RELAX NG or a DTDinst-specific format which should be easily transformable into other schema languages. The key feature of DTDinst is its handling of parameter entities. It is able to reliably turn parameter entity declarations and references into a variety of higher-level semantic constructs.

Sunday, November 4, 2001

Daisuke Okajima's RelaxNGCC 0.4 is a GPL'd tool that generates Java source code from a RelaxNG grammar.


Jasc Software has posted the fifth preview release of WebDraw, a native SVG authoring program for Windows. Major new features and enhancements in Preview Release 5 include:

  • Updated support to the SVG 1.0 Recommendation
  • The Animation Timeline palette for creating and editing basic animation statements on SVG objects
  • Interface enhancements for more control over object editing, creation and selection
  • Separated the Overview navigation feature into a separate palette for easier navigating on zoomed images
  • A Preview tab to the main document window to view the current file in a Web browser environment without leaving WebDraw
  • Split the former Line tool out into separate Line, Polygon, Freehand and Path tools for creating line, polygon, polyline and path objects
  • An Image tool for adding external bitmap and SVG images to a file
  • Different unit types on attribute and property values
  • Enhanced the Text tool with on-screen entry of new text elements
  • More accurate rendering of text and tspan elements
  • Quadratic Bezier and elliptical arc path segments

Mike Brown's released the Pretty XML Tree Viewer, an XSLT stylesheet that produces an HTML+CSS1 representation of an XML document's XPath/XSLT node structure.


Daniel Veillard has released libxml2 2.4.7 and libxslt 1.0.6, the Gnome XML and XSLT libraries. Among other things, they provide the xsltproc command. These are mostly bug fix releases.


Alexandr Korlyukov has written a Lambda-calculus interpreter in XSLT. The practical use of this is minimal, but this does constitute a proof that XSLT is Turing complete.

Saturday, November 3, 2001

James Strachan released version 1.1 of dom4j, an open source library for working with XML, XPath and XSLT on the Java platform using the Java Collections Framework. This release adds support for DTD declarations and "numerous patches, optimizations and bug fixes".

Friday, November 2, 2001

The W3C has released the official XML Infoset Recommendation. There do not appear to be any substantive changes since the proposed recommendation. This started out as a standard data model for all XML specifications, but got demoted to nothing more than a standard vocabulary for talking about XML. It gives names for the different parts of an XML document (most of which already had names in XML 1.0 anyway). There's not a lot of practical information here. The Infoset doesn't let you do anything you couldn't do before now. A few other specs such as XInclude are likely to be defined in terms of Infoset transformations instead of text document transformations, but that's about it.

Thursday, November 1, 2001

The W3C has released version 5.2 of the Amaya web browser for Unix and Windows. Amaya supports HTML 4.01, XHTML 1.0, XHTML Basic, XHTML 1.1, HTTP 1.1, MathML 2.0, much of CSS Level 2, and some parts of SVG. It also includes an annotation application based on XPointer and RDF. It's not quite clear from the online documentation, but it looks probable that this release might add support for XML documents with CSS style sheets. I'm downloading it now to test that out. Update: XML files with attached CSS style sheets can be displayed directly in the browser. However, there are definitely some holes in CSS support.

In many ways, Amaya is a very nice browser/editor. However, the W3C mostly sees this as a test project, not as a serious end-user tool to compete with the likes of IE, Opera, and Mozilla. Consequently, the user interface is ugly, and extremely quirky and non-standard. I don't know how development on Amaya normally proceeds, but it might be worth somebody's time to do some work on the user interface, forking the project if necessary.

Wednesday, October 31, 2001

Netscape has released version 6.2 of their namesake web browser for Windows (now including Windows XP), the Mac (now including MacOS X), and Linux. This release is based on Mozilla 0.9.4 and adds support for multiple e-mail accounts, Password, Cookie and Form Managers, improved speed, stability and themes, integrated AOL Instant Message functions and many other enhancements. It supports XML styled with CSS or XSLT, though XSLT support is still pretty buggy.

On my Windows and Linux boxes I'm already happily using Mozilla 0.9.5. I'm dying to upgrade to Netscape 6.x and/or Mozilla on my Mac for my day-to-day browsing too, but I really can't until they fix the bugs in their AppleScript support. :-(


The W3C SVG Working Group has posted the first public working draft of Scalable Vector Graphics 1.1. This release modularizes SVG 1.0 as well as adding a number of new elements and attributes. One much needed feature under consideration for this release but not yet finished is text wrapping. Other possibilities include allowing shapes to be described using viewport coordinates so that things like map legends could be statically placed within the windows while being unaffected by pan and zoom. It's not clear whether or not real-world coordinates like meters and miles are being considered.

The working group has also published two SVG 1.1 profiles, SVG Tiny and SVG Basic, together known as Mobile SVG. SVG Tiny is a subset of full SVG 1.1 designed for cell phones and low-end PDAs. SVG Basic is designed for high-end PDAs.

Finally, the working group has published the third release of the SVG 1.0 test suite and associated implementation report. According to this report, the Adobe SVG Viewer plug-in 3.0 public beta is currently the most complete implementation, while the Apache XML project's Batik is running second. However, nobody is completely conformant to SVG 1.0 yet.


Sun's posted the maintenance review draft specification of the Java API for XML Processing 1.1 in the Java Community Process. The goal for this release is to add standard schema support to JAXP by defining two new SAX properties, http://java.sun.com/xml/jaxp/properties/schemaLanguage and http://java.sun.com/xml/jaxp/properties/schemaLocation. This seems like a reasonable proposal, though I'd prefer to see it done as a part of the next iteration of SAX rather than JAXP (e.g. http://xml.org/sax/properties/schemaLanguage and http://xml.org/sax/properties/schemaLocation). I also think they need a third http://java.sun.com/xml/jaxp/properties/noNamespaceSchemaLocation property. Comments are due by December 3.
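
Here's a hedged sketch of how such properties get used from a SAX parser. To the best of my knowledge, shipping JAXP parsers recognize the second property under the name schemaSource rather than the schemaLocation name given in this draft, so that is what the sketch sets. Treat that name, the tiny inline schema, and the JaxpSchemaProperties and isValid names as assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class JaxpSchemaProperties {

    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    // Assumption: the name shipping parsers use, not the draft's schemaLocation.
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical one-element schema, invented for this example.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    // Returns true if the document parses and validates against XSD.
    static boolean isValid(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            // Select the schema language, then hand the parser the schema itself.
            parser.setProperty(SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            parser.setProperty(SCHEMA_SOURCE,
                new ByteArrayInputStream(XSD.getBytes(StandardCharsets.UTF_8)));
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    // Validity errors are only reported, not thrown, by default.
                    @Override
                    public void error(SAXParseException e) throws SAXException {
                        throw e;
                    }
                });
            return true;
        } catch (SAXParseException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<count>42</count>"));
        System.out.println(isValid("<count>forty-two</count>"));
    }
}
```
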

Tuesday, October 30, 2001

ElCel Technology has released version 0.14 of their free-beer XML Validator and Canonical XML Processor command-line tools. This release adds support for basic HTTP authentication and FTP URLs, full Unicode character range under Windows, and some bug fixes.


Load 2.0.1 is an open-source test utility that can check Web applications and SOAP-based Web Services for performance and scalability. Load includes an XML-based scripting language and library of test objects to let one build intelligent test agents. The agents can then be run concurrently to test Web apps in near-real production environments. Load uses JDOM to parse and build XML documents.


The SIA parser for XML sits on top of SAX but additionally provides a global state recognition automaton so it can uniquely identify any node within parsed XML.


sync4j is an open source Java implementation of the SyncML data synchronization protocol for PDAs and similar devices. sync4j uses JDOM for XML manipulation.


Media Design in*Progress's Interaction is a $795 payware XML based application server for Mac. The recently released 3.6 version performs server-side XSL Transformations and upgrades its style sheets editor to cover CSS Level 2. Interaction generates HTML from server-side XML and XSLT/CSS style sheets. The application integrates a native XML database and a document-type driven markup editor. It provides a component architecture and API for third party developers.


Sybase has submitted a Java Specification Request for ebXML CPP/A APIs for Java to the Java Community Process. This JSR aims to provide a standard set of APIs for representing and manipulating Collaboration Profile and Agreement information described by ebXML Collaboration Protocol Profile/Agreement documents. Comments are due by November 5.


Hewlett Packard's submitted a Java Specification Request for an XML Transactioning API for Java (JAXTX) to the Java Community Process. JAXTX proposes an API for packaging and transporting ACID transactions (as in JTA) and extended transactions (e.g., the BTP from OASIS) using the protocols being defined by OASIS and the W3C. Comments are due by November 5.


The Xerces-J team has posted the third beta of Xerces2-J, an open source XML parser for Java. This release fixes a number of bugs, introduces some changes to the Xerces Native Interface, and is the first Xerces2 release to include W3C XML Schema language validation.

Monday, October 29, 2001

Version 1.12 of SVG.pm, a Perl module for generating Scalable Vector Graphics pictures, has been released. Version 1.12 adds a lot of documentation and increases autoload quality.


Recordare's posted a beta version 0.5 of MusicXML, an XML application for common Western music notation used in printed sheet music. Recordare's also published a beta MusicXML Finale Converter that supports file exchange through MusicXML between Finale, SharpEye Music Reader, and the MuseData format. The MusicXML Finale Converter runs on Windows as a plug-in for Finale 2000, 2001, and 2002.


Ipedo's released version 2.0 of their expensive (pricing starts at $995) namesake Ipedo native XML database for Windows, Unix, and Linux. New features in this version include:

  • Free-form XML search
  • Scalable Vector Graphics (SVG) management
  • Distributed database management
  • Integrated, in database XSL transformations
  • Large document (up to one gigabyte) processing

JBrix has posted the first public release of Xybrix, an XML application framework written in Java. It includes a Swing-based non-standard variant of XForms, XML-definable applications with file and window management, unlimited undo/redo of XML mutations, a graphical form designer component, and a speech plugin that enables Xybrix applications to be controlled by voice through the Java Speech API. Xybrix is distributed under the MIT open source license.


Altova's released XML Spy 4.1, a $399 payware XML editor for Windows. Version 4.0 was roundly criticized. I found it to be almost totally non-functional. Hopefully, most of the bugs have been fixed in this release. However, the announcement from Altova focused instead on new features, including support for XSL Formatting Objects (XSL-FO) using FOP, more complete schema support, ODBC and ADO database access, a third party Plug-in architecture, and a new Windows XP look-and-feel. Upgrades are free for 4.0 purchasers, and way too expensive ($89 to $369 depending on version) for purchasers of earlier versions.


X-Hive Corporation has posted an online demo of its XQuery implementation backed by its X-Hive/DB native XML database.


The W3C Cascading Style Sheets (CSS) Working Group has published the CSS Mobile Profile 1.0 Candidate Recommendation. This specification defines a subset of CSS2 tailored to the needs and constraints of mobile devices such as cell phones and Palm Pilots.


The Mind Electric has released Electric XML+ 3.0, a free-beer alternative API/class library for parsing and manipulating XML documents. Electric XML+ is a superset of Electric XML that includes the following additional features:

  • Transparent, bidirectional, Java to XML serialization.
  • Command-line tools for generating Java from XML schemas and XML schemas from Java.
  • An annotated schema system allows default mappings to be overridden without coding.
  • Transactional persistence for storing Java objects as XML documents
Sunday, October 28, 2001

Design Science has released MathType 5.0 for Windows, a $129 payware equation editor plug-in for Microsoft Word. One of its major new features is the ability to convert Microsoft Word documents containing equations into HTML with embedded MathML presentation markup. The equations can also be output as GIF images, if you prefer. A 30-day demo is available.

Saturday, October 27, 2001

The XML Apache Project has released Xerces C++ 1.5.2, an XML parser written in C++ for Windows and various Unixes. The big new feature of this release is a broader subset of XML schema support. There's also support for progressive parsing in SAX2, project files for Borland C++ Builder 5, various bug fixes and performance improvements.

Thursday, October 25, 2001

Microsoft's posted a new beta of their MSXML 4.0 parser which they've renamed "Microsoft XML Core Services". This release is allegedly faster and much more standards compliant. However, there've been some disturbing reports that this parser (or perhaps the installer associated with it) disables some versions of Microsoft Office. Consequently I recommend that you do not upgrade or install this. Microsoft has been informed of the problem, and may have a fix by the time you read this.

Wednesday, October 24, 2001

Bare Bones Software has released version 6.5 of BBEdit, the Macintosh programmer's editor I use to write this site. This release adds CSS support, an automatic language "guessing" feature to determine proper syntax coloring for tags, WML syntax checking, a new Perl compatible GREP engine, and various improvements in MacOS X. However, they don't mention any fix for what is in my opinion the biggest single hole in the current BBEdit: the inability to recognize Unicode text documents. BBEdit is $119 payware. Upgrades are $39, and cross-grades are $79.

Tuesday, October 23, 2001

Sun's Kohsuke Kawaguchi has released the RELAX NG Converter, a Java program that can convert schemas in a variety of languages including DTD, RELAX Core/Namespace, TREX and W3C XML Schema into RELAX NG schemas.

Monday, October 22, 2001

IBM's alphaWorks has released the IBM XSL Formatting Objects Composer (XFC). XFC is a Java program that "implements a substantial portion of XSL Formatting Objects (FO)". It can produce either an interactive on-screen display using Java2D, or an output file using PDF.

Sunday, October 21, 2001

Sun's Kohsuke Kawaguchi has written the XML Instance Generator, a Java program that reads a schema file and produces valid documents. It can produce invalid documents, too. DTD, RELAX NG, RELAX Core/Namespace, TREX and W3C XML Schema are supported.

Saturday, October 20, 2001

David Brownell's posted a beta of SAX 2.0.1. This is a bug fix release, and most of the bugs fixed are in the documentation.

Friday, October 19, 2001

The W3C HTML Working Group has published the fourth public working draft of XML Events, a means of integrating event listeners and handlers with the Document Object Model (DOM) Level 2 event interfaces [DOM2].

Thursday, October 18, 2001

I've posted SAX, Chapter 6 of Processing XML with Java, here on Cafe con Leche. This chapter provides in-depth coverage of the ContentHandler interface, and discusses some common usage patterns for SAX.

As always, all comments are appreciated. Just email them to me at elharo@ibiblio.org. I have two questions in particular about this chapter. First, is the sidebar on SAX in other languages (Python, Perl, C++) accurate? I don't use these languages routinely, so it's entirely possible I've misrepresented the actual status of SAX in these environments. Secondly, is SAXSpider, my example of the Attributes interface, too complex for an example at this level? Would a simpler example (say verifying all the URLs in an XHTML or RDDL document) make a better example of the Attributes interface?
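
For reference, the simpler kind of Attributes example I have in mind might look roughly like the sketch below. It only collects the href attribute values it sees; actually checking that each URL resolves is left out. The HrefCollector class and the sample markup are made up for illustration and are not taken from the chapter.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class HrefCollector extends DefaultHandler {

    private final List<String> hrefs = new ArrayList<>();

    // Called once per start-tag; the Attributes object exposes that tag's attributes.
    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        String href = atts.getValue("href"); // lookup by qualified name
        if (href != null) {
            hrefs.add(href);
        }
    }

    // Parses the given document and returns every href value in document order.
    static List<String> collect(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            HrefCollector handler = new HrefCollector();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.hrefs;
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
            + "<a href='http://www.w3.org/'>W3C</a>"
            + "<p>no link here</p>"
            + "<a href='http://www.ibiblio.org/'>ibiblio</a>"
            + "</body></html>";
        System.out.println(collect(page));
    }
}
```
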

Wednesday, October 17, 2001

Eric S. Raymond has written doclifter, a Python program for converting legacy troff documents to DocBook.


Sun has posted a Schematron add-on for their multi-schema validator. It validates XML documents against RELAX NG schemas annotated with Schematron schemas.


Cisco Systems has submitted Java Specification Request 155 (JSR-155), Web Services Security Assertions to the Java Community Process. The goal is to "provide a set of APIs, exchange patterns & implementations to securely (integrity and confidentiality) exchange assertions between web services based on OASIS SAML." Assertions could include credentials, authentication, authorization, and sessions; and are based on top of XML digital signatures.


IBM's alphaWorks has updated three of its XML tools to fix some bugs in an underlying library and improve Unix support. The updated programs are:

  • Xplorer, a simple Validator program
  • XML Viewer, a GUI for displaying XML documents as trees
  • X-IT, a Java based application for batch processing of XML files.

I don't think there's any new functionality in any of these.

Tuesday, October 16, 2001

The W3C has released the 1.0 Recommendation of XSL Formatting Objects (XSL-FO). XSL-FO is a page description language like PostScript written as an XML application. The normal use scenario is that you write an XSLT stylesheet that transforms an input XML document into an XSL-FO document. The XSL-FO document describes how content should be styled, laid out, and paginated. You would then convert the XSL-FO document into a more convenient format such as PDF, PostScript, or TeX, from whence it could be viewed or printed on paper.
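
The first half of that scenario, XML in and XSL-FO out, can be sketched with the standard javax.xml.transform API. The article and para element names, the deliberately skeletal stylesheet, and the XmlToFoSketch class are assumptions invented for this example. Turning the resulting XSL-FO into PDF would still require a separate formatter such as FOP, which is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlToFoSketch {

    // A skeletal stylesheet: one page master, one flow, one fo:block per para.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/article'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='page'>"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='page'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<xsl:apply-templates select='para'/>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "<xsl:template match='para'>"
      + "<fo:block><xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms the input XML into an XSL-FO document, returned as a string.
    static String toFo(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrap checked transformer exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toFo("<article><para>Hello</para></article>"));
    }
}
```
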

The spec lists the following changes since the Proposed Recommendation:

  1. "visibility" has been added to the properties applicable to fo:table-row, fo:table-header, fo:table-footer, fo:table-body.

  2. "inline-progression-dimension" has been added to the properties applicable to fo:table-cell.

  3. "clip" has been added to the properties applicable to fo:external-graphic and fo:instream-foreign-object.

  4. The initial value of "white-space-treatment" has been changed to "ignore-if-surrounding-linefeed" and the expansion of "white-space" for the values "normal" and "nowrap" has been changed to the same value.

  5. Space-resolution rules modified so that line-area spaces do not merge with other kinds of spaces.

  6. fo:table-cell: an implementation may recover from the border resolution error condition in the "collapse-with-precedence" case by selecting one of the borders.

  7. "border-after-precedence", "border-before-precedence", "border-end-precedence", and "border-start-precedence" have been added to the properties applicable to fo:table-column. The initial values of these properties have been changed to reflect this.

  8. The rectangle used for percentage calculations changed for the "extent" property.
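
To give a feel for what an XSL-FO document contains, here is a minimal sketch that builds one in memory with Python's standard ElementTree module. In practice an XSLT stylesheet would normally produce this structure; building it by hand is just a shortcut for illustration. The element and attribute names are from XSL 1.0, but the helper names fo and build_fo_document, the master name "only", and the page dimensions are all made up for this example.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(text):
    """Build a one-page XSL-FO tree containing a single block of text."""
    root = ET.Element(fo("root"))

    # Page layout: one simple page master with a body region
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "only",      # hypothetical name
        "page-height": "11in",
        "page-width": "8.5in",
    })
    ET.SubElement(page, fo("region-body"))

    # Content: a page sequence that flows one block into the body region
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "only"})
    flow = ET.SubElement(sequence, fo("flow"),
                         {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"), {"font-size": "12pt"})
    block.text = text
    return root


if __name__ == "__main__":
    print(ET.tostring(build_fo_document("Hello, XSL-FO"), encoding="unicode"))
```

A formatter such as FOP would take the serialized result as input and produce the paginated output.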


The XML Protocol Working Group is seeking contributions to the SOAP Version 1.2 Test Collection. The tests are intended to prove (or perhaps disprove) that SOAP 1.2 meets its goal for conformance requirements, and that implementations exist for each of its features.

Monday, October 15, 2001

XInclude.net (whoever that is) has published a GPL'd XInclude implementation for the Microsoft XML Parser. At first glance, it looks roughly equivalent to my XInclude processor for Java; i.e. it can include complete documents but not parts thereof because it does not support XPointer. There are some conformance bugs in the XInclude.net program, but they're not too large.
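
Whole-document inclusion of this kind can be sketched with the ElementInclude module in Python's standard library, which also handles only complete documents and has no XPointer support. This is not the XInclude.net code or my Java processor, just an illustration of the behavior. The in-memory DOCUMENTS table, the loader function, and the file name chapter1.xml are hypothetical stand-ins for real files.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "file system" so the example needs no real files
DOCUMENTS = {
    "chapter1.xml": "<chapter><title>Intro</title></chapter>",
}


def loader(href, parse, encoding=None):
    """Resolve an xi:include href against the DOCUMENTS table."""
    if parse == "xml":
        return ET.fromstring(DOCUMENTS[href])
    raise ValueError("this sketch only handles parse='xml'")


BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
</book>"""


def resolve(source):
    """Parse a document and replace each xi:include with the referenced document."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=loader)
    return root


if __name__ == "__main__":
    print(ET.tostring(resolve(BOOK), encoding="unicode"))
```

After resolution the xi:include element is gone and the root element of chapter1.xml sits in its place.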


The problems I noted with the Macintosh version of Mozilla on Friday (complete failure to display almost all Web pages) appear to have been a conflict with Steve Falkenburg's WebFree, a shareware control panel that filters ads. Disabling it restored Mozilla, and also fixed some problems I had selecting text in Netscape 4.7. Furthermore, Netscape used to fall apart once I opened more than a dozen or so simultaneous windows, and it hasn't done that since I turned off WebFree.

WebFree hasn't been updated in four years. I certainly hope it will get an update soon. Since I started using it, it's blocked over 10,000 ads. When I turned it off I was amazed at how ugly and annoying a lot of my favorite web sites suddenly became. Still, if WebFree is destabilizing my system, then it's got to go. It's time to start looking for alternative ad blocking software. Fortunately Mozilla itself now can block pop-up windows, though not yet inline ads.


I'm happy to note that the W3C has decided to continue work on its new patent policy rather than rushing to a premature conclusion. A second public Last Call for the W3C Patent Policy Framework is planned. Very importantly, free software luminaries Eben Moglen and Bruce Perens are joining the Patent Policy Working Group (PPWG) as invited experts.

Sunday, October 14, 2001

The XML Apache Project has posted a release candidate of FOP 0.20.2, an open source XSL-FO to PDF converter.


jCatalog Software AG has posted a beta of XSLfast, a graphical editor for XSL-Formatting Objects (XSL-FO) documents. XSLfast also supports mail merge and forms processing. The current beta is free-beer, but the release version will cost a yet-to-be-determined price.

Saturday, October 13, 2001

Version 0.9.5 of the open source Mozilla Web browser has been released for the usual batch of platforms (Windows, Linux, MacOS, Solaris, etc.). New features in this release include:

  • The History and Mail & News applications now allow you to reorder columns with drag and drop.

  • Warnings in the JavaScript console now show the text of the offending line.

  • Venkman, the JavaScript Debugger, is now available in complete installer builds.

  • A new experimental Tabbed Browsing feature. Press Ctrl+T to open a new tab.

  • SOCKS proxies (both v4 and v5) can now be used with all protocols except MailNews.

  • A new Site Navigation Bar for navigating sites that use the LINK element (like Bugzilla buglists.) Choose the menu item View | Show/Hide | Site Navigation bar | Show As Only Needed to make the toolbar show up automatically when you visit pages that use the LINK element. (Could this be adapted to work with extended XLinks?)

  • The View Source window now has a context menu with items for Find, Copy, and Select.

Mozilla has been my default browser on Windows and Linux for several months now, with only occasional crashes and a few minor cosmetic glitches, particularly involving lists. However, in 0.9.4 the Mac version was completely non-functional. Hopefully, that's improved in 0.9.5. I'm downloading it now and I'll let you know.

Friday, October 12, 2001

I'm pleased to announce the release of the XML Bible, Gold Edition. At just barely under 1600 pages, this is my largest book yet, and possibly the largest XML book published to date. In fact, I wrote more pages than the printer could glue between two covers, so this edition is also my first English language hardback. Why so big? Because I needed to fit in everything that was in the second edition plus oodles of new and updated material including:

  • RDDL
  • XInclude
  • XML Base
  • Canonical XML
  • Modular XHTML
  • DTD Modularization
  • SMIL
  • CSS Level 3

In addition, the treatment of several existing topics was expanded and/or updated significantly including schemas, XSL Formatting Objects, and the design of XML applications. The XML Bible, Gold Edition has arrived at pretty much all bookstores that carry computer books including Amazon, FatBrain, and Barnes & Noble. The list price is $69.99, though most stores are offering their usual discounts. If you need to special order it, the ISBN number is 0-7645-4819-0, and it's written by me, Elliotte Rusty Harold. I'll get the web pages for the Gold edition online here at Cafe con Leche soon, but in the meantime the sample chapters from the second edition are pretty much what's been printed in the Gold edition. In fact, the updated schemas chapter I've posted here was actually taken from the manuscript for the Gold edition.

The plan is to keep selling both the second and Gold editions simultaneously, which raises the obvious question of which you should buy. On the downside, the Gold edition costs $20 more, and is about 2 pounds heavier. If you're strapped for cash or bookshelf space, you should probably choose the second edition. On the upside, the Gold edition is a little more up-to-date. If you have a particular interest in any of the new topics listed above or schemas, I'd suggest getting the Gold edition. Otherwise the second edition will probably serve you well without inducing a hernia.

For this edition, I focused on adding new material, not on rewriting and updating what had already been written. If you already own the second edition, it's probably not worth upgrading unless you have a particular need for one of the new chapters. It's only been about 5 months since the second edition came out, and pretty much everything in that edition is still current. On the other hand, if you're still using the first edition, it might be time to upgrade. The first edition is now over two years old, and there've been a lot of developments in the XML world since 1999. I'm sorry but there's no special upgrade price. Unlike software, the incremental cost of printing extra copies of a book is non-trivial. However, if you did like the first edition, I think you'll like the Gold edition even more.


Tuesday, October 9, 2001

James Anderson's posted cl-xml 0.915, a collection of Common LISP modules for XML parsing and serialization including a validating, namespace-aware XML parser. There's also XPath and XQuery support.

Monday, October 8, 2001

The W3C XML Protocol Working Group has posted two new working drafts for SOAP, SOAP Version 1.2 Part 1: Messaging Framework and SOAP Version 1.2 Part 2: Adjuncts. Previously this was a single monolithic document. Part 1 describes the SOAP envelope and SOAP transport binding framework. Part 2 describes the SOAP encoding rules, the SOAP remote procedure call (RPC) convention, and a concrete HTTP binding. I haven't read the entire specs yet, but I do notice that SOAP is still using local, unqualified child elements in the fault codes. Could somebody please fix this brain damage before it spreads? Thanks. There's really no reason not to put all SOAP defined elements in the appropriate SOAP namespace.
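
To show what the complaint is about, here is a small sketch that parses a fault message and lists the names of the Fault element's children. It uses the SOAP 1.1 envelope namespace as a stand-in, since SOAP 1.1 has the same unqualified faultcode and faultstring children; the 1.2 drafts use a different namespace URI that is not reproduced here. The FAULT message and the fault_children helper are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace, used here as a stand-in for the 1.2 drafts
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical fault message
FAULT = f"""<env:Envelope xmlns:env="{SOAP_ENV}">
  <env:Body>
    <env:Fault>
      <faultcode>env:Client</faultcode>
      <faultstring>Malformed request</faultstring>
    </env:Fault>
  </env:Body>
</env:Envelope>"""


def fault_children(source):
    """Return the ElementTree tag of each child of the Fault element."""
    root = ET.fromstring(source)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    return [child.tag for child in fault]


if __name__ == "__main__":
    # Namespace-qualified tags would print as {uri}name; these do not
    print(fault_children(FAULT))
```

Envelope, Body, and Fault are all in the SOAP namespace, but the children of Fault are in no namespace at all, so namespace-aware code has to treat them as a special case.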

Sunday, October 7, 2001

Ronan Oger's written SVG.pm, a Perl module for server-side generation of SVG images.

Saturday, October 6, 2001

The W3C HTML working group has posted the first public working draft of the second edition of XHTML 1.0. Like the second edition of XML 1.0, this draft does not propose changing the language in any way (there's already an XHTML 1.1 after all). It merely clarifies and corrects errors in the specification.

Friday, October 5, 2001

The first alpha of the Gnome 2.0 window manager for Linux has been posted. New features include:

  • Full multi-lingual text support and use of Unicode throughout
  • New tree and text widgets
  • Extensive accessibility support
  • A new CORBA ORB featuring smaller stubs and skeletons, plus SSL support
  • A more compliant XML library
  • An XSLT library
  • Improved package configuration

Sun's posted a beta of the Star Office 6.0 word processor/spreadsheet/presentation/drawing suite. The native file format for this package is XML.

The open source Open Office equivalent for this release is OpenOffice.org Build 638c. This build offers greatly improved stability, English spell checking and thesaurus, and full online help.

Thursday, October 4, 2001

Version 1.1.3 of the Crimson validating XML parser for Java has been released. This version fixes assorted bugs.

Tuesday, October 2, 2001

I'm at XMLOne in San Jose this week. I've posted the notes for my presentations here including:

Monday, October 1, 2001

Software Development 2002 West is being held April 22-26 in San Jose, California, and I have once again signed on to chair the XML track. The call for papers is open to anybody who'd like to participate in the XML or any other track.

The audience at this show consists primarily of working programmers with a bias toward Java and C++, mostly on the Windows platform though Unix/Linux is well represented as well. At East this year, I noticed a lot of interest in .Net; and there's a .Net track at West as well, so we're definitely interested in presentations about XML in the .Net environment (and other things in .Net as well, though I'm not personally responsible for that). Web services is a big focus as well, though again that goes in a separate track that I'm not responsible for.

When it comes to XML, we find that our attendees really like introductory tutorial sessions. For example a tutorial on the W3C XML Schema Language that covered the basics and assumed no prior experience with schemas would be useful. An advanced session on schema best practices, however, would probably not be of interest to most of our attendees and might be better saved for one of the XML-specific conferences like XMLOne or XML 2001. At this stage it is OK to assume basic knowledge of XML, well-formedness, and DTDs. You do not need to begin your talks with "this is what a tag is", "this is what an attribute is", etc.

Most sessions at SD are 90 minutes. We also have room for a few full-day tutorials. However, for these we prefer speakers who have a good track record presenting at previous SD conferences, so if this is your first time with us, it would be better to submit proposals for 90-minute seminars. We're also accepting proposals for BOFs, panels, and similar events. If you're interested, please fill out the form at http://www.sdexpo.com/2002/west/speakers/abstracts.htm If you have any questions, send them to me in e-mail. Thanks!

Sunday, September 30, 2001

The W3C P3P Specification Working Group has returned the Platform for Privacy Preferences 1.0 (P3P1.0) Specification to Last Call working draft due to substantive changes made during the candidate recommendation period. The new last call ends October 15. P3P allows Web sites to express their privacy practices in a standard XML format that can be retrieved automatically and easily interpreted by browsers.

Saturday, September 29, 2001

I'm flying to San Jose tomorrow for the XMLOne conference so updates are likely to be sporadic until Friday. However, San Jose isn't nearly as interesting a town as Amsterdam so unlike last week I should find some time to check my e-mail and update my sites. :-)


The W3C Document Object Model working group has posted an updated working draft of the DOM Level 3 Core Specification. I didn't notice any major changes in this draft compared to the last one, but so far I've only skimmed it.


The Resource Description Framework (RDF) Core Working Group has posted the first public working draft of RDF Test Cases, "a set of machine-processable test cases corresponding to technical issues addressed by the WG. This document describes the test cases that will fullfill (when the test cases are completed) that deliverable but it does not contain the test cases themselves. The test case are available at http://www.w3.org/2000/10/rdf-tests/rdfcore/."

Friday, September 28, 2001

The W3C Internationalization Working Group has pushed the Character Model for the World Wide Web back from Last Call to simple working draft status. This document "provides authors of specifications, software developers, and content developers a common reference for interoperable text manipulation on the World Wide Web. Topics addressed include encoding identification, early uniform normalization, string identity matching, string indexing, and URI conventions, building on the Universal Character Set, defined jointly by Unicode and ISO/IEC 10646. Some introductory material on characters and character encodings is also provided."


The RDF Core Working Group has published the first public working draft of RDF Model Theory. According to the abstract:

This is a specification of a model-theoretic semantics for RDF and RDFS, and some basic results on entailment. It does not cover reification or special meanings associated with the use of RDF containers. This document was written with the intention of providing a precise semantic theory for RDF and RDFS, and to sharpen the notions of consequence and inference in RDF. It reflects the current understanding of the RDF Core working group at the time of writing. In some particulars this differs from the account given in Resource Description Framework (RDF) Model and Syntax Specification, and these exceptions are noted.

Honestly, I didn't understand half of this document, which is full of references to things like the "Strong Herbrand Lemma" and "N-triples syntax". Still, if you're a mathematician who specializes in graph theory, you might find this amusing. The word of the day is "skolemization". Just don't ask me to explain what it means. :-)


The W3C XML Schema Working Group has published an updated working draft of XML Schema: Formal Description. According to section 1,

This formalization is a formal, declarative system for describing and naming XML Schema information, specifying XML instance type information, and validating instances against schemas. The goals of the formalization are to:

  • Provide a semantic framework for software systems that use the W3C XML Schema specification, such as the W3C XML Query Algebra.
  • Specify names for all components of an XML Schema, so that they can be uniquely identified by URIs. Such unique identifiers may be useful to XML Query, RDF, and topic maps, among others.
  • Formally define validation at a declarative level.
  • Define the mapping from the current XML Schema syntax onto the structures described here, as well as the mapping between the XML Schema component model and our component model.

If you're of the opinion that this might have been a good thing to do before the schema specification was finished, well, let's just say you're not alone.

Thursday, September 27, 2001

Hewlett-Packard's FOA is a GUI XSL-FO authoring tool written in Java that assists with pagination and page sequences, and creates the transformation elements needed to convert multiple XML content files into XSL-FO. FOA generates an XSLT stylesheet that reads multiple XML documents and applies the style from multiple Attribute Set files according to the transformation elements that you have defined.


jfor is an open-source XSL-FO to RTF converter written in Java. The current version is 0.5. As with all such tools available today, support for all parts of the XSL-FO spec is incomplete. Currently, jfor has limited support of blocks, inline elements, lists, tables and images.


James Clark has released Jing, an open source validator for the RELAX NG schema language. Jing is written in Java and supports the August 11, 2001 draft of the RELAX NG Specification.


Daisuke Okajima's released RelaxNGCC 0.3, a tool for generating Java source code from a given RELAX NG grammar. By embedding code fragments in the grammar, you can take appropriate actions while parsing valid XML documents against the grammar. This approach is similar to yacc, bison, or JavaCC.

Wednesday, September 26, 2001

I've posted version 1.0d8 of my XInclude processors for DOM, SAX, and JDOM. The only API change is that the XIncluder class for JDOM is now named JDOMXIncluder to parallel SAXXIncluder and DOMXIncluder. It was originally just called XIncluder because it was the first one I wrote, but I'll probably use that name for a generic driver that uses a command line argument to select which API to use. Internally, the major change in this release is that I've now implemented substantially better heuristics for correctly detecting the encoding of text files included with parse="text".
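
For readers wondering what such a heuristic might involve, here is one simple possibility sketched in Python. It is not the code in my XInclude processors, and it does not claim to follow the precedence order the XInclude draft prescribes. It assumes only that a byte order mark, when present, is trusted first, then any declared encoding, then UTF-8. The names BOMS, guess_encoding, and decode_text are made up for this example.

```python
import codecs

# Byte order marks and the Python codec that consumes each one.
# UTF-32 must be checked before UTF-16 because the UTF-32 little-endian
# BOM begins with the same two bytes as the UTF-16 little-endian BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, declared=None, default="utf-8"):
    """Guess the encoding of raw bytes: BOM first, then declared, then default."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return declared or default


def decode_text(data, declared=None):
    """Decode included text using the guessed encoding, dropping any BOM."""
    return data.decode(guess_encoding(data, declared))


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_BE + "included text".encode("utf-16-be")
    print(guess_encoding(sample))
    print(decode_text(sample))
```

Here the declared argument plays the role of the encoding attribute on an xi:include element with parse="text".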

Tuesday, September 25, 2001

The dbXML project has posted the second beta release of the dbXML Core XML database, an open source native XML database designed to manage large collections of small XML documents. The server supports XPath queries and provides an implementation of the XML:DB XML Database API for development of client applications. This beta fixes various bugs.


The W3C Device Independence Activity has published its first public working draft, Device Independence Principles. According to the abstract, "This document celebrates the vision of a device independent Web. It describes device independence principles that can lead towards the achievement of greater device independence for Web content and applications." The goal is to allow web pages to be presented not just to different Web browsers, but to telephones, PDAs, kiosks, digital paper, fax machines, televisions, and other radically different hardware.


The W3C/IETF joint URI Planning Interest Group has published a note on URIs, URLs, and URNs: Clarifications and Recommendations. According to the abstract:

This paper addresses and attempts to clarify two issues pertaining to URIs, and presents recommendations. Section 1 addresses how URI space is partitioned and the relationship between URIs, URLs, and URNs. Section 2 describes how URI schemes and URN namespace ids are registered. Section 3 mentions additional unresolved issues not considered by this paper and section 4 presents recommendations.

It makes interesting reading. I recommend it for anyone who's ever been confused about the difference between URLs, URIs, and URNs, and exactly what makes a legal URI anyway.


The W3C CSS Working Group has published the first public working draft of the Backgrounds module for CSS Level 3. Properties in this document would be used to specify background colors, images, and so forth. New properties added since CSS2 include background-size, background-clip, and background-origin.


The W3C P3P Working Group has published the last call working draft of the Platform for Privacy Preferences 1.0 Specification. P3P is an XML application browsers and servers can use to exchange information about web site privacy policies and user choices for what information they're willing to share according to what preferences. Comments are due by October 15.


The W3C User Agent Accessibility Guidelines Working Group has posted the candidate recommendation of User Agent Accessibility Guidelines 1.0. According to the abstract:

This document provides guidelines for designing user agents that lower barriers to Web accessibility for people with disabilities (visual, hearing, physical, and cognitive). User agents include HTML browsers and other types of software that retrieve and render Web content. A user agent that conforms to these guidelines will promote accessibility through its own user interface and through other internal facilities, including its ability to communicate with other technologies (especially assistive technologies). Furthermore, all users, not just users with disabilities, are expected to find conforming user agents to be more usable.

In addition to helping developers of HTML browsers, media players, etc., this document will also benefit developers of assistive technologies because it explains what types of information and control an assistive technology may expect from a conforming user agent. Technologies not addressed directly by this document (e.g., technologies for braille rendering) will be essential to ensuring Web access for some users with disabilities.


Sun's posted the proposed final draft (version 0.94) of the Java API for XML Messaging Specification. As usual, it's PDF format only. JAXM implements the Simple Object Access Protocol (SOAP) 1.1 with Attachments.


Apropos Toy & Tool Development has released RoustaboutXT 2.0, a $300 payware Quark XPress XTension that imports and exports XML from Quark. This release adds full import/export capabilities with support for anchored text and pictures; and a hitch feature to allow the XTension to be used as a traditional text filter for use with AppleScript and batching XTensions. Upgrades from version 1.0 are free.

Apropos has also released version 2.0 of XPress XML, a $50 payware Quark XTension that also imports and exports XML from Quark but is much less customizable. Version 2.0 also adds the "Hitch" feature.

Monday, September 24, 2001

I've returned from XMLOne Amsterdam. Several speakers couldn't make it due to the attacks on September 11. Consequently those of us who did get there ended up talking more than we had planned. I had planned to talk for three hours total, and ended up tripling that. Talks I gave included:

The last one was written by Wendell Piez of Mulberry Technologies. I just delivered the lecture from Piez's notes because he couldn't make it to the show. You'll have to ask him if you want a copy of the notes.


The Apache Cocoon team has posted the first release candidate of the Apache Cocoon 2.0 XML publishing framework. Version 2.0 is supposed to "remove all those design constraints that emerged from the Apache Cocoon 1 experience."


Michael Kay's released version 6.4.4 of his Saxon XSLT processor. This release fixes 13 bugs, updates Saxon to work with FOP 0.20.1 and JDOM 0.7, and adds a few performance optimizations that "will give a very substantial speed-up to a rather small number of stylesheets." Saxon is written in Java, but a Windows executable version is also available.


The W3C XML Core working group has published a new working draft of the XML Blueberry Requirements. This document proposes a new, incompatible version of XML that allows characters introduced in Unicode 3.0 and later to be used in element and attribute names. (They can already be used in character data.) Furthermore, it intends to allow the use of certain mainframe line ending control characters in tag white space as well.

There are no significant changes in this draft. The real purpose of this draft seems to be to express more clearly the working group's motivation that "Discriminating against languages simply because their scripts were not encoded in Unicode 2.0 is inherently unjust." The working group continues to ignore the costs this will impose on the existing XML community, and indeed considers any cost-benefit analysis to be immoral.

Numerous objections that were raised against the earlier draft set of requirements remain unanswered, including:

  • Native language markup is not considered as a separate issue from supporting non-standard mainframe line-ending conventions.
  • There's no requirement specifying that they should make only the minimum changes necessary to meet their goals.
  • There's no requirement specifying that every effort should be made to ensure that any document that does not need to use Blueberry does not use Blueberry.

The working group is pretending that this is 1995, that nobody has any investment in the existing infrastructure, and that anything less than the best solution imaginable today in 2001 is unjust and unacceptable. But it's not 1995. People do have huge investments in the existing XML syntax. Changing it now has real costs that will harm real people. The working group refuses to even consider these costs or to tally up the benefits expected and compare them to the costs. They have decided that this will be done and that no justification is necessary.


The XML Linking Working Group has published a new Candidate Recommendation of XPointer. At first glance, the XPointer syntax and semantics do not seem to have changed. However, the specification document has been cleaned up, and some points have been clarified, though you still have to read between the lines to figure out how to write an XPointer that selects a point.

Sunday, September 16, 2001

Theodore H. Smith has released ElfData 1.11 for MacOS, a $55 payware XML editor. This release mostly fixes bugs and improves performance and appearance. This release also adds syntax coloring.


Tony Graham's posted revision 0.2 of xslide, an Emacs major mode for editing XSL stylesheets. Features of xslide revision 0.2 include:

  • XSL customization group for setting some variables
  • Initial stylesheet inserted into empty XSL buffers
  • "Template" menu for jumping to template rules, named templates, key declarations, and attribute-set declarations in the buffer
  • xsl-process function that runs an XSL processor and collects the output
  • Predefined command line templates and error regexps for Java and Windows executable versions of both XT and Saxon
  • Font lock highlighting so that the important information stands out
  • FO font lock highlighting updated to match XSL PR
  • xsl-complete function for inserting element and attribute names
  • xsl-insert-tag function for inserting matching start- and end-tags
  • Automatic completion of end-tags
  • Automatic indenting of elements with user-definable indentation step
  • Comprehensive abbreviations table to further ease typing

Saturday, September 15, 2001

I've posted Reading XML, Chapter 5 of Processing XML with Java. This chapter is a broad overview of the various parsers and APIs available for processing XML documents. It provides examples of DOM, SAX, JDOM, dom4j, and ElectricXML and discusses the relative advantages and disadvantages of each. SAX and DOM will definitely be covered in more detail in future chapters.

The coverage of dom4j and ElectricXML was directly inspired by comments from Cafe con Leche readers I received on earlier chapters. After exploring these APIs in more detail, I found some major design flaws in ElectricXML so it probably won't be covered past this chapter. dom4j actually looks pretty well designed though, and may be the best of the non-standard APIs. I need to try it on a tougher problem like my XInclude processor and see how it holds up. I'm also debating whether I should add a section to this chapter on kXML, a parser that doesn't implement any of the standard APIs but will run in J2ME environments.


Mozilla 0.9.4 has been released with fixes for 1,467 bugs. As usual it's available on Windows, MacOS, OpenVMS, Solaris 8 and Linux. New features include the ability to disable the JavaScript window.open() method during page load and unload events to eliminate those annoying pop-up and pop-under ads. (This should be turned on by default!) I found 0.9.3 to be significantly less stable than earlier releases. Hopefully, this release fixes the problems I was experiencing.


Delta tells me that my flight to Amsterdam is still scheduled to take off on time tomorrow, so I guess I'm going to go to the airport and hope. If the FAA does let it leave the ground, I'll see some of you at XMLOne next week.

Friday, September 14, 2001

Slashdot's posted a nice review of XML in a Nutshell. I'm going to try to participate in the discussion, but the general slowness of the net today is making it hard.


Wolfgang Meier of the Darmstadt University of Technology has posted version 0.6 of eXist, an open source native XML Database with pluggable storage backends and support for fulltext searching. This is mainly a bug fix release. However, there are some new features including client side support for the XML:DB XML Database API. More volunteer developers are needed.


My Speakeasy DSL went down yesterday afternoon as a side effect of Tuesday's attacks. Several other ISPs in the area are affected as well. Apparently a backup generator overheated, and nobody can get to the site to repair it. For the moment I'm on a dialup connection. To complicate matters, I'm scheduled to leave Sunday for the XMLOne conference in Amsterdam. I don't know if I'm going to be able to get out or not; but whether I do or not updates are likely to be a little sporadic here for the next week or so.

Also, if there's anybody out there who's planning to be in Amsterdam next week and would be willing to cover my schemas talk for me if I can't make it, please drop me a line. I'd be happy to share my notes with you. It's a pretty standard, two hour intro to schemas talk. I'm also scheduled to deliver a session on Cutting Edge XML Programming, but that's a much more idiosyncratic session wandering all over the map including SAX 2.1, DOM3, XPath 2.0, XSLT 2.0, and XQuery.



Fourthought, Inc. has released version 0.11.1 of 4Suite and 4Suite Server, open source libraries for processing XML in Python. 4Suite "provides support for XML parsing, several transient and persistent DOM implementations, XPath expressions, XPointer, XSLT transforms, XLink, RDF, XInclude, XUpdate and ODMG object databases." 4Suite Server "features an XML data repository, metadata management, a rules-based engine, XSLT transforms, XPath and RDF-based indexing and query, XLink resolution and many other XML services. It also provides transactions and access control features. Along with basic console and command-line management, it supports remote, cross-platform and cross-language access through CORBA, WebDAV, HTTP, FTP and other request protocols."

Thursday, September 13, 2001

Altova's released version 4.0 of XML Spy, their popular $399 payware XML editor. New features include:

  • Improved support for the final XML Schema Recommendation including fixed, redefine, abstract, nillable, and xsi:nil
  • Automatic conversion of schemas from the April 7 working draft or October 24 Candidate Recommendation to the final version
  • Conversion of DTDs, XML-Data, and BizTalk schemas to the W3C XML Schema Language
  • Complete support for Undo in the XSLT Designer
  • More online help
  • A new license manager
  • The XML Spy 4.0 Document Framework that "provides the customer with a highly user-friendly interface - very much like a typical word processor - that allows for true XML content editing and creation."
  • The XML Spy 4.0 XSLT Designer, a graphical XSLT stylesheet creation tool that enables the customization of the document editor by defining an XSLT Stylesheet and additional editing-specific options based upon the underlying DTD or XML Schema for use during the content creation or editing process.

I tested the last public beta of this release and was not impressed. It was agonizingly slow when loading and saving files, probably because it insisted on validating before it did either one. Neither the Document Framework nor the XSLT Designer actually worked on the Docbook documents and stylesheets I tested them with, and I've seen reports that this is still true in the final release version. If you're using XML Spy 3.5, I recommend you don't upgrade until at least the next bug fix release. If you're not using XML Spy, I recommend you stick with whatever you are using until at least the next major release.


Wednesday, September 12, 2001

I'm still feeling a little shocked from the bombings yesterday. The magnitude of this is still seeping in. Personally, I feel quite fortunate. So far all my friends, family, and colleagues seem to have come through OK. There's still one person I need to check on. (Update: I got in touch with her this afternoon and she's fine.) For those of you who aren't New Yorkers, just know that a lot of people worked in the towers at all levels from CEO to janitor. Everyone in New York knows someone who worked there, often many people. This attack really cuts across all levels of the city. As somebody (I forget who. It might have been Giuliani.) said on the news last night, this attack did not single out whites or blacks or Jews or Arabs or Christians or Muslims or Chinese or police officers or civilians or citizens or foreign nationals or immigrants or any other group. This was an attack on all of us.

This has been the main topic of discussion on many of the mailing lists I participate in, with topics ranging from XML to Unicode to computer book publishing. None of these are political lists. Mostly the list moms seem to be willing to let the discussion flow without worrying too much about how off-topic it is. That's important in times like this. Sentiment seems to be divided about 50-50 between the "Let's get revenge and bomb somebody, even if we aren't sure exactly who" and "Let's focus on catching the criminals, and try to understand why this happened." Private e-mail about my comments yesterday is about equally divided.

I wish I could say that the television media showed an equal diversity of opinion, but I'm afraid that's not true. CNN, CBS, and the local stations would only interview the same white war mongers and law enforcement agents who kept insisting that we needed to start bombing people before we even knew who was responsible, that we needed to repeal the restriction against assassination (James Baker), and that we were going to have to give up our civil liberties. Dissenting voices like Noam Chomsky, Howard Zinn, or Norman Siegel were even more absent from the air waves than they normally are.

I finally found some intelligent rational discussion on Channel 54, BET. They were interviewing Al Sharpton, Jesse Jackson Sr., and others who had a much more cautious view. It was the only place in the mass media where I saw any recognition that there were likely reasons this attack occurred, and that we needed to address the root causes as much as the symptoms. It was one of the few places on TV I saw any concern for avoiding civilian casualties in the inevitable retaliation for this attack. The white mass media likes to portray people like Sharpton and Jackson as fringe nutcases. What was apparent last night was that the fringe is a lot more reasoned, informed, and honest than the mainstream. If either of them runs for anything again, they've got my vote. We need more people like this.

I promise I'll get back to Java and XML tomorrow. Right now I'm just having a little trouble focussing on such things.

Tuesday, September 11, 2001

I'm continuing to watch the spin about the attacks today. The level of racism and Anti-Arab hysteria seems to be rising slowly through the day, though there's still no evidence of the race or nationality of the criminals. And of course even if it does turn out that the terrorists were in fact Arabs, that's no justification for retaliatory attacks against Arab countries and civilians.

Many people are calling this an act of war and comparing it to Pearl Harbor. In reality, this isn't even close. Pearl Harbor was a deliberate attack by the military forces of a hostile nation. In all likelihood, no nation and no government was behind this assault. That of course, is what terrorism means: a violent assault by non-government individuals and groups for political purposes. When nations commit violent assaults against civilian populations, it's not called terrorism even though the effects are much worse.

I'm also hearing a lot of talking heads from the FBI and other law enforcement agencies blaming this attack on the openness of our society, our rights, and our civil liberties. Of course the obvious implication is that we need to reduce these and unleash law enforcement to combat attacks like today's. In truth, of course, a police state would have done nothing to prevent today's attacks or one in the future.

I've also watched pictures on the news of Palestinians celebrating in the street. A few people have expressed concern that Americans are hated around the world, and that consequently we need to beef up our defenses. Nobody seems willing to ask the question of why we're so hated that school children are celebrating the violent murder of thousands of people. The answer, of course, is that while what happened today is completely new to the American experience, it's not at all new in many other places around the globe. In Palestine, in particular, the people have been brutally assaulted and murdered by terrorist planes for decades. Many of those planes are American made and/or paid for with American dollars. The attacks today may or may not have been retribution for the decades of oppression of Palestinian people, but we should not be particularly surprised that the oppressed take a little glee in a successful attack on a previously invulnerable oppressor.

The real solution cannot be found in retaliation. Violence only begets more violence. While I hope that any living collaborators in this attack are captured and severely punished, I see no point in going to war against Palestine, Afghanistan, or anyone else. Terrorist actions of this nature are inevitable as long as the United States persists in its decades long suppression of legitimate aspirations for self-rule around the world. This is nothing new. The U.S. actively backed repressive client states in Greece, in Italy, in Guatemala, in Nicaragua, in El Salvador, in Vietnam, in the Philippines, in Iraq, in Iran, in Palestine, and many other places. The only thing that surprises me is that it's taken this long for responses on our home soil to reach this magnitude. Until the U.S. is willing to honestly address why we're hated, no security measures will be sufficient.


Update: after a few hours of watching the coverage on the news and listening to people on various mailing lists, mostly New York local, it seems increasingly probable that we're going to go to war with somebody even if the country we're attacking had nothing to do with the attack. The most likely target of opportunity seems to be Afghanistan. Based on past experience, there probably won't be any real war, just a bunch of planes dropping bombs from a couple of miles up. This would solve nothing. That a few idiots without government support killed possibly hundreds of American civilians is not an excuse for us to retaliate by killing several hundred civilians in their country who had equally little to do with this.


I'm sitting here in New York switching between CNN and local channels. I just saw the second tower of the World Trade Center go down. This is truly horrible. I know many people who work there. I'm hoping they're alright. I think there was a Sun office there, although I don't remember off the top of my head whether it was in one of the towers or one of the three smaller buildings. It might even have been a local Sun reseller. One of the local user groups had meetings there, though I stopped attending after the security screening refused me admission one night because I wasn't on the list of attendees. Either way, I hope everyone got out. The attack happened relatively early in the morning on an election day here in New York so maybe the buildings weren't as full as they might have been. We can only hope. (I just heard the election has been postponed.)

So far the coverage I've seen has been relatively restrained. Oklahoma City taught most networks that they couldn't automatically blame the Arabs for every terrorist attack. Maybe some politicians learned their lessons too and we won't hear the calls for immediate retaliation against some target before we even know who's responsible for this horror.

Longer term there are going to be a lot of calls for higher security measures at airports, restricting travel from unfriendly nations, more surveillance cameras, and various other Big Brother measures. It's important to remember that none of these did any good today. After the last bombing, the World Trade Center had some of the tightest security of any general office building in New York; and it didn't help. We don't know yet what happened at the airports where the planes left, but all the requirements that people show ID and go through various security checkpoints didn't stop this either.

Whoever was immediately responsible for this atrocity is undoubtedly dead along with any passengers who may have been on the planes. We may never know their names, and we certainly won't be able to exact vengeance on them. I'm also afraid though that somebody's going to be brought to trial for this, whether they're actually responsible or not. If we can indeed find the murderers responsible for this act, then they deserve to be put in prison for the rest of their natural lives. However, I'm very worried that if the government can't find the people responsible, then they'll find somebody anyway, no matter how tangential their connection to the actual bombers. We have to be wary of show trials designed to do nothing more than assuage our desire to make someone pay.

Monday, September 10, 2001

The W3C HTML Activity has posted a new working draft of the XForms 1.0 specification. This describes the "next generation of web forms". Today's HTML forms don't distinguish between the purpose and the presentation of a form. XForms, by contrast, have separate sections that describe what the form does and how the form looks, thus making XForms more suitable for a broad range of media from web browsers to cell phones. The XForms themselves and the data collected in an XForm and returned to a server are all written in XML. An XForms Submit Protocol defines how XForms send and receive data, including the ability to suspend and resume the completion of a form. The changes in this draft are too numerous to list here, but are nicely summarized in an appendix to this draft.


The W3C Web Accessibility Initiative Protocols and Formats Working Group has published a working draft of XML Accessibility Guidelines. From the abstract:

This document explains how to design accessible applications using XML, the Extensible Markup Language. Compared to the HTML or MathML languages, XML is one level up: it is a meta syntax used to describe these languages, as well as new ones. As a meta syntax, XML provides no intrinsic guarantee of device independence or textual alternate support. It is essential, therefore, that XML formats and tools designers are provided with guidelines that explain how to include basic accessibility features - such as those present in HTML, SMIL, and SVG - in all their new developments.

The W3C has promoted SMIL Animation to full Recommendation. According to the abstract, this specifies an "animation functionality for XML documents. It describes an animation framework as well as a set of base XML animation elements suitable for integration with XML documents. It is based upon the SMIL 1.0 timing model, with some extensions, and is a true subset of SMIL 2.0. This provides an intermediate stepping stone in terms of implementation complexity, for applications that wish to have SMIL-compatible animation but do not need or want time containers."


The Resource Description Framework (RDF) Core Working Group has posted the first public Working Draft of Refactoring RDF/XML Syntax. The document records the process of updating the grammar in the Resource Description Framework (RDF) Model and Syntax Specification, showing the changes step-by-step.


xmlsecurity.org has published an open source implementation of Canonical XML in Java. It has also started work on an implementation of XML Signature. Currently only source code is available. You'll have to compile the package yourself.

Sunday, September 9, 2001

Norm Walsh has released version 1.44 of his XSLT stylesheets for Docbook, which I use to generate both the HTML and PDF versions of Processing XML with Java. This release fixes various bugs including some nasty ones involving the handling of dingbat characters such as curly quotes.


Keith Isdale's released xsldebugger 0.4, a GPL'd gdb-like tool to debug XSLT stylesheets built on top of libxslt. It currently runs on Windows.

Saturday, September 8, 2001

The W3C has posted the final recommendation of the Scalable Vector Graphics (SVG) 1.0 Specification. Changes since the proposed recommendation appear mostly editorial.

Adobe's posted the first beta of version 3.0 of their SVG Viewer browser plug-in for Netscape and Internet Explorer on Mac and Windows. This release should be closer to full conformance with the 1.0 Recommendation. Newly supported parts of SVG include:

  • color-profile, marker, title, view, and switch elements
  • The color-profile, marker, marker-end, marker-mid, marker-start CSS properties, the @media CSS rule, and the media attribute for style elements. The values all, screen, and print are supported.
  • The SVG image element now supports links to static SVG files
  • The ElementTimeControl::beginElement, ElementTimeControl::endElement, getCTM, getElementsByTagNameNS, and getBBox methods and the SVGMatrix class from the SVG DOM.

Other new features include:

  • A built-in script engine that allows self-contained JavaScript scripts to run inside of SVG files embedded in hosts which don't support a bridge between plug-ins and the host script engine, including Internet Explorer on the Mac.
  • The Windows ActiveX control can now be invoked in a transparent mode which allows you to overlay SVG on top of other web page content.
  • The Windows ActiveX control supports Binary Behaviors.
  • Animations continue to run even when the browser is in the background.
  • Can save as compressed SVG (.svgz)
  • Native plug-ins for MacOS 10.1 and Windows XP
  • The ICC color management component has been unbundled.
  • Windows preferences are stored in the registry

in-GmbH's Sphinx SVG is a $253.58 drawing program that can export (but not import) SVG graphics.

Friday, September 7, 2001

Altova's posted a public beta of XML Spy 4.0, a payware XML editor. Registration is required to download the beta and beta testers should not expect a free copy when the product is released. New features include:

  • Improved support for the final XML Schema Recommendation including fixed, redefine, abstract, nillable, and xsi:nil
  • Automatic conversion of schemas from the April 7 working draft or October 24 Candidate Recommendation to the final version
  • Conversion of DTDs, XML-Data, and BizTalk schemas to the W3C XML Schema Language
  • Complete support for Undo in the XSLT Designer
  • More online help
  • A new license manager
  • The XML Spy 4.0 Document Framework that "provides the customer with a highly user-friendly interface - very much like a typical word processor - that allows for true XML content editing and creation."
  • The XML Spy 4.0 XSLT Designer, a graphical XSLT stylesheet creation tool that enables the customization of the document editor by defining an XSLT Stylesheet and additional editing-specific options based upon the underlying DTD or XML Schema for use during the content creation or editing process.

This is the first version of XML Spy I've tried in a while, and overall I was not impressed. It was agonizingly slow when loading and saving files, probably because it insisted on validating before it did either one. Neither the Document Framework nor the XSLT Designer actually worked on the Docbook documents and stylesheets I tested them with. Maybe they'd work with simpler documents. Perhaps this will get fixed in the next release, but XML Spy still seems substantially more difficult to use than a simple text editor like jEdit.


Speaking of jEdit, version 0.4 of the jEdit XML plug-in has been released. New features in this release include support for catalog files in the XML Catalog or OASIS SOCAT formats and a new "XML Insert" window that lists elements and entities declared in the DTD. jEdit 3.2.1 is required. Check your jEdit Plug-Ins dialog to install.


James Tauber and Daniel Krech have released Redfoot 1.0, "an open source (BSD-license) framework for building distributed data-driven web applications with RDF and Python." Redfoot includes:

  • An in-memory RDF database
  • An RDF parser and serializer
  • A query API
  • A templating language for building RDF-driven web-sites
  • A reusable module architecture and sample modules (including schema-driven RDF editor module and RSS viewer module)
  • The beginnings of a peer-to-peer architecture for communication between Redfeet

The W3C XML Schema Working Group has released the W3C XML Schema Test Collection. Both positive and negative expected outcomes are tested with respect to a range of core XML Schema features. More tests are forthcoming and more are desired.


Fabrice Desre has posted version 0.2 of XSLTDoc, an XSLT stylesheet documentation generator. XSLTDoc is an XSLT stylesheet that analyzes another stylesheet and builds clean documentation for it while doing some lint-style sanity checks. New features in version 0.2 include:

  • Saxon is supported
  • Semantic checks for stylesheet version and correct mode usage when applying templates
  • Duplicate variable bindings in a template are detected and reported.
  • Usage of anything else than xsl:with-param in xsl:call-template is reported.
Thursday, September 6, 2001

I've now successfully installed Internet Explorer 6 on my NT box and have been able to check out its support for XML (or lack thereof). I've also received a number of reports from readers.

First the good news: IE6 does seem to support XSLT 1.0. I haven't run it through extensive testing but it did correctly render all the simple XML + XSLT 1.0 examples I took from the XML Bible 2nd edition. It does recognize the http://www.w3.org/1999/XSL/Transform namespace. This is about two years too late, but better late than never.

Now for the bad news (and there's a lot of it):

  • At first glance, CSS support for XML does not seem to be significantly improved. For instance, display: table is still not supported.

  • Microsoft still labors under the illusion that there is a MIME media type text/xsl. IE does not recognize the actual MIME types text/xml and application/xml+xslt as identifying XSL stylesheets.

  • The XML parser built into IE is thoroughly broken. It accepts some malformed documents as well-formed. It rejects many real-world well-formed and even valid XML documents as malformed, most embarrassingly the first edition of the XML 1.0 specification itself.

Bottom line: any doubt that the IE team at Microsoft actually cares about standards has been erased. More than three years since XML 1.0 was released and almost two years after XSLT 1.0 was released, IE still does not correctly implement these specifications. Even though the XML parser group at Microsoft provided the IE group with a relatively standards conformant XML parser/XSLT processor, the IE programmers deliberately chose to cripple it rather than support standard XML!

The excuses that the IE team simply made a mistake in interpreting the spec, or that their software shipped before the specs were finished are no longer tenable. The only reasonable interpretation of Microsoft's actions is that the IE developers believe it's more important to maintain compatibility with broken, beta, Microsoft proprietary experiments than to support proven, reliable, standard specifications. They do not exist in a culture that rewards compliance with specifications. At most they care about compatibility with other Microsoft software. They are simply not willing to expend any effort to improve compatibility with the rest of the world. In the future, one must assume that Microsoft will implement only those parts of the XML specifications that they like, and that they will change, modify, extend, and break those parts of the specs they don't like. Conformance to standards is simply not a virtue in the Microsoft world.

Wednesday, September 5, 2001

The W3C has formed the Web Ontology Working Group, part of the Semantic Web Activity, to develop a language that extends

the semantic reach of current XML and RDF meta-data efforts. In particular, in a recent talk on the Semantic Web, Tim Berners-Lee, Director of the W3C, outlined the necessary layers for developing applications that depend on an understanding of logical content, not just human-readable presentation. This working group will focus on building the ontological layer and the formal underpinnings thereof.

Such language layers are crucial to the emerging Semantic Web, as they allow the explicit representation of term vocabularies and the relationships between entities in these vocabularies. In this way, they go beyond XML, RDF and RDF-S in allowing greater machine readable content on the web. A further necessity is for such languages to be based on a clear semantics (denotational and/or axiomatic) to allow tool developers and language designers to unambiguously specify the expected meaning of the semantic content when rendered in the Web Ontology syntax.

Specifically, the Web Ontology Working Group is chartered to design the following component:

  • A Web ontology language, that builds on current Web languages that allow the specification of classes and subclasses, properties and subproperties (such as RDFS), but which extends these constructs to allow more complex relationships between entities including: means to limit the properties of classes with respect to number and type, means to infer that items with various properties are members of a particular class, a well-defined model of property inheritance, and similar semantic extensions to the base languages.

    The March 2001 DAML+OIL specification, discussed in some detail in section 1.1 below serves as an example of an ontology language - a comparison of DAML+OIL to XML, XML-schema, and RDF-Schema is available.


Sun's posted the second beta of the Java API for XML Messaging (JAXM). JAXM implements the Simple Object Access Protocol (SOAP) 1.1 with Attachments. This version of the JAXM specification adds messaging Profiles "to establish a foundation for supporting a family of higher level standards-based messaging protocols. An example of a Profile would be an implementation of ebXML Transportation, Routing, and Packaging Message Handling Service or the W3C's XMLP layered on JAXM."


Microsoft's released Internet Explorer 6.0 for Windows. I haven't tried it yet. Would anyone care to comment, particularly on its XML and XSLT support? Does it finally support XSLT 1.0 in its default installation? Does it actually recognize the correct MIME media type (text/xml) for XSLT style sheets? And is CSS support improved to the level currently available in Mozilla and Opera?

Update: the first responses are coming in and it seems that Microsoft has once again deliberately chosen not to conform to XML 1.0. IE6 does use MSXML3, but it instantiates it in a "backward compatibility mode" where it inherits MSXML's bugs including not recognizing character references after &#xFFFF; and allowing the illegal C0 control characters such as null and vertical tab. I'm still waiting to hear whether they support XSLT 1.0 or not.
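
The two bugs reported above are easy to probe in any parser. As a rough sketch, assuming Python's standard expat-based ElementTree parser rather than MSXML (which this code does not exercise), a conforming parser should accept a character reference above &#xFFFF; and reject references to C0 controls such as null and vertical tab. The helper name is_accepted and the sample documents are my own illustrations:

```python
import xml.etree.ElementTree as ET


def is_accepted(document: str) -> bool:
    """Return True if the parser treats the document as well-formed."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A character reference beyond &#xFFFF; is legal in XML 1.0.
supplementary = "<doc>&#x10000;</doc>"
# C0 controls other than tab, line feed, and carriage return are illegal,
# even when written as character references.
vertical_tab = "<doc>&#x0B;</doc>"
null_char = "<doc>&#x0;</doc>"

results = {
    "supplementary": is_accepted(supplementary),
    "vertical_tab": is_accepted(vertical_tab),
    "null_char": is_accepted(null_char),
}
print(results)
```

A parser with the behavior described in the reports would get all three of these wrong.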


James Strachan's released dom4j 1.0, an open source library for working with XML, XPath and XSLT on the Java platform using the Java Collections Framework with full integration with DOM, SAX and JAXP. This release adds:

  • Support for XML Schema Data Types via Kohsuke Kawaguchi's Multi Schema Validator library
  • Improved support for the Jaxen XPath engine, especially Jaxen's function, namespace and variable contexts
  • Improved SAX support making it easier to work with SAX EntityResolvers, XMLFilters and features and properties,
  • Assorted bug fixes
Tuesday, September 4, 2001

I'm back from Vermont, Boston, and the Software Development 2001 East show. The show was a little smaller this year following the dot-bomb, and some of the classes had very small audiences; but overall it was still a good time, and there were a few interesting developments.

Howard Katz clued me into Quip, a free-beer implementation of the XML Query Language from Software AG. It can query against the Tamino database or a local collection of XML documents. I'll probably demo this in my Cutting Edge XML Programming session at XMLOne Amsterdam in a couple of weeks. The current version is 1.5.1.

I also spent a lot of time working on my XInclude processors at the show. Nothing like having to give a demo of a product to inspire you to finish it up. I fixed some serious bugs in the SAX XIncludeFilter and added support for detecting the encoding of text files to all three versions (SAX, DOM, and JDOM). I also improved the error reporting and handling across the package. All users should upgrade.

Of course as always seems to happen, in the course of implementing the "last feature" I noticed several things I hadn't seen before in the specification. The big ones related to XInclude's usage of the infoset, particularly unparsed entity and notation information items. If an XInclude processor is required to pass these along, then it's basically impossible to implement XInclude fully in either DOM or JDOM, because they don't expose these properties in the necessary places. SAX can do it, but not in a streaming fashion. In essence, handling XInclude requires you to build your own infoset compatible, XPointer addressable, tree-model for an XML document. Yuck. I'm hopeful that this is a result of an unclear specification document rather than the actual intent of the working group.
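
For readers who haven't seen XInclude in action, here is a minimal sketch of the basic tree-merging step. It uses Python's xml.etree.ElementInclude module, not my Java processors, and that module only claims limited XInclude support (no XPointer, and nothing like the infoset handling discussed above). The RESOURCES table and the loader function are hypothetical stand-ins for real files:

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" standing in for real documents.
RESOURCES = {
    "chapter.xml": "<chapter><title>Included</title></chapter>",
    "notes.txt": "Plain text, inserted as character data.",
}


def loader(href, parse, encoding=None):
    """Resolve an href from RESOURCES instead of the file system."""
    data = RESOURCES[href]
    if parse == "xml":
        return ET.fromstring(data)
    # parse="text": hand back the text itself. A real loader would decode
    # the file using the encoding argument here.
    return data


book = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter.xml"/>'
    '<note><xi:include href="notes.txt" parse="text"/></note>'
    '</book>'
)

# Replace each xi:include element in place with the content it points to.
ElementInclude.include(book, loader=loader)
merged = ET.tostring(book, encoding="unicode")
print(merged)
```

Note that the merged tree carries no record of unparsed entities or notations from the included documents, which is exactly the gap described above.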

However, if it does turn out that full infoset support is required for XInclude handling, Lynda van Vleet and Kirill Gavrylyuk's presentation to me suggested an interesting approach. For the OASIS XSLT conformance testing project, they've defined an XML serialization of the infoset which is not, as you might think, the original XML document. For example, suppose the original XML document looks like this:

<?xml version="1.0"?>
<!-- Simple XML Document -->
<msg:message msg:date="19990421"
       xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>

Then its infoset serialization looks like this:

<document>
  <children>
    <comment>
      <content>Simple XML Document</content>
    </comment>
    <element>
      <namespaceName>http://message.example.org/</namespaceName>
      <localName>message</localName>
      <children>
        <text>Phone home!</text>
      </children>
      <attributes>
        <attribute>
          <namespaceName>http://message.example.org/</namespaceName>
          <localName>date</localName>
          <normalizedValue>19990421</normalizedValue>
          <attributeType>CDATA</attributeType>
          <specified>true</specified>
        </attribute>
      </attributes>
      <inScopeNamespaces>
        <namespace>
          <prefix>msg</prefix>
          <namespaceName>http://message.example.org/</namespaceName>
        </namespace>
      </inScopeNamespaces>
    </element>
  </children>
</document>

The OASIS Group canonicalizes this representation so they can compare the infosets of two different documents produced by two different parsers. For XInclude, I'd use XSLT to combine and process the documents, possibly along with a couple of extension functions. Finally, the entire result could be written out as a serialized XML document. This is not nearly as efficient as my current more direct approach, but something like this might be the only reasonable way to achieve full conformance to XInclude.

Currently, the OASIS group is using XSLT to generate this representation, which misses some parts of the Infoset that the XPath data model doesn't include, particularly the unparsed entities and notation information items I'm concerned about. However, it would be easy enough to generate this from SAX.
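
To make that last claim concrete, here is a rough sketch of generating such a serialization from SAX events. It is written against Python's xml.sax module rather than Java, and it is not the OASIS group's format: the notations, unparsedEntities, and related element names are my own guesses, and comments, in-scope namespaces, attribute types, and the specified flag are left out to keep it short. The point is only that the DTDHandler callbacks deliver the items XPath can't see:

```python
import io
import xml.sax
import xml.etree.ElementTree as ET
from xml.sax.handler import ContentHandler, DTDHandler, feature_namespaces


class InfosetBuilder(ContentHandler, DTDHandler):
    """Collects SAX events into a simplified infoset-style element tree."""

    def __init__(self):
        super().__init__()
        self.document = ET.Element("document")
        self.notations = ET.SubElement(self.document, "notations")
        self.entities = ET.SubElement(self.document, "unparsedEntities")
        # Stack of <children> containers; new items go into the top one.
        self.stack = [ET.SubElement(self.document, "children")]

    # DTDHandler callbacks: the items the XPath data model leaves out.
    def notationDecl(self, name, publicId, systemId):
        item = ET.SubElement(self.notations, "notation")
        ET.SubElement(item, "name").text = name
        ET.SubElement(item, "systemIdentifier").text = systemId or ""

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        item = ET.SubElement(self.entities, "unparsedEntity")
        ET.SubElement(item, "name").text = name
        ET.SubElement(item, "systemIdentifier").text = systemId or ""
        ET.SubElement(item, "notationName").text = ndata

    # ContentHandler callbacks (namespace-aware variants).
    def startElementNS(self, name, qname, attrs):
        uri, local = name
        element = ET.SubElement(self.stack[-1], "element")
        ET.SubElement(element, "namespaceName").text = uri or ""
        ET.SubElement(element, "localName").text = local
        children = ET.SubElement(element, "children")
        attributes = ET.SubElement(element, "attributes")
        for (attr_uri, attr_local), value in attrs.items():
            attribute = ET.SubElement(attributes, "attribute")
            ET.SubElement(attribute, "namespaceName").text = attr_uri or ""
            ET.SubElement(attribute, "localName").text = attr_local
            ET.SubElement(attribute, "normalizedValue").text = value
        self.stack.append(children)

    def endElementNS(self, name, qname):
        self.stack.pop()

    def characters(self, content):
        # Adjacent character events are not merged in this sketch.
        ET.SubElement(self.stack[-1], "text").text = content


def serialize_infoset(xml_text):
    """Parse xml_text with SAX and return the infoset-style tree."""
    builder = InfosetBuilder()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(builder)
    parser.setDTDHandler(builder)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return builder.document


# The message document from above, plus a made-up notation and unparsed entity.
SAMPLE = """<!DOCTYPE msg:message [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<msg:message msg:date="19990421"
       xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>"""

infoset = serialize_infoset(SAMPLE)
print(ET.tostring(infoset, encoding="unicode"))
```

The printed tree includes a notation item for gif and an unparsed entity item for logo alongside the element and attribute items, which an XSLT-generated version would not contain.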

Monday, September 3, 2001

The W3C XML Query Working Group, XML Schema Working Group, and XSL Working Group have produced the first public working draft of XQuery 1.0 and XPath 2.0 Functions and Operators. According to the abstract, "This document defines basic operators and functions on the datatypes defined in [XML Schema Part 2: Datatypes] for use in XQuery, XPath, and other related XML standards. It also discusses operators and functions on nodes and node sequences as defined in the [XQuery 1.0 and XPath 2.0 Data Model] for use in XQuery, XPath, and other related XML standards."


The W3C DOM Working Group has posted a new draft of the XPath API for DOM Level 3. The design of the API has changed and is no longer dependent on XPath 2.0.


Sunday, September 2, 2001

The W3C XSL Working Group has published a proposed recommendation of the Extensible Stylesheet Language (XSL) Version 1.0 (XSL-FO). The document lists the following 15 non-editorial changes since the Candidate Recommendation:

  1. The "master-name" property has been renamed "master-reference" on fo:page-sequence, fo:single-page-master-reference, fo:repeatable-page-master-reference, and fo:conditional-page-master-reference.

  2. The "space-treatment" property has been renamed "white-space-treatment".

  3. Definitions of the <time> and <frequency> datatypes have been added to the list of property datatypes.

  4. The xsl "icc-color" function has been renamed "rgb-icc" for compatibility with SVG.

  5. The refinement section on Indents and Margins has been reworded to be clearer and an error in the formulae corrected.

  6. The conversion of <percentage> values has been clarified.

  7. The semantics of omitting an optional NCName in the core functions has been clarified.

  8. The "scaled-baseline-table" has been more clearly defined.

  9. Intrusion adjustment for xsl-side-float has been reworked in the area model and a new property introduced to control intrusions and ensure compatibility with CSS2 behaviour.

  10. "Empty paragraphs" have their space-before and space-after being resolved with the spaces of the block before and after for CSS2 compatibility. "Empty inlines" have their space-start and space-end being resolved with the spaces of the inline before and after.

  11. Small changes have been made to the properties applicable to FOs.

  12. "visibility" has been changed to be inherited for compatibility with CSS2 (erratum).

  13. Initial values have been changed for the table border precedence properties.

  14. The following values have been added to the dominant-baseline property: central, middle, text-after-edge, and text-before-edge.

  15. Initial values have been changed for the .conditionality component of the border-before/after-width, padding-before/after properties.

Review ends September 25, 2001. I'll probably wait to update the online XSL-FO chapter from the XML Bible until FOP's been updated to support the changes. Mostly these are pretty minor. I think the only ones that affect the Bible chapter are the name changes from master-name to master-reference and space-treatment to white-space-treatment.
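
The two renames are mechanical enough to script. Here is a minimal sketch in Python (my choice of language, not anything the spec suggests) that applies them to an existing XSL-FO document. It only renames attributes; the `migrate_fo` and `rename_attribute` helpers and the sample document are made up for illustration, and attribute values may still need review against the Proposed Recommendation.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"

# Elements on which the Proposed Recommendation renames master-name
# to master-reference (item 1 in the list above).
MASTER_REFERENCE_ELEMENTS = {
    f"{{{FO_NS}}}{name}"
    for name in (
        "page-sequence",
        "single-page-master-reference",
        "repeatable-page-master-reference",
        "conditional-page-master-reference",
    )
}


def rename_attribute(element, old, new):
    """Rename an attribute in place, leaving its value unchanged."""
    if old in element.attrib and new not in element.attrib:
        element.set(new, element.attrib.pop(old))


def migrate_fo(root):
    """Apply the two attribute renames to every element under root."""
    for element in root.iter():
        if element.tag in MASTER_REFERENCE_ELEMENTS:
            rename_attribute(element, "master-name", "master-reference")
        rename_attribute(element, "space-treatment", "white-space-treatment")
    return root


# Hypothetical sample document, just large enough to exercise both renames.
SAMPLE = """<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="only"/>
  </fo:layout-master-set>
  <fo:page-sequence master-name="only">
    <fo:flow flow-name="xsl-region-body">
      <fo:block space-treatment="preserve">Hello</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>"""

root = migrate_fo(ET.fromstring(SAMPLE))
```

Note that fo:simple-page-master is deliberately left alone: the rename applies only to the four elements listed in item 1, so the master keeps its master-name attribute.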

Saturday, September 1, 2001

The W3C has posted a new working draft of XSLT 1.1 simply to note that development of this version has ceased in favor of XSLT 2.0. Specifically,

NOTICE:

As of 24 August 2001 no further work on this draft is expected. The work on XSLT 2.0 identified a number of issues with the approaches being pursued in this document; solutions to the requirements of XSLT 1.1 will be considered in the development of XSLT 2.0 [XSLT20REQ]. Other than this paragraph, the document is unchanged from the previous version.

Friday, August 31, 2001

The XML Apache Project has posted the second beta of Xerces-Java 2.0.0, an open source XML parser. This is primarily a bug fix release. However, a new XNI parser configuration interface has been added to allow the creation of "pull" parser configurations. In addition, more XNI documentation has been written that explains how to re-use the standard Xerces2 parser components. Xerces2 supports XML 1.0, Namespaces, DOM Level 2 (including events, traversal, and range), SAX 2.0, and JAXP 1.1. Xerces2 does not yet support XML Schema and has removed support for the deferred DOM implementation. Xerces2 is a nearly complete rewrite of the Xerces 1.x codebase in order to make the code cleaner, more modular, and easier to maintain. Applications using only the standard interfaces such as JAXP, DOM, and SAX should not see any differences.


Daniel Veillard's added XML Catalog support to the libxml/libxslt (1.0.3/2.4.3) XML parser/XSLT processor libraries for Linux. Several bugs in other areas have also been fixed in this release.

Thursday, August 30, 2001

Fabrice Desre has released XSLTDoc, an XSLT stylesheet documentation generator. This tool is itself an XSLT stylesheet that analyzes another stylesheet and builds a clean documentation on it. It also makes some sanity checks.


Wednesday, August 29, 2001

I've posted the notes from my various presentations this week at Software Development 2001 East including:

Friday, August 24, 2001

I'm leaving for Software Development 2001 East shortly, so updates may be a little sparse here over the next week. A lot depends on what sort of Internet access I have at the conference and in my hotel room at the Sheraton Boston.


The W3C DOM Working Group has published a new working draft of Document Object Model (DOM) Level 3 Events Specification.


The W3C Speech Working Group has published the Last Call Working Draft of Speech Recognition Grammar Specification for the W3C Speech Interface Framework. Last Call ends September 28, 2001.

Thursday, August 23, 2001

I've posted the first development version of a SAX-based XInclude program I wrote in Java. This is believed to be a complete implementation of the XInclude working draft with the exception of XPointer support. It is somewhat more complete than the DOM and JDOM versions I posted earlier. In particular it detects invalid values of the parse attribute and supports the encoding attribute for included text documents. I'll probably add these features to the DOM and JDOM versions soon.

I'll be talking about all three of these programs next week at Software Development East in Boston. Registration is still open, and we're also looking for a few more volunteers to help out with distributing class evaluations and doing general gopher work in the sessions. If anybody is interested in volunteering in exchange for free admission to the conference, please email Nicole Garbolino at ngarbolino@cmp.com. You get one day at the conference for each day you volunteer, and of course when volunteering you get to attend the sessions you're assigned to.
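
For readers who want to see the XInclude behavior described above (rejecting bad parse values, honoring encoding on text includes) without installing anything, here is a rough sketch using Python's standard xml.etree.ElementInclude module. It is not the SAX program and says nothing about how that program is built. The in-memory DOCUMENTS table, the `loader` and `expand` helpers, and the sample documents are all hypothetical. ElementInclude expects the http://www.w3.org/2001/XInclude namespace.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" so the sketch runs without touching disk.
DOCUMENTS = {
    "chapter.xml": "<chapter><title>Included chapter</title></chapter>",
    "notes.txt": "plain text, inserted as character data",
}


def loader(href, parse, encoding=None):
    """Resolve an xi:include reference against DOCUMENTS."""
    data = DOCUMENTS[href]
    if parse == "xml":
        return ET.fromstring(data)
    # parse="text": a real loader would decode bytes using `encoding`.
    return data


def expand(xml_text):
    """Parse a document and replace its xi:include elements in place."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=loader)
    return root


BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter.xml"/>
  <note><xi:include href="notes.txt" parse="text" encoding="UTF-8"/></note>
</book>"""

# parse must be "xml" or "text"; anything else is an error.
BAD = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter.xml" parse="html"/>
</book>"""

book = expand(BOOK)
```

Calling expand(BAD) raises ElementInclude.FatalIncludeError, which is the same kind of check on the parse attribute mentioned above.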

Wednesday, August 22, 2001

Converting Flat Files to XML, Chapter 4 of Processing XML with Java, has been posted. This chapter covers reading data in a variety of non-hierarchical forms including CSV text files and relational databases, and converting that data into well-formed, potentially valid XML documents. Topics covered include StreamTokenizer and StringTokenizer, XSLT, SQL, XQuery, servlets, designing tree data structures in Java, and the U.S. Federal Budget. As always, any comments you have would be appreciated. I'm particularly interested in knowing whether you can follow the examples as written or whether more explication or input data is required.
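
To give a flavor of the basic technique, here is a minimal sketch of the CSV-to-XML step. It is in Python rather than the Java the chapter uses, and it is not taken from the chapter. The `csv_to_xml` function, the element names, and the sample rows are all invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_name="records", row_name="record"):
    """Turn CSV with a header row into a small XML tree.

    Column headers become child element names, so they must already be
    legal XML names; a real converter would have to sanitize them.
    """
    root = ET.Element(root_name)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_name)
        for column, value in row.items():
            ET.SubElement(record, column).text = value
    return root


# Made-up sample data; not taken from the chapter or any real budget.
SAMPLE = "agency,year,amount\nTreasury,2001,100\nR&D <labs>,2001,250\n"

budget = csv_to_xml(SAMPLE, root_name="budget", row_name="item")
xml_text = ET.tostring(budget, encoding="unicode")
```

Building a tree and serializing it, rather than concatenating strings, means the & and < in the second row are escaped automatically, so the output stays well-formed.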


The Apache XML project has released Xerces-J 1.4.3, an open source XML parser written in Java. This is a bug fix release. According to IBM's Neil Graham "The largest difference between it and Xerces-J 1.4.2 is that the Xerces-J DOM implementation has been Reorganized to separate the Core functionality (new classes: CoreDOMImplementationImpl, CoreDocumentImpl), from the complete DOM (DOMImplementationImpl, DocumentImpl). It also incorporates general bugfixes to schema support as well as fixes to allow it to operate better on OS/390." New features are being developed exclusively in the Xerces-J 2.0 tree.

Tuesday, August 21, 2001

The joint W3C/IETF XML Signature Working Group has published a Proposed Recommendation of XML-Signature Syntax and Processing. This describes a mechanism for digitally signing all sorts of electronic files (not just XML documents) and embedding those signatures in XML documents. Comments are due by September 17, 2001.


The dbXML project has posted the first beta release of the dbXML Core XML database, an open source native XML database designed to manage large collections of small XML documents. The server supports XPath queries and provides an implementation of the XML:DB XML Database API for development of client applications. Major changes in this release include proper namespace support for XPath queries, updates for the latest XML:DB API and many bug fixes. No further feature additions are planned prior to a production release of dbXML 1.0.


IBM's alphaWorks has posted updated versions of several products. The latest version of the XML Schema Quality Checker offers full support for identity constraints, fixes some bugs, clears up a few error messages, and enhances performance when validating schemas.

The XML Generator fixes a few bugs primarily involving command line use. This is a tool to generate valid XML documents from a DTD.

Finally, version 3.2.1 of the XML Parser for Java includes many bug fixes and several performance improvements. This is essentially a repackaged version of Xerces-J, which itself is based on a lot of IBM work.

Monday, August 20, 2001

Ovidiu Predescu's released XSLT-Process 2.1, a minor mode for GNU Emacs/XEmacs which adds XSLT processing and debugging capabilities. New features in this release include:

  • Processing the result of an XSLT transformation through the Apache FOP processor.
  • Integration with the DocBook-XSL project; HTML and PDF generation and viewing from within Emacs of DocBook documents
  • Support for specifying proxies and additional arguments to the supporting Java VM.

Currently the Saxon and Xalan Java XSLT processors, and Apache FOP are supported. XSLT-Process has been tested on XEmacs, versions 21.1.14 and 21.4.3, and GNU Emacs 20.7.1, under both Linux and Windows 2000. The package is free software distributed under GPL.

Saturday, August 18, 2001

The XML Apache Project has released Xalan-C++ 1.2, an open source XSLT 1.0 processor written in C++. This version of Xalan-C++ was built with, and includes Xerces-C 1.5.1. Binaries are available for Windows, Linux, Solaris, AIX, and HP/UX.

Friday, August 17, 2001

The W3C has published a first and last call working draft of the W3C Patent Policy Framework. Boiled down, what this says is that all W3C members have to tell the working groups what relevant patents they hold. There's no obligation to offer any sort of license to anybody on any terms. Frankly, this is far too weak. I would prefer a policy that stated that all W3C members must dedicate all relevant patents to the public domain, but short of that it would probably be enough to say that all W3C members must offer royalty-free licenses to all implementers. However, since the W3C is bought and paid for by its corporate members the chance of this happening is virtually nil. Comments are due by September 30.


James Clark has posted an experimental non-XML syntax for the RELAX NG XML Schema Language. According to Clark, "It's quite similar in many ways to the type syntax of the current XQuery 1.0 Formal Semantics Working Draft." A Java program that can translate this syntax into RELAX NG's XML syntax is provided.

Thursday, August 16, 2001

Norman Walsh has written some Java classes that implement the OASIS XML Catalogs Committee Specification for SAX EntityResolver and JAXP URIResolvers.
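
The idea behind a catalog-driven entity resolver can be sketched in a few lines. The following is a Python analogue, not Walsh's Java classes. It assumes the catalog element names and namespace from the OASIS XML Catalogs standard as later published, which may differ from the August 2001 committee draft. It ignores prefer, rewrite rules, delegation, and nextCatalog. The `CatalogResolver` class, the catalog entries, and the local paths are placeholders.

```python
import xml.etree.ElementTree as ET
from xml.sax.handler import EntityResolver

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# A hypothetical catalog; the local paths are placeholders.
CATALOG = f"""<catalog xmlns="{CATALOG_NS}">
  <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
          uri="/usr/share/dtd/docbookx.dtd"/>
  <system systemId="http://example.com/schemas/order.dtd"
          uri="/usr/share/dtd/order.dtd"/>
</catalog>"""


class CatalogResolver(EntityResolver):
    """Map public and system identifiers to local copies via a catalog."""

    def __init__(self, catalog_xml):
        root = ET.fromstring(catalog_xml)
        self.public = {
            e.get("publicId"): e.get("uri")
            for e in root.iter(f"{{{CATALOG_NS}}}public")
        }
        self.system = {
            e.get("systemId"): e.get("uri")
            for e in root.iter(f"{{{CATALOG_NS}}}system")
        }

    def resolveEntity(self, publicId, systemId):
        # System entries are tried first here; fall back to the original
        # system identifier when the catalog has no match.
        if systemId in self.system:
            return self.system[systemId]
        if publicId in self.public:
            return self.public[publicId]
        return systemId


resolver = CatalogResolver(CATALOG)
```

With Python's SAX bindings, a resolver like this would be installed by calling setEntityResolver(resolver) on a parser obtained from xml.sax.make_parser().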

Wednesday, August 15, 2001

James Tauber and Dan 'eikeon' Krech have posted Redfoot 0.9.9, a framework for distributed RDF-based applications, written in Python. Redfoot includes an RDF database, a query API for RDF with numerous higher-level query functions, an RDF parser and serializer, a simple HTTP server, modules for viewing and editing RDF, and the beginnings of a peer-to-peer architecture for communication between different RDF databases. According to Tauber, "0.9.9 should be viewed as a beta for 1.0 (to be released in the first half of September). We would appreciate as much feedback on 0.9.9 as possible to help make 1.0 as stable and easy to use as possible."


Sun's posted the Java API for XML Registries (JAXR) v0.6 Specification as well as an early access implementation. JAXR is a Java API for accessing different kinds of XML Registries for web services eventually including ISO 11179, OASIS, eCo Framework, ebXML and UDDI. Currently detailed bindings are provided for ebXML and UDDI. JAXR 1.0 is planned to become an optional package for the Java 2 Standard Edition. Comments are due by September 13, 2001.

Tuesday, August 14, 2001

The XML Apache Project has released FOP 0.20.1. This is an open source XSL Formatting Objects to PDF converter written in Java. This release fixes a major bug in yesterday's 0.20.0. Most users should upgrade. You should remember that although FOP is very popular, it is far from production quality. Many XSL-FO features are not yet implemented, or are implemented only partially. If something doesn't work like you expect, it's probably a bug in FOP. Please don't bother reporting these things on non-FOP mailing lists like xsl-list or docbook-apps.


The OASIS RELAX NG Technical Committee has posted version 0.9 of RELAX NG, a schema language for XML. Both a comprehensive specification and a tutorial are available. Two months have been allocated for public comment and implementation. According to the tutorial, changes from the previous version include:

  • key and keyRef have been removed; support for ID and IDREF is now available in a companion specification, RELAX NG DTD Compatibility Annotations
  • difference and not have been replaced by except
  • A start element is no longer allowed to have a name attribute
  • An attribute element is no longer allowed to have a global attribute

IBM's alphaWorks has released version 1.5.0 of their XML for C++ parser. This is based on the Apache Xerces XML C++ Parser V. 1.5.0. It adds support for a subset of the W3C XML Schema language, fixes some bugs, and speeds up performance.

Monday, August 13, 2001

I've posted Writing XML, Chapter 3 of Processing XML with Java. This chapter teaches you how to output XML documents from your programs. Topics covered include output streams, writers, Unicode, servlets, XML-RPC clients, SOAP clients, Latin, and rabbits taking over the world. As always, comments and corrections are much appreciated.


The XML Apache Project has released FOP 0.20.0 (web site not yet updated). This is an open source XSL Formatting Objects to PDF converter written in Java. This release is a small improvement over 0.19.0. It did fix several of the most annoying bugs I've encountered while writing Processing XML with Java including some problems with embedded images and extra indentation in the first line of code examples. However, it's still eating all the blank lines in my source code; and there do seem to be some new bugs involving external fonts so you may want to hold off upgrading for a while if you're satisfied with your existing installation.


The Institute for Applied Information Processing and Communications (IAIK) has released the IAIK XML Signature Library (IXSIL) 1.0. IXSIL is a Java class library for creating and verifying XML based digital signatures. IXSIL supports the Candidate Recommendation syntax of XML-Signature Syntax and Processing. IXSIL is 800 euros payware.

Sunday, August 12, 2001

The W3C Core XML Working Group has published the Proposed Recommendation of the XML Infoset. At first glance, there do not appear to be any major changes since the candidate recommendation, though I do wish working groups would publish change lists between versions.

The XML Infoset is an effort to define what is and is not significant about an XML document. For example, the content of an element is significant. Whether or not that content came from parsed text, an external entity, an internal entity, or a CDATA section is not. If this had been published right after Namespaces in XML had been released, then it might have been more useful. However, the fact is we're already awash in different models of what matters in an XML document including SAX, DOM, and XPath, all of which are subtly incompatible with each other.
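
A quick way to see the distinction is to parse the same content written three different ways and compare what the parser reports. The sketch below uses Python's ElementTree, which is yet another model of what matters in a document and not an implementation of the Infoset. The sample documents are made up.

```python
import xml.etree.ElementTree as ET

# Three serializations of the same element content, "AT&T".
ESCAPED = "<company>AT&amp;T</company>"
CDATA = "<company><![CDATA[AT&T]]></company>"
INTERNAL_ENTITY = """<!DOCTYPE company [
  <!ENTITY name "AT&amp;T">
]>
<company>&name;</company>"""

# The parser hands back only the resulting character data; it does not
# record whether that data came from an escape, a CDATA section, or an
# entity reference.
contents = [ET.fromstring(doc).text for doc in (ESCAPED, CDATA, INTERNAL_ENTITY)]
```

All three entries in contents are the same string, so a program working at this level cannot tell the three documents apart.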

Saturday, August 11, 2001

Edwin Goei's written a FAQ for the Java API for XML Processing.


Microsoft's posted Internet Explorer 5.5 Service Pack 2 and Internet Tools. I haven't found the release notes yet, so I'm not sure exactly what's changed in this revision. The download page says this "includes improved support for DHTML and CSS" and "allows you to use Connection Manager as your default dialer when Dial-Up Networking is already installed." I advise caution with this release. I've already seen one unconfirmed report that this release disables Netscape-style plug-ins and that only ActiveX controls work. Henry Rzepa noticed the same thing in the beta of Internet Explorer 6.

Friday, August 10, 2001

The Apache XML Project has posted the first beta quality release of Xerces 2.0.0, an open source XML parser written in Java. The Xerces2 beta release has been greatly improved since the alpha release with an updated Xerces Native Interface (XNI) core; the addition of the parser pipeline and configuration interfaces to XNI; and completely re-written documentation, including lots of new information about XNI. Xerces2 supports XML 1.0, Namespaces, DOM Level 2 (including events, traversal, and range), SAX2, and JAXP 1.1. It does not yet support schemas and has removed support for the deferred DOM implementation. According to Andy Clark, "Xerces2 is a nearly complete rewrite of the Xerces 1.x codebase in order to make the code cleaner, more modular, and easier to maintain. Applications using only the standard interfaces such as JAXP, DOM, and SAX should not see any differences."


The SYMM Working Group at the W3C has released the final recommendation of the Synchronized Multimedia Integration Language (SMIL) 2.0. SMIL is an XML application for describing interactive multimedia presentations. "Using SMIL 2.0, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen."

They've also published the first public working draft of an XHTML+SMIL profile. This adds (optionally) a subset of SMIL 2.0 to XHTML. SMIL modules included in this profile are animation, content control, media objects, timing and synchronization, and transition effects. Both DTDs and schemas for this are provided.


One new feature I missed yesterday when announcing Netscape 6.1 is support for browser side XSLT. There's still not a large enough installed base to use client side XSLT in production, but at least this will make writing books about it somewhat easier.

Thursday, August 9, 2001

Netscape's released version 6.1 of their flagship web browser for Windows, MacOS, and Unix in English, German, and Japanese. This release fully supports XML and CSS, though not yet XSLT. New features include support for Hebrew on Linux and Windows and Arabic on the Arabic version of Windows as well as autotranslation between English, French, German, Russian, Japanese, Italian, Spanish, Portuguese, and Chinese. Cookie management is much improved, as are various kinds of autocompletion in forms and the location bar.

This release is a huge improvement over the disastrous Netscape 6.0. If you're still using Netscape 4.x or IE, you may want to check it out. If you're using Netscape 6.0, you definitely need to upgrade. On the other hand I'm still partial to Mozilla.


James Strachan's posted dom4j 0.9, an open source library for working with XML, XPath and XSLT on the Java platform using the Java Collections Framework. This release adds full support for the Jaxen XPath engine, improved output via XMLWriter, JAXP and SAX filters, and various bug fixes.

Wednesday, August 8, 2001

eksmile 1.0 is a free-beer Xerces-J based XML/Schema/DTD editor.


The French translation of XML in a Nutshell has been published. The title is still "XML in a Nutshell". The ISBN is 2-84177-143-1. It costs 39 euros, and should be available from all the usual sources of French books including amazon.fr. I do notice, however, that Chapters seems to have stopped selling non-English books at their online store. Can anyone recommend a good online store for French-language books in North America? Update: Aaron Staup Cope suggested I check out Camelot. They do sell French-language computer books, but they don't seem to have the French translation of XML in a Nutshell in stock yet.


Tuesday, August 7, 2001

The WAP Forum has released a draft of the WAP 2.0 architecture specification for public review. WAP 2.0 will be based on XHTML and CSS, rather than the current WML, though some sops will be thrown to backwards compatibility. WAP 2.0 also uses straight TCP/IP, HTTP/1.1, and TLS rather than WAP's existing custom protocols. This is all good and should help open up the WAP world to a broader variety of content providers. However, none of it addresses the fundamental flaw in WAP: cell phones are designed to support audio not text. Until this is fixed, WAP will continue to fail.

On the not so good side, WAP 2.0 promises that "Push technology allows trusted application servers to proactively send personalized content to the end-user, such as a sales offer for a product a person might be interested in buying, a new email notification, or a location-dependent promotion." In other words WAP 2.0 is spam-enabled.


Norm Walsh has posted version 1.4.2 of his XSLT stylesheets for Docbook. This is primarily a bug fix release. I've been using these to produce both the HTML and XSL-FO versions of Processing XML with Java.

Monday, August 6, 2001

Mozilla 0.9.3 has been released for the usual batch of platforms. This is primarily a bug fix release. I found Mozilla 0.9.2 to be a step backward from Mozilla 0.9.0. Hopefully, this release restores some of the stability that was lost in 0.9.2.

In related news, Galeon is a Gecko based browser specifically for Linux Gnome. Galeon uses the Gecko rendering engine, but replaces most of the rest of the browser built on top of that. It focuses on providing just a browser, no email client, chat program, news reader, calendar manager, or food processor. It's just a browser, and consequently loads much faster than megaware programs like Mozilla and IE. Because of licensing problems, it does require that you have Mozilla installed separately, however. It's not quite ready for nondeveloper end users yet, but it looks very promising for the future.


Version 0.5 of the open source eXist XML database has been released. XML is either stored in the internal, native XML-DB or an external relational database. The search engine has been designed to provide fast XPath queries, using indexes for all element, text and attribute nodes. This release introduces a pure Java implementation of the native XML storage backend, which is better optimized for efficiently storing, indexing and retrieving XML. The server now provides access by XML-RPC calls as well as HTTP, supports document collections and integrates well with Cocoon2.

Sunday, August 5, 2001

Norm Walsh and O'Reilly have released DocBook: The Definitive Guide under the GNU Free Documentation License. The current version is a "work in progress", but it mostly covers DocBook XML V4.1.2 with the EBNF, HTML Forms, MathML, and SVG modules. I've been using this as a reference while writing Processing XML with Java, which is itself written in Docbook XML 4.1.2.

Saturday, August 4, 2001

The W3C has published working drafts of SVG 1.1/2.0 Requirements and SVG Mobile Requirements. I'm pleased to see that the requirements include "SVG may allow CSS units in the polylines, polygons, paths and transforms." and "To allow or include relevant enhancements from target domains such as GIS/Mapping, CAD/Design, Mobile, Printing and Web Design." Hopefully, someone on the working group realizes that CAD and GIS Mapping require the ability to define sizes in real world units rather than onscreen units. There's a lot of other neat stuff here too, like rotations and word wrapping. I just pray that backwards compatibility with the brain damaged coordinate system in SVG 1.0 doesn't prove to be too big a millstone around SVG 2.0's neck.

Friday, August 3, 2001

ElCel Technology released version 0.12 of their XML Validator and Canonical XML Processor, command-line applications for Windows and Linux. This release adds HTTP proxy server support, more character encodings, and an up-to-date implementation of the latest OASIS XML catalog specification.


Bob Mcwhirter has posted SAXPath beta 4. SAXPath is an API (with default XPathReader implementation) for generic XPath parsing, which reports grammar productions in the form of call-back events. SAXPath is to XPath as SAX is to XML.
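
To make the SAX analogy concrete, here is a toy event-based parser for a tiny subset of XPath, written in Python. It is not SAXPath and does not use SAXPath's interfaces. The `PathHandler` callbacks, `parse_location_path`, and `Recorder` are names invented for this sketch. It recognizes only abbreviated child and attribute steps, with no predicates, functions, other axes, or the // abbreviation.

```python
class PathHandler:
    """Callback interface; method names are invented for this sketch."""

    def start_path(self, absolute):
        pass

    def step(self, axis, name):
        pass

    def end_path(self):
        pass


def parse_location_path(path, handler):
    """Report a simple location path such as /a/b/@c as callback events."""
    absolute = path.startswith("/")
    handler.start_path(absolute)
    rest = path[1:] if absolute else path
    for part in rest.split("/") if rest else []:
        if not part:
            # Covers // and trailing slashes, which this sketch rejects.
            raise ValueError(f"empty step in {path!r}")
        if part.startswith("@"):
            handler.step("attribute", part[1:])
        else:
            handler.step("child", part)
    handler.end_path()


class Recorder(PathHandler):
    """Collects events in a list so they can be inspected afterwards."""

    def __init__(self):
        self.events = []

    def start_path(self, absolute):
        self.events.append(("start_path", absolute))

    def step(self, axis, name):
        self.events.append(("step", axis, name))

    def end_path(self):
        self.events.append(("end_path",))


recorder = Recorder()
parse_location_path("/book/chapter/@id", recorder)
```

As with SAX, the parser builds no tree of its own. A client that wants an expression tree, an optimizer, or just a syntax check implements the callbacks it cares about.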


Roger L. Costello and Roger Sperberg have published an ISBN xsd:simpleType definition. It defines the legal ISBN values for every country in the world.
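
A schema type of this kind constrains the shape of an ISBN. It does not by itself verify the check digit, which takes arithmetic. As a complementary sketch, and not anything taken from the Costello and Sperberg schema, here is the standard ten-digit ISBN check-digit rule in Python. The function name `is_valid_isbn10` is made up.

```python
def is_valid_isbn10(isbn):
    """Check the shape and check digit of a ten-digit ISBN.

    Hyphens and spaces are ignored. The final character may be X, which
    stands for the value 10.
    """
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    digits = "0123456789"
    if not all(c in digits for c in chars[:9]):
        return False
    if chars[9] not in digits and chars[9] != "X":
        return False
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    # Weights run from 10 down to 1; the weighted sum must be divisible by 11.
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0
```

For example, is_valid_isbn10("2-84177-143-1") accepts the ISBN of the French translation of XML in a Nutshell, while changing the last digit to 2 makes it fail.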

Thursday, August 2, 2001

The Text Encoding Initiative Consortium (TEI-C) has posted the official review draft of version 4 of Guidelines for Electronic Text Encoding and Interchange. The third edition, known as "P3", has been heavily used since its release in April of 1994 for developing richly encoded and highly portable electronic editions of major works in philosophy, linguistics, history, literary studies, and many other disciplines. The fourth edition, "P4", will be fully compatible with XML, as well as remaining compatible with SGML. Comments are due by mid-September.


The Institute of Medical Informatics has released a new version of their DTD to XML Schema translator and xsbrowser tool for creating human readable documentation of XML document types represented from a schema.

Wednesday, August 1, 2001

The W3C CSS Working Group has posted the first public working draft of CSS3 module: Fonts. New properties added beyond those of CSS2 include font-effect (effects include emboss, engrave, and outline), font-smooth (whether or not to antialias fonts), font-emphasize-style, font-emphasize-position, and font-emphasize.


James Strachan's posted version 0.8 of dom4j. dom4j is an open source library for working with XML, XPath, and XSLT on the Java platform using the Java Collections Framework with full integration with DOM, SAX and JAXP. This is a bug fix release.

Tuesday, July 31, 2001

Peter Murray-Rust and Henry Rzepa have announced a new suite of tools and demonstrations for the Chemical Markup Language (CML). These include:

  • JUMBO3-J, a molecular browser written in Java that can read CML files and a wide range of legacy molecular formats.
  • An XML-CML to SVG converter
  • A CML aware version of Peter Ertl's Molecular Editor (JME)
  • A new FAQ for CML

JUMBO was the first XML browser/editor and CML was perhaps the first serious XML application. However, they've always been severely hampered by a lack of documentation. This release is much improved in this area, but it's still not complete. The new FAQ is very helpful, but all the links to further documentation for the individual elements such as atom seem to be broken.

Monday, July 30, 2001

I've been spending a lot of time lately with Docbook and XSL-FO as part of the ongoing development of my next book, Processing XML with Java. To that end, I've been putting the various XSL-FO engines on the market through their paces. I'm trying to find one that will actually let me produce the complete, finished book from my Docbook source code and Norm Walsh's XSLT-to-XSL-FO stylesheet. I thought I'd share my experiences here.

So far, I've experimented with four different XSL-FO processors: the Apache XML Project's FOP, Sebastian Rahtz's PassiveTeX, the Antenna House XSL Formatter 1.1E, and RenderX's XEP. Two are implemented in Java, one in native Windows code, and one in TeX. FOP and PassiveTeX are open source. Antenna House and XEP are payware. Here are my experiences with each:

FOP

FOP was the first XSL-FO engine and is certainly the most popular. It's open source and far easier to install than PassiveTeX, the other open source alternative. However, of the ones I was able to actually test it produced by far the worst output. It had the most annoying formatting troubles. For example, it ate all the blank lines in my source code examples and put extra indentation at the front of the first line of each example. I've noticed that probably more than half of the bug reports on the Docbook-APPS mailing list about the Docbook XSL-FO stylesheets can actually be attributed to bugs in FOP. FOP is improving rapidly -- one major bug I noted in footnote handling was fixed in the last couple of weeks while I was performing my tests -- but it's clearly not even an alpha quality release yet. A lot of work needs to be done before FOP can be recommended for more than experimentation.

XEP

I was unable to get XEP to run. It was totally non-functional, and did not produce any output. I know some other people have gotten it to run -- the PDF version of the XSL specification was produced with XEP. However, it simply did not work for me at all. However good the XEP engine may be at converting XSL-FO documents to PDF, its horrible user interface and incomprehensible installation procedure eliminated it from my consideration.

PassiveTeX

PassiveTeX did a very good job formatting most of my document. There were a few issues involving improperly scaled images, but those were easily fixed by adding some width attributes to my source XML document. Once that was done, the only major bug was a failure to properly calculate page numbers in the table of contents. There was also one quirky instance where the first bullet point in a list was not indented quite right, but this didn't seem to occur in other bulleted lists.

The downside to PassiveTeX is that it depends on a "decent modern TeX setup"; and TeX is invariably a nightmare. If my Linux distribution hadn't included TeX by default, I would have been lost. As it is, I consider myself lucky to have been able to get PassiveTeX running; and it still fails one time out of every two. This is probably due to TeX's unusual multipass architecture. You sometimes have to run TeX a second time to get the links and cross-references right. In my case, the first pass succeeds but the second pass invariably fails. Thus I never get proper cross-references to page numbers in the table of contents and elsewhere. Otherwise, the output produced is quite attractive.

Antenna House XSL Formatter

The Antenna House XSL Formatter produced very attractive output, on a par with that generated by PassiveTeX and much better than FOP's. I noticed no major flaws or cosmetic bugs. Antenna House also claims they're the only formatter able to handle mixed writing-modes such as "tb-rl" for Chinese/Japanese/Korean, though I didn't test that.

Most importantly, Antenna House had by far the easiest installation and the nicest user interface of all the formatters tested. More work is still needed, but at least I could conceive of giving this formatter to a non-programmer end-user. The others all have effectively non-existent user interfaces, and horrible installation procedures. The Antenna House formatter was the only one of the four that took me less than an hour from download to first use.

The downside to this otherwise excellent engine is that it's Windows only and based on Windows graphics primitives rather than PostScript or PDF. It displays on the screen very nicely, and prints nicely too. However, it does not produce a PDF document that I can send to my editor or a typesetter.

Bottom line: none of the formatters are yet suitable for producing a finished product. None of them can replace TeX or QuarkXPress. You might be able to publish a simple book with these, but you'd have to design your book and style sheet so that you avoided the bugs and unimplemented features of the processor. Antenna House probably produces the most polished output, and I'd use it if all I wanted to do was print out a document from my laser printer. However, since I need PDF files I can send to my editors and download to a typesetter, my choice for the time being is PassiveTeX.

Sunday, July 29, 2001

Sun's released the Sun Multi-Schema XML Validator, a Java tool that validates XML documents against several kinds of XML schemas. It supports XML DTDs, TREX, RELAX Namespace, RELAX Core, RELAX NG and a subset of W3C XML Schema language. It can be used through a command-line interface or as a class library from inside your own programs.


The TM4J Project team has posted TM4J 0.5.0, an open source (Apache license) topic map engine written in Java. This class library provides a simple set of interfaces with which topic maps may be created and manipulated as well as imported from and exported to XML files using the XTM syntax.

Saturday, July 28, 2001

Version 1.95.2 of Expat, a popular non-validating XML parser written in C, has been released. Version 1.95.2 fixes some small bugs and now builds on Windows as well as Unix.

Friday, July 27, 2001

The W3C CSS Working Group has published an initial working draft of CSS3 module: the box model. This is a direct outgrowth of the CSS Level 2 box model. There are a few new properties, mostly to deal with vertical writing. These include display-model, display-role, and float-displace as well as new :expanded and :collapsed pseudo-classes. In addition, some CSS2 properties have been subdivided into more fine-grained properties, though the old CSS2 properties remain as shorthands.

Thursday, July 26, 2001

Ronald Bourret's posted a list of XML data binding resources based on some initial work by Sean Sullivan and Brendan Macmillan.


Altova has posted a public beta of XML Spy 4.0, a $199 payware XML IDE for Windows. New features in version 4.0 include expanded ODBC database access functionality, enhanced user interface customization, a new plug-in architecture for 3rd party developers, support for the final XML Schema Recommendation, and a more WYSIWYG editor. As usual, beta testers who volunteer their time and systems to help debug XMLSpy will most likely receive bupkus for their efforts, and still have to pay $199 for the final release.

Wednesday, July 25, 2001

I've posted XML Protocols, Chapter 2 of Processing XML with Java. This chapter covers XML-RPC, SOAP, and related technologies. As always your comments are much appreciated. Just email them to me at elharo@ibiblio.org. Thanks!

Tuesday, July 24, 2001

The Apache XML Project has released version 1.4.2 of the open source Xerces-J XML parser for Java. This release fixes numerous bugs, particularly with regard to schema support.

Monday, July 23, 2001

Michael B. Allen's posted domc 0.3, an open source implementation of the Document Object Model Level 1 (DOM1) in ANSI C. It depends on the Expat XML Parser Toolkit.


The Apache Cocoon team has posted the 2nd beta of Apache Cocoon 2.0. "Apache Cocoon 2.0 is a complete rewrite of the Cocoon XML publishing framework that is supposed to remove all those design constraints that emerged from the Apache Cocoon 1 experience....This release marks API stability of the project. The next release will primarily focus on documentation."


James Strachan's posted version 0.7 of dom4j. dom4j is an open source library for working with XML, XPath, and XSLT on the Java platform using the Java Collections Framework with full integration with DOM, SAX and JAXP. This release adds support for the SAXPath API for the parsing of XPath expressions and fixes various bugs.

Sunday, July 22, 2001

The W3C Synchronized Multimedia Working Group has published the Proposed Recommendation of SMIL Animation. This spec describes "an animation framework as well as a set of base XML animation elements suitable for integration with XML documents. It is based upon the SMIL 1.0 timing model, with some extensions, and is a true subset of SMIL 2.0." The review period ends August 16, 2001.

Saturday, July 21, 2001

IBM's alphaWorks has released UDDI Registry, a "UDDI-compliant registry for Web services in a private intranet." It runs on Windows NT, 2000, and Linux.


Netscape's released Communicator 4.78. This is a minor update that improves mousewheel support on Windows, improves support for the Sun Java plug-in, and runs on Solaris 2.8 and AIX 5.x. A few bugs are fixed. In addition, on Windows AOL Instant Messenger is upgraded to version 4.3 and Flash Player to version 5. As is customary in the 4.x Netscape series, there's no public support for XML.

Friday, July 20, 2001

The W3C SVG Working Group has published the Proposed Recommendation for Scalable Vector Graphics (SVG), an XML application for two-dimensional line art such as cartoons, blueprints, technical drawings, business graphics, or anything else you'd produce with Visio/KIllustrator/CorelDraw/PowerPoint/etc. There's no change document, and I haven't read through the entire spec yet, but I don't expect the changes are too major.

I am disappointed that they didn't change the underlying model for how coordinates are measured, though I didn't really expect them to. Nonetheless I think the definitions used for pixels, inches, user units, and the whole coordinate space are feet of clay for the rest of the specification. They've already been proven to confuse developers, and I think they're going to make SVG less generally useful than it should be. In brief, it's impossible to assign an absolute size to a picture. I cannot say that the chair I draw is one meter tall and half a meter wide. Thus there's no way to guarantee that the chair I draw will fit in the door of the house you draw. What I think is needed is a way of specifying the actual physical dimensions of a picture, and a preferred scale factor for moving that drawing onto the screen. The model they've chosen is suitable for single-file line art on Web pages, but not for more complex technical drawings and pictures that are built from many different pieces by many different artists. Some of the problems I noted are under consideration for changes in SVG 2.0. However, the problems are rooted so deep in the foundations of SVG that I don't know that they can be fixed in a backwards compatible way.

Thursday, July 19, 2001

I am looking for native speakers of Khmer, Burmese, Amharic, or the other Ethiopic languages who have some experience with XML and who are willing to answer a few questions about their use of XML and the need for native markup in these languages. Alternately, if you're not a native speaker, but you have spent a significant amount of time in Cambodia, Myanmar, Ethiopia, or Somalia working with or teaching computer related technologies, I'd also like to talk to you. If you fall into any of these categories, please drop me a line at elharo@ibiblio.org, and I'll send you my questionnaire. Your answers will be used as input for the Blueberry efforts at the W3C. Thanks.


IBM's alphaWorks has updated their P3P Policy Editor to fix "numerous installation problems" and add a platform-native program to launch the P3P editor.

Wednesday, July 18, 2001

Does anybody know a quick way to specify in a W3C XML Schema Language schema that an element (call it value) can have type xsd:double except for Inf, -Inf, and NaN? In other words, I want to restrict it to all legal IEEE 754 values except Inf, -Inf, and NaN. I could use a pattern, but frankly that seems like overkill, especially given all the different forms floating point numbers can take. In essence, I'm looking for a reverse union. I want to say that every xsd:double except these three values is allowed. Send suggestions to elharo@ibiblio.org. Thanks!
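
Failing a schema-level answer, one fallback is to check the value in application code after validation. The following is a minimal sketch under that assumption, not a schema solution; the class and method names are hypothetical. It relies on the fact that xsd:double spells the three special values INF, -INF, and NaN, and it assumes the string has already been validated as an xsd:double, since Java's own parser accepts some forms the schema type does not.

```java
public class FiniteDoubleCheck {

    // Returns true if the lexical form denotes a finite double.
    // Assumes the string has already passed xsd:double validation.
    static boolean isFiniteXsdDouble(String lexical) {
        String s = lexical.trim();
        // The three special values in the xsd:double lexical space
        if (s.equals("INF") || s.equals("-INF") || s.equals("NaN")) {
            return false;
        }
        try {
            double d = Double.parseDouble(s);
            // Also rejects Java's own spellings such as "Infinity",
            // and numbers too large to represent as a finite double
            return !Double.isNaN(d) && !Double.isInfinite(d);
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isFiniteXsdDouble("6.02E23"));
        System.out.println(isFiniteXsdDouble("-INF"));
        System.out.println(isFiniteXsdDouble("NaN"));
    }
}
```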

Tuesday, July 17, 2001

Michael Kay's released version 6.4.3 of his SAXON XSLT processor. This is a bug fix release.

Monday, July 16, 2001

Nenie XML 0.1 is a non-validating XML parser for Eiffel. Unlike eXml, Nenie XML is written in pure Eiffel. Nenie XML does not support SAX or DOM, relying on its own API instead.

Sunday, July 15, 2001

The W3C CSS Working Group has published drafts of two new modules for CSS Level 3, Values and Units and Cascading and inheritance. The Values and Units module "describes the various values and units that CSS properties accept. Also, it describes how 'specified values', which is what a style sheet contains, are processed into 'computed values' and 'actual values'." This version of the module does not introduce any new features, merely rewriting the equivalent parts of CSS level 2 in the form of a CSS3 module. However, new features are planned for a future version. The Cascading and inheritance module "describes how values are assigned to properties. CSS allows several style sheets to influence the rendering of a document, and the process of combining these style sheets is called 'cascading'. If no value can be found through cascading, a value can be inherited from the parent element or the property's initial value is used."

Saturday, July 14, 2001

Ken North reports that Camelot Communications has laid off its staff and cancelled all future events including XML DevCon London, and iFive (XML DevCon, SQL Summit, Web Services Summit) later this year.

Friday, July 13, 2001

The W3C/IETF joint XML Digital Signature Working Group has posted the first working draft of Exclusive XML Canonicalization Version 1.0. This is a revised canonical form that is supposed to allow individual elements to be cut and pasted from one document to another while maintaining the same canonical form. Currently, the canonical form of an element can change based on namespaces in scope that are not actually used. This is a problem for applications that want to sign only a part of a document such as the Body of a SOAP message rather than the entire document. The problem here is real. It's not, however, clear to me that the proposed solution actually works in the face of namespace declarations used in attribute values (XSLT, Schemas) and element content (SOAP).
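
To make the in-scope namespace problem concrete, here is a small sketch using only the standard DOM API. It does not canonicalize anything; it merely lists the namespace declarations visible from a Body element, which is the information that leaks into the element's canonical form when the element is treated as a document subset. The class name, method names, element names, and example.com URIs are all made up for illustration.

```java
import java.io.StringReader;
import java.util.SortedMap;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InScopeNamespaces {

    // Collects every namespace declaration visible from an element,
    // whether or not the element actually uses the prefix.
    static SortedMap<String, String> inScope(Element element) {
        SortedMap<String, String> result = new TreeMap<>();
        // Walk from the element up to the root; the nearest declaration wins
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            NamedNodeMap attributes = n.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Attr a = (Attr) attributes.item(i);
                String name = a.getName();
                if (name.equals("xmlns")) {
                    result.putIfAbsent("", a.getValue());
                } else if (name.startsWith("xmlns:")) {
                    result.putIfAbsent(name.substring(6), a.getValue());
                }
            }
        }
        return result;
    }

    // Parses a document and reports the namespaces in scope on its first Body element.
    static SortedMap<String, String> forBody(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return inScope((Element) doc.getElementsByTagName("Body").item(0));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same Body element in two different surrounding documents
        System.out.println(forBody("<e:Envelope xmlns:e='http://example.com/env' "
            + "xmlns:unused='http://example.com/unused'><Body>text</Body></e:Envelope>"));
        System.out.println(forBody("<Wrapper><Body>text</Body></Wrapper>"));
    }
}
```

The Body element is identical in both documents, yet the first one carries two inherited declarations it never uses. A signature computed over a form that includes them would not survive moving the element to the second document.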


IBM's alphaWorks has updated their XML Parser for Java with support for the W3C XML Schema Recommendation 1.0. It also includes updated JAXP support and other enhancements. To a very good approximation, this is a repackaged version of Xerces-J (which IBM did most of the work for anyway.) In brief, if you want to pay IBM to support your parser, use this. Otherwise use Xerces-J.

Thursday, July 12, 2001

Eric van der Vlist has released an RTF output method for the open source XT XSLT processor. This is a very simple serializer that writes an XML representation of RTF as an RTF document.


Opera Software has posted a beta of the Opera 5.12 web browser for OS/2. Among many other features Opera boasts native support for XML and pretty good support for CSS Level 2.

Wednesday, July 11, 2001

libxslt, the Gnome XSLT library, has reached version 1.0.0. libxslt is believed to implement all the XSLT-1.0 constructs, support a few "common" extensions and provide an extension framework. It also includes a simple to use command line interface 'xsltproc' with an XSLT profiler. This is free software available under the LGPL and an alternative that allows it to be easily embedded in commercial products. It is written in C and should port easily to most Unixes and perhaps Windows.


IBM's alphaWorks has updated their XML Schema Quality Checker to version 1.1.85 to fix various bugs and improve Solaris 2.7 and Windows 98 usability.

Tuesday, July 10, 2001

Michael Kay's released version 6.4.2 of his open source SAXON XSLT processor written in Java. This release fixes a few bugs, and probably introduces some new ones. The big new feature is the EXSLT dates-and-times module.

Monday, July 9, 2001

The W3C XML Protocol Working Group has published two new working drafts on the XML Protocol Abstract Model and SOAP Version 1.2. SOAP is "an XML based protocol that consists of four parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application-defined data types, a convention for representing remote procedure calls and responses and a binding convention for exchanging messages using an underlying protocol. SOAP can potentially be used in combination with a variety of other protocols; however, the only bindings defined in this document describe how to use SOAP in combination with HTTP and the experimental HTTP Extension Framework." I'm disappointed that this draft fixes exactly none of the problems I noted in SOAP 1.1 including:

  • The proper namespace is not required.
  • Document type declarations are explicitly forbidden. (The spec is actually contradictory on this point.)
  • Processing instructions are forbidden.

However, perhaps I can push to get some of this fixed before SOAP 1.2 goes gold. I'm writing Chapter 2 of Processing XML with Java right now, which is all about XML Protocols and SOAP, so I'm spending a lot of time thinking about this sort of stuff.

Sunday, July 8, 2001

Beta 7 of JDOM has been posted. JDOM is a Java-centric API for processing XML documents. This release adds:

  • XSLT support via JAXP
  • A correctly implemented entity model
  • A vastly more robust SAXBuilder (with a public SAXHandler, pluggable factories, and createXXX() methods to simplify subclassing)
  • More reliable building and outputting
  • A convenient detach() method for all tree objects
  • The much-desired setName() and setNamespace() methods on Element
  • Machine-readable version information in META-INF
  • Significant performance improvements
  • Many bug fixes

The API is in general not compatible with beta 6, so you will need to rewrite your code to upgrade to beta 7. The same basic functionality is there so mostly it's a matter of changing some method names.

Saturday, July 7, 2001

The W3C's released Amaya 5.0, their open source testbed web browser and HTML editor for Windows, Solaris, and Linux. Version 5.0 supports HTML 4.0, XHTML, CSS, SVG, and MathML. Annoyingly, it still doesn't support XML or XSLT directly, only specific XML applications like XHTML and SVG.

Friday, July 6, 2001

Software AG's released QuiP, a prototype of XQuery, the W3C XML query language. QuiP can be used either with text-based XML files or for queries against the Tamino native XML database. QuiP conforms to the June 7, 2001 working draft of XQuery. It includes sample queries and data files, syntax diagrams in the online help, and a GUI.


Howard Katz has released XML Query Engine 1.0, a $625 full-text search engine component for XML based on XQuery. XML Query Engine is a compact (roughly 160K), embeddable component written in Java. It has a straightforward programming interface that lets you easily call it from your own Java application. Collections to be searched are limited to 32,768 documents, each of which can contain up to 32,768 elements.


The XML Benchmark Project has released Xmark, a framework intended to help analyze XML query processors. The benchmark consists of

  • An application scenario which models an internet auction site,
  • The data generation tool xmlgen to generate the benchmark document at different scaling factors, and
  • 20 XQuery challenges designed to cover essential aspects of XML query processing.

Thursday, July 5, 2001

For the second chapter of Processing XML with Java, I'm looking for simple examples of public services that expose an XML-RPC interface. The ideal service would be one that accepted a stock symbol and returned the current price. Anything else that was equally simple would also be helpful. (I found such a thing in SOAP but not XML-RPC.) I'm aware of the services at UserLand and Meerkat. Anything else you can point me to would be appreciated. As usual, just email suggestions to me at elharo@ibiblio.org. I've already received a number of very helpful comments on and corrections to the first chapter that I'll be incorporating soon.


Late Night Software's released XML Tools 2.2, an expat-based XML parser for AppleScript. This release adds native support for MacOS X.

Wednesday, July 4, 2001

Michael Kay's released version 6.4 of his SAXON XSLT processor written in Java. This release adds support for JDOM, speeds up transformations by about 20%, and restores compatibility with Java 1.1 and the Microsoft VM. Thus Instant SAXON (the Windows executable version) works again.


James Strachan has released version 0.6 of dom4j, an open source library for working with XML, XPath and XSLT in Java. I've received a couple of requests to cover dom4j in Processing XML with Java, and I'm thinking about it. It may depend somewhat on time and page count. SAX, DOM, and TRAX are the must-carries for this book. JDOM is probable. Other APIs will depend on available space and reader interest. I would like people's input on what to cover beyond the basics, though please realize I probably won't have the time or space to cover everything. Thus adding dom4j, ElectricXML, or other APIs might mean dropping JDOM. Also, please keep in mind that topics not specifically related to writing Java programs to process XML are out of scope for this book and won't be covered. Thus I will not, for example, cover XPath 2.0 or XQuery. You can always email your suggestions and comments to me.

Tuesday, July 3, 2001

Rick Jelliffe's released the Topologi Schematron Validator, a GUI interface for validating XML documents against DTDs, Schemas, and Schematrons. This product runs only on Windows and MSXML 4.0 is required.

Monday, July 2, 2001

I'm pleased to announce my latest book project, Processing XML with Java. I'm more excited by this project than I've been by any book in a long time. This is a comprehensive tutorial covering all aspects of Java programs that read and write XML documents. It picks up where the XML Bible left off, and covers SAX, DOM, JDOM, XML-RPC, SOAP, JAXP and as many other acronyms as I can cram into seven hundred pages.

Processing XML with Java will be published by Addison-Wesley next Spring. So why am I announcing it now? Because the entire book is going to be written and posted here on Cafe con Leche first! Every word, every figure, every piece of code is going to be published in HTML right here. Chapter 1, XML for Data, is available now. Other chapters will follow about one per week. The entire book will be updated and corrected several times a day as I write it.

There's no fee or registration required to read the book. Just start reading at page 1 and go from there. I do hope you'll send in any comments or corrections you have as you read, and I certainly hope you'll consider buying a paper copy next year when the book is released; but for now the book is completely free. In the immortal words of Abbie Hoffman, "Free means you don't have to pay."

Sunday, July 1, 2001

Opera Software has released version 5.12 of its namesake Opera web browser for Windows. This is mostly a bug fix release with a few minor new features. Opera features pretty good support for XML+CSS.

Saturday, June 30, 2001

Mozilla 0.9.2 has been released with the usual support for XML, CSS, XSLT, etc. Mostly this is a bug fix release. I've been using Mozilla as my primary browser on Windows for a couple of months now, and I've been quite satisfied with it. Maybe it's time to switch my Mac over as well. Can anyone comment on the AppleScript support in Mozilla for the Mac?


Log Markup Language (LOGML) is an XML application for describing log reports of web servers based on the eXtensible Graph Markup and Modeling Language (XGMML).

Friday, June 29, 2001

The W3C XML encryption working group has published the first public working drafts of XML Encryption Syntax and Processing and Decryption Transform for XML Signature.

XML Encryption specifies a process for encrypting data using any algorithm or key length and representing the result in XML. The data encrypted can be any binary data at all, though XML documents, individual XML elements, and XML element content can also be encrypted. The encrypted data is stored in an EncryptedData element which either contains the Base-64 encoded encrypted data or points to it with a URI. A "decryption transform" allows XML Signature verification when both signature and encryption operations are performed on an XML document.
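
The general shape of the idea (encrypt some bytes, Base-64 encode the result, and carry it inside an element) can be sketched with the standard Java crypto classes. This is an illustration only, not an implementation of the working draft: the flat, namespace-free EncryptedData element, the choice of AES in CBC mode, the IV-prepending convention, and all class and method names are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class EncryptedDataSketch {

    static final String OPEN = "<EncryptedData>";
    static final String CLOSE = "</EncryptedData>";

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts arbitrary bytes and wraps the Base-64 result in an element.
    static String wrap(byte[] plaintext, SecretKey key) {
        try {
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] encrypted = cipher.doFinal(plaintext);
            // Prepend the IV so the receiver can decrypt
            byte[] payload = new byte[iv.length + encrypted.length];
            System.arraycopy(iv, 0, payload, 0, iv.length);
            System.arraycopy(encrypted, 0, payload, iv.length, encrypted.length);
            // Base-64 output contains no <, >, or &, so it needs no escaping
            return OPEN + Base64.getEncoder().encodeToString(payload) + CLOSE;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses wrap(): strips the tags, decodes, and decrypts.
    static byte[] unwrap(String element, SecretKey key) {
        try {
            String base64 = element.substring(OPEN.length(), element.length() - CLOSE.length());
            byte[] payload = Base64.getDecoder().decode(base64);
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(payload, 0, 16));
            return cipher.doFinal(payload, 16, payload.length - 16);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String element = wrap("<price>42</price>".getBytes(StandardCharsets.UTF_8), key);
        System.out.println(element);
        System.out.println(new String(unwrap(element, key), StandardCharsets.UTF_8));
    }
}
```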

Thursday, June 28, 2001

The W3C has approved XML Base and XLink as official recommendations. I haven't read every word of the specs yet, but so far it doesn't look like there are any significant changes since the proposed recommendation. Everything in the XLink Chapter from the XML Bible, second edition, should still be accurate.


IBM's released version 3.5.0 of their XML for C++ parser. This supports SAX2 and DOM2.

Wednesday, June 27, 2001

Sun's posted version 0.92 of the Java API for XML Messaging (JAXM) Specification. (PDF format as usual.) This is basically a Java API for SOAP 1.1.

Monday, June 25, 2001

Microsoft's posted a public beta of Internet Explorer 6 for Windows. This release features support for numerous XML and XML-based technologies including P3P, DOM1, CSS1, XSLT 1.0 (allegedly) and more. Windows 98, NT 4.0, or Windows 2000 is required. Windows 95 does not seem to be supported.


The W3C User Agent Accessibility Working Group has posted new drafts of User Agent Accessibility Guidelines 1.0 and Techniques for User Agent Accessibility Guidelines 1.0.

Sunday, June 24, 2001

I've updated my DOM-based XIncluder to fix a bug involving including XML documents with parse="text". The fix is implementation only. The API is unchanged. In brief some < and & characters were getting double escaped so they came out as &amp;lt; and &amp;amp; instead of < and &.
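
For readers who haven't run into this class of bug, the effect is easy to reproduce. The sketch below is not the XIncluder code; it is a hypothetical two-character escaper that shows what happens when text that has already been escaped is escaped a second time.

```java
public class EscapeOnce {

    // Escapes the two characters discussed above for use in XML text.
    static String escape(String text) {
        // & must be replaced first, or the & in &lt; would itself be escaped
        return text.replace("&", "&amp;").replace("<", "&lt;");
    }

    public static void main(String[] args) {
        String included = "if (a < b && c)";
        String once = escape(included);
        String twice = escape(once); // the bug: escaping already-escaped text
        System.out.println(once);
        System.out.println(twice);
    }
}
```

The usual cure is to escape in exactly one place, normally the serializer, and to keep raw characters everywhere else.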

Saturday, June 23, 2001

The XML Apache Project has released version 1.4.1 of the Xerces-J XML parser (my parser of choice.) Besides bug fixes, 1.4.1 adds:

  • support for all IANA encoding aliases which have a clear mapping to encodings recognized by Java
  • Improved DTD validation performance relative to 1.4.0
  • Support for JAXP's setAttribute/getAttribute
  • Two new SAX properties permitting an application writer to associate schema documents with specific namespaces without relying on xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the instance documents.
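
The last item can be sketched as follows. The two property URIs are the ones documented for later Xerces-J releases; that 1.4.1 used exactly these names is an assumption, as are the class name, the namespace URI, and the schema file names. Parsers that are not Xerces-based will typically reject the properties with a SAXNotRecognizedException.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class ExternalSchemaLocation {

    // Xerces-specific property names (assumed; see the note above)
    static final String SCHEMA_LOCATION =
        "http://apache.org/xml/properties/schema/external-schemaLocation";
    static final String NO_NAMESPACE_SCHEMA_LOCATION =
        "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation";

    static XMLReader newReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            // Schema validation has to be switched on separately
            reader.setFeature("http://xml.org/sax/features/validation", true);
            reader.setFeature("http://apache.org/xml/features/validation/schema", true);
            // Same format as xsi:schemaLocation: namespace URI, then schema location
            reader.setProperty(SCHEMA_LOCATION, "http://www.example.com/orders orders.xsd");
            // Schema for elements that are in no namespace
            reader.setProperty(NO_NAMESPACE_SCHEMA_LOCATION, "default.xsd");
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        System.out.println("Configured " + reader.getClass().getName());
    }
}
```

With this in place the instance documents need no xsi attributes at all, which is useful when you don't control the documents or don't trust the schema locations they name.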

Friday, June 22, 2001

James Strachan's posted dom4j 0.5. dom4j is an open source library for working with XML, XPath and XSLT on the Java platform using the Java Collections Framework with full integration with DOM, SAX and JAXP. This release fixes assorted bugs and adds:

  • A NodeComparator for comparing documents by value
  • A variety of new helper methods such as DocumentHelper.parseText( String );
  • Branch.normalize() method for normalizing Text nodes
  • New builder methods such that 'Element Construction Set' style methods can be used to create documents; e.g.:
    Element author = element.addElement( "author" )
      .addAttribute( "name", "James" )
      .addAttribute( "location", "UK" )
      .addText( "James Strachan" );
    
Thursday, June 21, 2001

The W3C XML Core Working Group has posted the first public draft of XML Blueberry Requirements. This is a proposal for a new backwards incompatible version of XML. The specific goal is to address some shortcomings of the XML 1.0 character model relative to Unicode 3.1, as well as throwing a sop to IBM.

The concern with respect to IBM is that one of the world's largest corporations, with thousands of patents, legions of programmers, billions of dollars in revenue, and resources pouring out of every orifice is somehow unable to handle documents where lines end with carriage returns and line feeds, as documents do on every non-IBM system on the planet. The only reason there's a problem here at all is because IBM tried to go it alone as a monopoly and set standards by fiat for years rather than working with the rest of the industry. Consequently their mainframe character sets don't really interoperate well with everybody else's character sets. In XML this arises as a problem with line endings when someone edits an XML document with an IBM mainframe text editor. IBM mostly grew out of their anti-competitive monopolistic tendencies over the last thirty years (with a large dose of assistance from the U.S. government). However, there are still some legacy issues relating to their attempt to dictate standards to the rest of the industry, and this is one of them. Now rather than fixing their own broken mainframe text editing software, they want everyone else on the planet to change their software so IBM doesn't have to. (If this reminds anybody of the current mess with Oracle and UTF-8, you're not alone.) This proposal was laughed out of the W3C a few months ago when IBM made it, or at least it seemed to be. However, it's now risen from the dead as part of XML Blueberry; but it doesn't make any more sense now than it did then; and it still deserves to be laughed off the table with whooping cries of derision.

The second proposal for breaking backwards compatibility with existing parsers is much more serious, and requires a more thoughtful response. Starting in Unicode 3.0 a number of new characters have been added both for new scripts that were previously unencoded such as Amharic and Cherokee as well as for old scripts that were incomplete such as Chinese. The concern is that since XML 1.0 is based on Unicode 2.0, "fully native-language XML markup is not possible in at least the following languages: Amharic, Burmese, Canadian aboriginal languages, Cantonese (Bopomofo script), Cherokee, Dhivehi, Khmer, Mongolian (traditional script), Oromo, Syriac, Tigre, Yi. In addition, Chinese, Japanese, Korean (Hangul script), and Vietnamese can make use of only a limited subset of their complete character repertoires."

If this were true, it would be a very serious criticism of XML 1.0. Fortunately, however, the claim is not nearly as dire as the proposal makes out. Indeed the proposal substantially overstates the need for any changes. The XML 1.0 BNF productions do not allow these newly defined characters to be used in element, attribute, and entity names. However, they can be used in the text of element content and attribute values. This means that XML is fully adequate for literature and data in Amharic, Burmese, Canadian aboriginal languages, Cantonese, Cherokee, Dhivehi, Khmer, Mongolian, Oromo, Syriac, Tigre, Yi, Mandarin, Japanese, Korean, and Vietnamese. Only the markup, that is, the tags, would have to be written in another script. Given that there aren't even localized operating systems in most of these languages, and that today's software effectively requires users to have a solid knowledge of at least the ASCII characters, I don't think the need to write markup (as opposed to text) in Cherokee justifies breaking backwards compatibility.
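
The distinction between markup and text is easy to try out with any JAXP parser. The sketch below is a hypothetical test harness (class and method names invented here) that feeds a parser the same three Ethiopic letters once as element content and once as an element name. The first document is well-formed under any reading of XML 1.0. What the parser says about the second depends on which name rules it implements; the original XML 1.0 productions discussed above reject it, while parsers following later, relaxed rules may accept it, so no particular output is claimed for that case.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MarkupVersusText {

    // Returns true if the parser accepts the document as well-formed.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Three Ethiopic letters, written as escapes to avoid source encoding issues
        String word = "\u1230\u120B\u121D";
        // Ethiopic in text: allowed by XML 1.0
        System.out.println("as text: " + isWellFormed("<greeting>" + word + "</greeting>"));
        // Ethiopic in a tag name: result depends on the parser's name rules
        System.out.println("as name: " + isWellFormed("<" + word + "/>"));
    }
}
```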

But wait! It's not even that bad. Several of the languages listed are total red herrings. You most certainly can write markup in Cantonese, Japanese, Korean, Mandarin, and Vietnamese today. The new characters Unicode has added to these scripts are very obscure. In fact, experts often disagree over whether some of them exist at all, or are merely typographical variations of existing characters. Since the 1700s Vietnamese has been written in a Latin-based alphabet that is fully available in XML and that can write any Vietnamese word. Vietnamese only uses the Han ideographs for classical documents and occasional signage or decoration, and it seems very unlikely that a Vietnamese speaker would write their markup using Han ideographs. Japanese has not one but two phonetic alphabets that can write any Japanese word if the right Han ideograph character is not encoded. Chinese speakers can use either Latin characters or the native Bopomofo phonetic system for the very rare cases where a character they need is not encoded. The fact is most native speakers of Chinese, Japanese, Korean and Vietnamese do not recognize the vast majority of these new characters, and the need for them in markup (again, as opposed to text) is non-existent.

There are a few good points in this proposal. I'm sure there's an occasional need for writing markup in Amharic, Burmese, Khmer, Mongolian, Yi, and a few of the other languages the proposal lists. But I don't believe there's enough of a need to justify breaking compatibility with existing XML parsers, software, and systems. The XML Blueberry Requirements vastly overstate the case by ignoring the difference between markup and text in XML documents. I'd be willing to break backwards compatibility to allow text in these languages if we had to, but we don't. Text is already adequately handled by XML 1.0. All we're arguing about now are the tags, and that's just not a strong enough reason to break backwards compatibility.


I've posted Chapter 24, Schemas, from the XML Bible second edition. This updated version corrects some errors from the printed book as well as adding additional new material about binary data types, simple type derivation, and attribute declarations. Overall, this is a complete introduction to the W3C XML Schema Language. Enjoy!

Wednesday, June 20, 2001

Elcel Technology has posted version 0.11 of their XML command-line tools, an XML Validator and Canonical XML Processor. These run on Windows and Linux platforms. Version 0.11 adds support for entity resolution using XML catalogs. Registration is required.


The United States National Institute of Standards and Technology (NIST) has published several hundred tests for XSLT, XPath, and XSL Formatting Objects. These will eventually be integrated into the official OASIS XSLT/XPath suite.


Adobe's localized their SVG Viewer browser plug-in to eleven new languages. It now supports English, Danish, French, Italian, Spanish, Dutch, Norwegian, German, Swedish, Portuguese, Chinese, Japanese, Korean, and one language I couldn't identify, Suomeksi. (Update: Greg Phillips tells me that "Suomeksi" is the Finnish word for "Finnish".)


Matt Sergeant's released version 1.4 of AxKit, an application server module for Apache, designed for building web sites using server-side transformations of XML based content. It contains many features for delivering the same content to different devices, and also for building dynamic content for different devices. It features a number of transformation and language modules including XSLT, Apache XSP and XPathScript. AxKit also implements a high performance cache architecture that ensures cached content can be delivered at close to static HTML speeds.


CL-XML is a collection of Common LISP modules for XML parsing and serialization. XPath and XQuery modules are also included.


Ovidiu Predescu's XSLT-process is a minor mode for GNU Emacs/XEmacs which transforms it into a powerful editor with XSLT processing and debugging capabilities. With this mode you can:

  • Run an XSLT processor on the Emacs buffer you edit, and view the results either in another Emacs buffer or in a browser.
  • Run an XSLT processor in debug mode and view what happens during the XSLT transformation. You can set breakpoints, run step by step into your stylesheet, and view global and local XSLT variables.

Either SAXON or Xalan-J is required.

Tuesday, June 19, 2001

The dbXML project has released version 0.9 of the dbXML Core XML Database, a native XML database designed to manage large collections of small XML documents. The server supports XPath queries and uses an implementation of the XML:DB XML Database API for development of client applications. Features added include an XML:DB XUpdate implementation for updating XML documents, SAX support in the API and a compression mechanism to increase performance between the client and the server. The server is now considered feature complete. Source code for the dbXML Core is now available under an Apache style license.


The W3C Document Object Model Working Group has posted the first public Working Draft of Document Object Model (DOM) Level 3 XPath Specification Version 1.0. This describes a standard XPath interface for DOM parsers. There isn't as much here as I'd hope. There are methods for converting DOM nodes to XPath booleans, strings, numbers, nodes, and node sets but no functionality for finding the nodes that match a given XPath expression within a document. Maybe TRAX can pick up the slack there?
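
For comparison, this is roughly what the missing functionality looks like when it does exist. The sketch uses the javax.xml.xpath package, which is neither part of this DOM draft nor of TRAX; it arrived in a later JAXP release, so treat it as an illustration of the feature rather than something available alongside this draft. The class name, method name, and sample document are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FindNodes {

    // Parses the document and returns the DOM nodes matching the expression.
    static NodeList select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // NODESET asks for the result as a DOM NodeList
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book year='2001'>A</book><book year='1999'>B</book></books>";
        NodeList hits = select(xml, "/books/book[@year='2001']");
        for (int i = 0; i < hits.getLength(); i++) {
            System.out.println(hits.item(i).getTextContent());
        }
    }
}
```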

Monday, June 18, 2001

The XML Apache Project has released Xerces-C 1.5.0 with partial support for the W3C XML Schema Language. However, this is not nearly as complete as the support in Xerces-Java. According to the Xerces-C web page, support is roughly as follows:

  • Partial Simple type support
    • Yes: atomic simple type
    • No: union and list
  • Partial Complex type support
    • Yes: choice, sequence
    • No: group, all
  • Element and Attribute Declaration
    • No: any/anyAttribute
  • SubstitutionGroup
  • Subset of Built-in Datatypes
    • Primitive Datatypes: string, boolean, decimal, hexbinary, base64binary
    • Derived Datatypes: integer
  • xsi Markup
    • Yes: xsi:nil
    • Yes: xsi:schemaLocation and xsi:noNamespaceSchemaLocation
    • No: xsi:type

Additional Experimental Features (not tested and subject to change, use as is)

  • Complex type derivation support (simpleContent and complexContent).
  • Element and attribute re-use using "ref".
  • Include support
  • Import Support

Other features in the Schema recommendation such as "redefine", "identity constraints" and others which are not mentioned above, are not supported yet. Also, particle and model group constraint checking is not yet fully implemented. But development is continuing and we target to implement all the features of the current XML Schema Recommendation before end of this year. Please note that the date is tentative and subject to change.

Saturday, June 16, 2001

The XML Apache Project has posted version 0.19 of FOP, an XSL Formatting Objects to PDF converter written in Java.

Thursday, June 14, 2001

Netscape's posted the first public beta of version 6.1 of its namesake web browser for Mac, Linux, and Windows. The main new feature in this release is allegedly much improved stability compared to the roundly panned Netscape 6.0.

Wednesday, June 13, 2001

RELAX Next Generation (RELAX NG) is an XML schema language being derived from a combination of Murata Makoto's RELAX and James Clark's TREX. The OASIS RELAX NG committee has just published a RELAX NG tutorial. James Clark has released Jing, a RELAX NG validator written in Java.

Tuesday, June 12, 2001

The W3C XML Query Working Group has released five new working drafts covering the XML Query Language:

Particularly notable are the unification of the XPath 2.0 and XQuery data models, and XQueryX, a well-formed XML encoding of the more SQL-like XQuery language.


Henry S. Thompson's released XSV 1.2, a W3C XML Schema Language validator. Version 1.2 adds partial support for RDDL. A standalone version is available via FTP.

Monday, June 11, 2001

The W3C Forms Working Group has posted a new working draft of XForms 1.0. XForms is a totally remodeled design for Web forms that work not just in traditional browsers but also in television sets, personal digital assistants, cell phones, and even paper. It depends on XML and schemas.

Friday, June 8, 2001

Mozilla 0.9.1 has been released for the usual trio of platforms (Linux, Windows, Mac). This release fixes a number of important bugs. Feature-wise, this release is the first to support XSLT, "although the implementation is still incomplete and has bugs." In addition, the Modern skin has been overhauled with a lighter and smoother look with all new icons and widgets. The new status bar combines the old taskbar and statusbar into one toolbar, thus freeing up screen real estate. IBM contributed bi-directional text support for Hebrew and Arabic, although Arabic shaping only works on Windows just now.


The W3C Document Object Model (DOM) Working Group has published a revised working draft of Document Object Model (DOM) Level 3 Abstract Schemas and Load and Save Specification. "Abstract Schema" is the new term for what was formerly known as "content model". It includes DTDs and W3C XML Schemas and potentially other schema languages as well.


The W3C has published a note about XML Linking and Style that discusses the integration of XSL with XLinks, among similar issues. The question of whether links are styles or semantics is one that's been causing strife inside W3C working groups for some time. Of course the real answer is "Yes". Style and semantics aren't nearly as cleanly separable in the real world as they are in Tim Berners-Lee's vision of the Web. This is just one manifestation of that problem.

Thursday, June 7, 2001

I've updated my DOM and JDOM XInclude processors to support the latest May 5 Working Draft of XInclude. This means you'll need to change your XInclude namespace URI to http://www.w3.org/2001/XInclude. The DOM version is working again, though you'll need to use Xerces 1.4 or later. Other parsers may work as well. Earlier versions of Xerces do not.

I think this is now an accurate implementation of the full XInclude specification except for XPointer support. This required changing the API to work primarily with lists and node lists rather than directly with elements and nodes. However, the API should not need to change again to support XPointer. Merging in XPointer support should be an internal implementation detail. The software processes my files successfully, but formal testing has been minimal, so let me know if you spot any problems.

Wednesday, June 6, 2001

The W3C Synchronized Multimedia Working Group has elevated Synchronized Multimedia Integration Language (SMIL) 2.0 to a Proposed Recommendation. SMIL is an XML application for interactive multimedia presentations. It's supported by RealPlayer and various other tools. SMIL 2.0 adds a lot of features to SMIL 1.0, and divides the DTD into a modular framework. It also provides a schema for SMIL. Proposed Recommendation review ends July 5, 2001.


The W3C Document Object Model (DOM) Working Group has posted a revised working draft of DOM Level 3 Core. This describes the basic Element, Node, Attr, and other interfaces used in DOM. In this arena, DOM3 mostly fills a few holes in DOM2 without really changing very much.

Tuesday, June 5, 2001

Sun's posted the Java Architecture for XML Binding Working Draft (JAXB, previously known as Project Adelard) specification Version 0.21. I must admit this strikes me as a fairly pointless endeavor. I just can't figure out why I'd want to use this instead of SAX/DOM/JDOM/dom4j. But maybe I'll feel differently once I've had a chance to actually look at it. An early access implementation is available on the Java Developer Connection (registration required).

Monday, June 4, 2001

I've posted four chapters from the recently released second edition of the XML Bible here on Cafe con Leche:

These are the complete updated chapters that appear in the printed book. All have been significantly revised since the first edition two years ago. The XPointers chapter in particular has been updated from the revised first edition chapter I've had posted here for the last year or so, though all four are somewhat more current than what I've previously published. I also plan to publish one more chapter, Schemas, here in a month or two.

I used Upcast to convert the Word documents to XHTML, and then used XSLT to convert the XHTML into something readable. There may still be a few bugs in the process. The pictures are a little small and fuzzy, but should be mostly legible. I'll fix that when I get a minute. Otherwise, please let me know if you spot any formatting idiosyncrasies.

I'm still looking for a good Word-to-XML/HTML converter. Upcast gets me part of the way there, but a very complex XSLT stylesheet was still needed to finish the job. I ruled out Logictran's RTFConverter even though it did a much better job than Upcast of producing readable HTML on my first pass because it lost all information about styles. My Word documents use styles very heavily to allow almost semantic markup of documents, and I can't afford to convert that into pure visual presentation.


In possibly related news, Adobe's posted a beta of the Save As XML Plug-In for Macintosh that adds five file conversions to Acrobat 5.0:

  • XML without styling
  • HTML 4.01 with CSS Level
  • HTML 3.2 Accessible
  • HTML 3.20 without CSS
  • Text-only

This plug-in requires a tagged PDF file, whatever that is. According to Adobe, "A tagged Adobe PDF file can be generated from Microsoft Office 2000 applications, by using the Acrobat 5.0 Web Capture feature to convert a website to Adobe PDF, by using the Make Accessible Plug-In, or by sending PostScript+pdfmark through Acrobat Distiller. The structure which the creating program embeds in the PDF file will come back out in the XML or HTML generated by the Save As XML Plug-In."

Sunday, June 3, 2001

I've posted the main Web pages for the XML Bible, 2nd edition. Besides the usual "Buy this book" propaganda, you'll find the complete source code for all code listings in the book.

Saturday, June 2, 2001

IBM's alphaWorks has released XML Registry/Repository, a data management system that provides services for XML artifacts including DTDs, schemas, stylesheets, and instance documents. Users can use XRR to obtain an XML artifact automatically, search or browse for an XML artifact, deposit an XML artifact with or without related data, and register an XML artifact without deposit. Users can search registered objects based on their metadata. It's not totally clear from IBM's site, but this appears to be some sort of service running on IBM's servers, rather than a software product you download and install within your own organization, though apparently there is client software you need to use to access IBM's servers.

Friday, June 1, 2001

The W3C HTML Working Group has released the final recommendation of XHTML 1.1, Module Based XHTML. With just a couple of exceptions, the language described here is the same as XHTML 1.0. However, the DTD structure is much more amenable to merging with other XML applications like Scalable Vector Graphics and MathML.


The Mozilla Project's released a new beta of Fizilla, a MacOS X port of the Mozilla web browser. I haven't tested it, but I assume it has roughly the same level of XML support as Mozilla 0.9.


Opera's posted a new beta of their eponymous Opera web browser for the Macintosh. I haven't tested it, but I assume it has roughly the same level of XML support as Opera 5 for Windows. Unicode and Java support are lacking though.


IBM's alphaWorks has posted a new release of XSLbyDemo for WebSphere Homepage Builder v6.0.3 or later. This release enhances XSLT rule generation capability and adds a dialog box for invoking an XSLT processor.

Wednesday, May 30, 2001

Representatives from a number of Linux distributions and major development projects are in the process of developing the XML/SGML addendum to the Linux Standards Base specification. A primary focus of the present discussion concerns the methods used by XML tools to locate resources, how these differ from methods used by current SGML tools, and the implications of the differing SGML/XML requirements regarding directory structures, configuration issues, etc. If you'd like to contribute to the discussion, please subscribe to the LSB-xml-sgml mailing list.

Tuesday, May 29, 2001

James Strachan's released dom4j 0.4, an open source Java library for processing XML. This release includes

  • A more flexible and powerful event notification mechanism
  • XML Schema Data Types support (alpha quality)
  • Performance optimizations
  • Assorted bug fixes

Microsoft's released an update to Internet Explorer 5.1.1 for Mac OS X. This browser supports direct display of XML documents with attached CSS style sheets. It is a fully Carbonized application.

Saturday, May 26, 2001

I'm pleased to announce the release of the much anticipated second edition of the XML Bible. It's available now from Amazon, FatBrain, and other purveyors of computer books. This edition was extensively revised on almost every page. Every section and example was brought completely up to date with the state of the art in XML in 2001, and many new chapters were added. There are four completely new chapters in this edition covering:

  • Schemas
  • Scalable Vector Graphics (SVG)
  • The Wireless Markup Language (WML)
  • XHTML

Many other chapters were totally revised to bring them up to date with the latest version of various specifications, including:

  • XSLT
  • XSL Formatting Objects
  • XLinks
  • XPointers
  • Namespaces

I am now happy to say that this book is almost completely in sync with all the latest XML specifications including the second edition of XML 1.0, XSLT 1.0, XPath 1.0, and the latest drafts of XLink and XPointer. (The one exception, unfortunately, is a few bugs in the later parts of the schemas chapter, particularly with regard to attribute declarations. That's the penalty for working on the bleeding edge. But even so, the vast majority of the examples in the schemas chapter are completely current with XML Schemas 1.0. I'll post the updates for the schemas material here soon.)

Perhaps most importantly, I rewrote almost every word, sentence, and paragraph to make the exposition clearer, the examples shorter, and the XML terminology more accurate. Mixed content finally gets its due. External DTDs are now emphasized in preference to internal DTDs, and everywhere I'm careful to distinguish between document type definitions and document type declarations. Mozilla and Opera are now used throughout the text as well as Internet Explorer. Since the first edition was released a year and a half ago, XML has gone from a bleeding edge technology to a solid, well-tested, and well-proved foundation of many of today's technologies. The second edition of the XML Bible reflects the increased maturity and stability of XML.

This is in every way a much better book than the first edition. If you liked the first edition, you'll love this edition. If you didn't like the first edition, you may find that the second edition fixes what bothered you about the first. The second edition is still only $49.99 even though it's more than 200 pages and 20% longer than the first edition. There is unfortunately no upgrade path, as is the case for most paper books. However, I will be posting some of the updated chapters here at Cafe con Leche in the near future, and the usual discounts do apply. Amazon and FatBrain are selling it for $39.99. Bookpool doesn't have it in stock yet but should get it soon. Amazon almost always sells out of my new books within a few hours of me announcing one here, but generally gets more in stock much more quickly than they say they will. It will not take 4-6 weeks to get your copy. Many brick-and-mortar bookstores still have the first edition on the shelves. You can recognize the second edition by the robot on the cover. (A special prize goes out to the first person who successfully guesses just what that robot has to do with XML and the content of the book.) If you need to special order it, the title is XML Bible, 2nd Edition. The ISBN is 0-7645-4760-7, and it's written by me, Elliotte Rusty Harold.

Thursday, May 24, 2001

The W3C CSS Working Group has posted a new working draft of the Introduction to CSS3. The changes appear to be quite minor.

Wednesday, May 23, 2001

The XML Apache Project has released version 2.10 of Xalan, an open source XSLT processor written in Java. This is the first Xalan release to include the XSLTC translet compiler and runtime, donated to the Xalan project by the Sun XSLTC team. A number of bugs are also fixed in this release.

Tuesday, May 22, 2001

The XML Apache Project has released version 1.4 of Xerces-J, an open source XML parser written in Java. The big new feature in this release is support for the final Recommendation of W3C XML Schemas. Previous versions only supported the out-of-date Candidate Recommendation from October of last year. There are still a few bugs in schema support, but all the big holes in schema support have been filled. The two remaining major issues are xsd:any and ID type elements.

Monday, May 21, 2001

The Apache Batik team has released Batik 1.0, an SVG viewer written in Java. Batik 1.0 supports most of the static SVG features, linking, and some scripting. New features since the last beta include:

  • Full filter effects support: lighting filters, drop shadows, displacement effects and more.
  • Full text support: Batik 1.0 supports embedded and external SVG Fonts, international text, vertical text and other fancy text features such as text on a path.
  • Full linking support: with SVG and Batik, you can navigate and link images just like you do with HTML pages.
  • Full support for structural elements such as internal and external use/defs and switch elements.
  • Performance improvements
  • An Improved SVG generator (an API for exporting SVG from Java technology applications) now offers more control over the output SVG
  • Improved SVG rasterizer (a tool to convert SVG images to raster formats such as JPEG, PNG or TIFF)

Sunday, May 20, 2001

There's a new hidden feature on Cafe con Leche and Cafe au Lait this month. If you view the source code for this page you'll notice that the individual dates all have id attributes like this:

<dt id="news2001May15">Tuesday, May 15, 2001</dt>

This means you can now link to the news from a particular date, at least in an HTML 4.0 compatible browser that supports id attributes as fragment identifiers. (Netscape 4 has problems with these.) I started adding these on May 8. Eventually I may try to figure out a regular expression that will let me add this to all the old news, but the format should be fairly consistent going forward:

id := "news" + 4digitYear + monthName + 2digitDate
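As a rough illustration of that format, and of the kind of regular expression that might retrofit the old news, here is a minimal Python sketch. It assumes that older date headers look like `<dt>Tuesday, May 8, 2001</dt>` with no attributes, and that "2digitDate" means the day is zero-padded; both are assumptions, and the names `news_id` and `add_ids` are made up for this example.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def news_id(day):
    """Build "news" + 4-digit year + month name + 2-digit day of month."""
    return "news%04d%s%02d" % (day.year, MONTHS[day.month - 1], day.day)


# Matches an attribute-free <dt> holding "Weekday, Month D, YYYY".
# Assumption: the old headers are written exactly this way.
_DATE_HEADER = re.compile(
    r"<dt>(\w+day, (" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4}))</dt>"
)


def add_ids(html):
    """Add an id attribute to every date header that lacks one."""
    def tag(match):
        text, month, day, year = match.groups()
        anchor = news_id(date(int(year), MONTHS.index(month) + 1, int(day)))
        return '<dt id="%s">%s</dt>' % (anchor, text)
    return _DATE_HEADER.sub(tag, html)
```

Headers that already carry an id do not match the pattern, so running `add_ids` twice over the same page leaves it unchanged.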

Saturday, May 19, 2001

Version 0.4 of the eXist XML database has been posted. eXist provides pluggable storage backends, storing XML either in a relational database or using a native backend. eXist has its own XPath implementation with full text support. Changes since version 0.3 include:

  • A new native backend which uses sequential files and B-trees for the indexes.
  • Supports PostgreSQL as the storage backend.
  • Enhanced the document object model to handle namespaces, comments and processing instructions.
  • Preserves entities
  • Bug fixes

Friday, May 18, 2001

The W3C XML Core Working Group has posted the last call working draft of XML Inclusions (XInclude) Version 1.0. XInclude allows you to build large XML documents out of smaller XML documents that are themselves well-formed and potentially valid. Changes in this release are fairly minor. The big one is that the namespace is now http://www.w3.org/2001/XInclude and the default prefix is xi (though as always the prefix can change as long as the namespace URI remains the same). The functionality and model seem essentially the same.
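For a feel of what inclusion does, here is a minimal sketch using the `xml.etree.ElementInclude` module from Python's standard library, which recognizes this same namespace URI. The in-memory `DOCS` table, the `memory_loader` function, and the file name `chapter1.xml` are made up so the example is self-contained. The master document deliberately uses the prefix `inc` instead of `xi` to show that only the namespace URI matters.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"

# Hypothetical included documents, kept in memory instead of on disk.
DOCS = {"chapter1.xml": "<chapter><title>One</title></chapter>"}

MASTER = """<book xmlns:inc="http://www.w3.org/2001/XInclude">
  <inc:include href="chapter1.xml"/>
</book>"""


def memory_loader(href, parse, encoding=None):
    """Loader callback: return an Element for parse="xml", else raw text."""
    if parse == "xml":
        return ET.fromstring(DOCS[href])
    return DOCS[href]


def resolve(master_xml):
    """Parse the master document and replace its include elements in place."""
    root = ET.fromstring(master_xml)
    ElementInclude.include(root, loader=memory_loader)
    return root


book = resolve(MASTER)
```

After the call, the include element is gone and the `chapter` element sits where it used to be.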


The W3C CSS Working Group has published two new modules from CSS Level 3:

  • The CSS3 text module extends CSS2 text properties to handle non-Western languages much better including bidirectional text and East Asian languages.
  • The CSS3 Media queries module defines selectors for attaching different styles depending on the capabilities of the output device. The proposed media features include width, height and color.

Thursday, May 17, 2001

The dbXML project has posted version 0.6 of the dbXML Core XML Database. dbXML Core is an open source native XML database designed to manage large collections of XML documents. The server supports XPath queries and uses an implementation of the XML:DB XML Database API for development of client applications. The source code has been released under the GNU Lesser General Public License. This release focuses on code cleanup, bug fixes and documentation.


The W3C XML Core Working Group has promoted the XML InfoSet to Candidate Recommendation. This specification "defines an abstract data set called the XML Information Set (Infoset). Its purpose is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document". There do not seem to be any functional changes in this draft. Comments are due by June 15.

Wednesday, May 16, 2001

The Institute for Applied Information Processing and Communications (IAIK) has released the first public beta of the IAIK XML Signature Library (IXSIL) 1.0. IXSIL is a toolkit that enables Java developers to create and verify XML digital signatures. IXSIL is based on the April 19, 2001 W3C/IETF Candidate Recommendation of XML-Signature Syntax and Processing.

Tuesday, May 15, 2001

James Strachan's posted version 0.3 of dom4j, an easy to use, open source Java library for processing XML. This is primarily a bug fix release with some performance enhancements. New features include:

  • An OutputFormat.createPrettyPrint() method to create standard pretty printing of XML documents
  • A matrix-concat() XPath extension function
  • XML writing has been optimized.
  • The JAR file is smaller.

Monday, May 14, 2001

Version 0.3 of the XSLT Standard Library has been released. The XSLT Standard Library is a collection of commonly-used templates written purely in XSLT.


IBM's alphaWorks has updated their Web Services Toolkit to version 2.3. This release adds a private UDDI registry and enhancements to the WSDL Generation Tool to support COM. WSTK 2.3 is WSDL 1.1 spec compliant. SOAP encryption, UDDI4B (UDDI for Browser), and a Digital Signature handler are also included.

Sunday, May 13, 2001

The following warning comes from the USISPA, the US Internet Service Providers Alliance, and was forwarded to me by someone who generally seems to know what he's talking about. Please read, and then call your congressman. The bill itself is available on Thomas.

Date: Friday, May 11, 2001 6:43 AM
From: "Debra Sweezey" <dsweezey@yahoo.com>
Subject: Telecom Act Nixed -- Feds Hand Bells the Net

Telecom Act Nixed -- Feds Hand Bells the Net

This is very likely to be a headline you'll read in next week's paper. When it happens -- and it is happening -- ISPs and CLECs are history, and monopolies rule again. You have just a few weeks to change tomorrow's history, and you have the world's greatest tool for change right at your fingertips.

Post notices on news groups, use your mailing list, chat, whatever, as long as you CALL FOR ACTION NOW! There is federal legislation that threatens to hand the Bells the Net. It is called H.R. 1542, misleadingly named the Broadband Relief Act of 2001. It is authored by two very powerful Congressmen, House Commerce Chairman W.J. "Billy" Tauzin (R-LA) and Ranking Minority Member John Dingell (D-MI), and supported - surprise, surprise, by only four companies. Perhaps you've heard of them: SBC, Verizon, Bell South and Qwest.

In essence, the bill allows the RBOCs to immediately begin offering long-distance data service, and eliminates the Telecom Act's requirement that they lease parts of their networks -- including equipment used for high-speed Internet service -- to competitors. Out goes the competition, up go the prices.

Share this information with your friends, family, customers, colleagues today! Beg them to make their voices heard. Do they really want the local phone monopoly running the Internet? Do they want higher prices, lousy service, and no new technology?

USISPA, the US Internet Service Providers Alliance, is comprised of ISPs and ISP state association across the country, and we are working in the trenches trying to protect consumers and the Internet from the Bell monopolies. We need your help in letting Congress know that what they're trying to do, on the Bells' dime, is not okay with America.

We willing and ready to help you any way we can. Please call or email Debra Sweezey of the US Internet Service Providers Alliance, and we'll give you all the information you need. We will give you the contact information for your local representative in addition to sample statements or letters, or we're happy to contact your representative on your behalf. Whatever's easiest for you.

We hope you will join the fight for a free and open Internet. We need you!

This message comes to you through the United States Internet Service Providers Alliance (USISPA), an alliance of state ISP associations and ISP across the country working to ensure a fair and competitive open telecommunications environment nationwide. For more info or to find how you can help USISPA help you, email Debra Sweezey, Project Director for USISPA, at sweezey@usispa.org or call her at 202-326-0440.

=====
Debra Sweezey
Project Director, USISPA
202.326.0440
sweezey@usispa.org

Saturday, May 12, 2001

IBM's alphaWorks has posted a bug fix version of LotusXSL, an XSLT processor written in Java. There are no new features in this release.

Wednesday, May 9, 2001

Fourthought has posted an alpha of 4EXSLT, an add-on module for the open source 4Suite that implements the entire initial collection of EXSLT functions and elements for the 4XSLT processor. EXSLT is an effort by the community of XSLT users and implementers to develop a common collection of XPath extension functions and XSLT extension elements that provide useful facilities beyond that provided by the official XSLT recommendation. The initial set of extensions includes tools for math, set manipulation, writing XPath extension functions in XSLT itself, multiple document output, conversion to node set and checking XPath object type.

Tuesday, May 8, 2001

Mozilla 0.9 has been released. New features in 0.9 include:

  • Automatic Proxy Configuration
  • Personal Security Manager 2.0 with improved performance and a new user interface.
  • MailNews front end has been overhauled with a huge performance improvement.
  • Browser and Mail now utilize a new cache and a new viewmanager for improved performance and correctness.
  • Java is only loaded when needed which dramatically decreases startup time and memory footprint.
  • New Help Viewer for Mozilla.
  • Long-click means of invoking contextual menus on one-button mouse Macs
  • Image rendering library was rewritten from scratch for increased performance.

MacOS 8.6 or later, Windows 95 or later, or a reasonably current Linux are required.

I've been using Mozilla 0.8.1 a lot lately, and found it very stable and useful. This is certainly the best XML browser on the market, bar none. Right now it supports XML, DOM2, CSS, and HTML 4. In the future I expect it to get even stronger with native support for real XSLT, MathML, SVG, XHTML, and more. It's gotten a bad rep because a lot of people confuse it with Netscape 6, a truly abominable product on the order of Microsoft Bob. However, although Mozilla and Netscape 6 are derived from the same code base, they are actually quite different browsers. Mozilla is far more stable and less spam-ridden than Netscape 6.


I've posted a few dozen new errata for XML in a Nutshell. Most of them are quite minor. All will be fixed in the third printing.

Monday, May 7, 2001

Michael Kay's released version 6.3 of SAXON, an XSLT processor written in Java. Version 6.3 implements the javax.xml.parsers package in JAXP 1.1 as well as javax.xml.transform. Furthermore, Saxon now supports the EXSLT modules Common, Math, Sets, and Functions. The full list of extension functions is:

  • exslt:node-set()
  • exslt:object-type()
  • math:min()
  • math:max()
  • math:highest()
  • math:lowest()
  • set:difference()
  • set:intersection()
  • set:distinct()
  • set:leading()
  • set:trailing()
  • set:has-same-node()

plus two new elements:

  • func:function
  • func:result

Sunday, May 6, 2001

Jonathan Borden's written an HTTP Extension Framework Namespace for RDDL that "might be useful when a client wishes the server to perform the RDDL indirection of a resource based on HTTP request headers describing the desired nature and/or purpose. For example a mod_rddl handler might be invoked in Apache, or a similar frame in Jigsaw when such a request header is present."


Sun's released the Sun XML Datatypes Library, a Java class library for validating strings against W3C XML Schema simple types and converting those strings into Java objects.
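To show the general idea of checking a string against a schema simple type and then converting it to an object, here is a minimal Python sketch covering just three types. It is not Sun's API. The function name `parse_simple_type` is made up, and the patterns are simplified readings of the xsd:boolean, xsd:integer, and xsd:decimal lexical forms.

```python
import re
from decimal import Decimal

# Simplified lexical patterns for two built-in types.
_INTEGER = re.compile(r"[+-]?[0-9]+")
_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")
# xsd:boolean allows exactly these four lexical forms.
_BOOLEANS = {"true": True, "1": True, "false": False, "0": False}


def parse_simple_type(type_name, lexical):
    """Validate lexical against the named type and return a Python value.

    Raises ValueError for an invalid string, KeyError for an unknown type.
    """
    value = " ".join(lexical.split())  # collapse whitespace before checking
    if type_name == "boolean":
        if value in _BOOLEANS:
            return _BOOLEANS[value]
    elif type_name == "integer":
        if _INTEGER.fullmatch(value):
            return int(value)
    elif type_name == "decimal":
        if _DECIMAL.fullmatch(value):
            return Decimal(value)
    else:
        raise KeyError("unsupported type: %s" % type_name)
    raise ValueError("%r is not a valid %s" % (lexical, type_name))
```

For example, `parse_simple_type("integer", " 42 ")` yields the integer 42, while `parse_simple_type("integer", "4.2")` raises `ValueError`.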

Saturday, May 5, 2001

IBM's alphaWorks has released Xeena for Schema, a syntax directed XML editor for editing schema-valid XML documents. This only supports a subset of the October 2000 Schema Candidate Recommendation. It does not support the XML Schema Language 1.0 released a few days ago.

Friday, May 4, 2001

Fourthought, Inc. has released version 0.11 of 4Suite, a collection of open source tools for processing XML in Python. It provides support for XML parsing, several transient and persistent DOM implementations, XInclude, XPath expressions, XPointer, XSLT transforms, XLink, RDF and ODMG object databases. CORBA is no longer required. There are many usability, documentation, performance and architectural improvements.

Fourthought, Inc. has also released version 0.11 of 4Suite Server, an open source XML data server. It features an XML data repository, a rules-based engine, XSLT transforms, XPath and RDF-based indexing and query, and XLink resolution.


Thursday, May 3, 2001

Eric van der Vlist has posted XSLTunit 0.1, the first alpha release of an open source framework to do unit testing of XSLT templates.


Josh Lubell of the National Institute for Standards and Technology has released XSLToolbox, an open source collection of XSLT stylesheets that currently contains two tools:

  • APEX - an application for transforming XML documents as specified by architectural forms
  • ATTS - a stylesheet generator for adding default attribute values to XML documents

Henry S. Thompson and Oriol Carbo have initiated work on the W3C XML Schema Test Collection to coordinate "test suites for W3C XML Schema processors created by different developers." If you've got tests to contribute, send email to www-xml-schema-tests@w3.org.

Wednesday, May 2, 2001

Tim Berners-Lee has approved the W3C XML Schema Language as an official recommendation. It is available in three parts:

At first glance, there do not appear to be any significant changes since the last Proposed Recommendation.

Tuesday, May 1, 2001

I've updated my XInclude processor to version 1.0d4. This is written in Java and requires the current CVS version of JDOM. Today's release doesn't add a lot of functionality. About the only new feature is that comments and processing instructions that are outside the root element of an included document are now included. However, it makes major modifications to the API. The big difference is that the resolve() method now returns a List of nodes instead of an Object that is a node. This is a necessary prerequisite for XPointer support because an XPointer may point to more than one node.
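The reason for a list-valued return is easy to see in a small sketch. The following Python uses ElementTree rather than JDOM, and its `resolve` and `splice` functions are hypothetical stand-ins, not the XIncluder API. One selection can produce several nodes, and all of them have to land, in order, where the single placeholder element used to be.

```python
import xml.etree.ElementTree as ET


def resolve(source_root, path):
    """Return every element selected by path as a list (zero, one, or many)."""
    return list(source_root.findall(path))


def splice(parent, placeholder, nodes):
    """Replace placeholder inside parent with all of nodes, keeping order."""
    index = list(parent).index(placeholder)
    parent[index:index + 1] = nodes


# Made-up documents for illustration.
source = ET.fromstring("<notes><note>a</note><note>b</note></notes>")
target = ET.fromstring("<doc><before/><include-here/><after/></doc>")

splice(target, target.find("include-here"), resolve(source, "note"))
tags = [child.tag for child in target]
```

With a single-node return type there would be no way to express the two `note` elements that replace one placeholder here.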

Update: I just noticed that my latest changes to XIncluder exposed a bug in JDOM. The effect is that included elements are included in front of the elements they're supposed to be after. I've submitted a patch for the JDOM bug. Assuming it's accepted and integrated, the new XIncluder should work as expected.

Monday, April 30, 2001

I've updated my XInclude processor. This is written in Java and requires the current CVS version of JDOM. This release fixes one major bug (the position of comments and processing instructions outside the root element is now maintained). Furthermore, I've cleaned up the code and rewritten a lot of the JavaDoc. Most importantly, this release is now more conformant with the Working Draft XInclude specification. In particular when it encounters an error with an included document, it throws an exception and gives up. It no longer puts an error message in the including document. (I still think this should be a user configurable option. Maybe in the next release.) The one major feature left to add is XPointer support. I've got a pretty good idea now how to do this, and hope to get to it soon.

I also fixed the DOM version. However, in the process I discovered some bugs in Xerces-J involving cloning documents. I have reported the bugs and am hopeful that they'll be fixed in Xerces-J 1.4. Meanwhile, my fixed version won't work at all until the bugs in Xerces get fixed, so I haven't uploaded it yet. I recommend you use the JDOM version instead.

Saturday, April 28, 2001

IBM's alphaWorks has released the XML Schema Quality Checker, a Java program that checks for problems in W3C XML Schemas.

Friday, April 27, 2001

IBM's alphaWorks has released the XML Security Suite, an experimental implementation of a proposal for the W3C XML Encryption spec. It allows you to encrypt and decrypt arbitrary binary data, XML elements, or the content of an XML element. It's written in Java, and is officially supported on Windows and Linux but will probably run on other platforms.

Thursday, April 26, 2001

James Tauber and Daniel "eikeon" Krechogies have released Redfoot, an open source framework for distributed RDF-based applications, written in Python. It includes an RDF database, a query API for RDF with numerous higher-level query functions, an RDF parser and serializer, a simple HTTP server providing a Web interface for viewing and editing RDF, and the beginnings of a peer-to-peer architecture for communication between different RDF databases. The current version is 0.9.6.

Wednesday, April 25, 2001

The XML Encryption Working Group has published its first public working draft of XML Encryption Requirements. This document lists the design principles, scope, and requirements for a standard means of encrypting XML documents, and embedding encrypted documents (both XML and non-XML) in XML documents. The scope includes syntax, data model, format, cryptographic processing, patents and other intellectual property, algorithms, human factors issues, and interaction with other XML specifications such as the Document Object Model, XML Canonicalization and XML Digital Signatures.


Adobe's released Acrobat 5.0, its $250 payware portable document creation tool. Acrobat Reader 5.0 is free-beer. This version adds the capability to save the contents of PDF files into other formats such as RTF, TIFF, JPEG, and PNG. It supports 128-bit encrypted password protection and digital signatures, and can restrict editing and printing (though that's not too hard to break). Acrobat forms can now talk to databases through XML. Acrobat 5 also improves accessibility a bit by supporting high-contrast display settings and Windows-based screen readers, though it's still far less accessible than native XML or HTML with appropriate use of style sheets.

Tuesday, April 24, 2001

Microsoft's posted a "Technology Preview" (i.e. a beta) of MSXML 4.0, an XML parser/XSLT processor for Windows and Internet Explorer. This release features tentative support for the W3C XML Schema Language as well as experimental (and non-standard) integration of schemas with XPath and XSLT.

It is very difficult to get this version to replace IE's default MSXML 2.0 or a previously installed MSXML 3.0. The xmlinst program that could replace versions 2.0 and 2.5 with version 3.0 does not work with version 4.0. Chris Bayes posted some instructions for hacking the registry to make IE use MSXML 4.0 on the xsl-list mailing list, though his post doesn't seem to be available in the archives yet.

Monday, April 23, 2001

ElCel Technology has released their Canonical XML Processor, a free-beer command-line tool for Windows and X86 Linux that converts XML documents to canonical form as described by the Canonical XML 1.0 Recommendation.

Saturday, April 21, 2001

Once again I'm chairing the XML track for Software Development 2001 East. This year the show has moved to Boston, and will take place from August 27-31. Besides the change in venue, this year also sees the introduction of the co-located Web Services World, which I'm not involved with; but which some of you might also be interested in submitting proposals for.

This is a hardcore developers show, but not a deep XML show. Attendees are looking for meaty, technical, how-to presentations on specific technologies like schemas, DTDs, SAX, DOM, JDOM, XLinks, XQuery, Schematron, RELAX, etc. Some things we are NOT looking for (at least in the XML track) are very broad analyses of business cases for XML, introductions to XML itself, very academic presentations on hypertext theory, or advanced seminars that assume attendees are already intimate with schemas, XLinks, XSLT, and so forth. Of course this is all relative to the knowledge level of the audience. At this point, I think we could probably justify an advanced DTDs talk that discussed modularization using parameter entity references and namespaces. However, we probably wouldn't accept the same talk if it used schemas instead of DTDs because few people in our audience know a lot about schemas yet. On the other hand, we would be interested in a basic intro to schemas.

In other words, this is a show geared towards working programmers, not a show geared toward academics, managers, or XML experts. We find our attendees come to this track to learn about XML, not because they already know a lot about XML and want to debate the finer arcana, so aim your talks at beginner and intermediate users, not the people who regularly post to this mailing list and write for XML.com. Ideally, we want a broad mix of technologies and presentations so attendees can walk in the first day of the show knowing nothing about XML, and walk out the last day having a solid grasp of what all the pieces of the XML puzzle are, how they fit together, and know enough to start using them.

We have approximately one dozen 90-minute slots to fill. We are also open to intermediate level, 1-day tutorials on special topics like XSLT or Schemas. For the most part, we'll be looking for experienced speakers to fill these. If this is your first time presenting at a conference like this, it would be better to start with a couple of seminars rather than a full-day session.

You can submit abstracts online. The deadline for abstracts is May 1, 2001. Please submit your abstracts as soon as possible. Abstracts which are selected will receive notification by email no later than May 15. Unfortunately, due to the volume of abstracts we receive, we cannot notify every submission regarding their status.

Friday, April 20, 2001

The W3C/IETF joint XML Digital Signature working group has released the second Candidate Recommendation of XML-Signature Syntax and Processing. This defines an algorithm for signing XML and other documents using public key encryption schemes and hash codes, embedding those signatures in XML documents and embedding the signed documents in the digital signatures. From the draft:

This version contains many bug-fixes, clarifications, and improvements for DTD/schema extensibility and re-use. It reflects resolution of recent (and past) issues and the Schema Proposed Recommendation. As warned in the previous Candidate Recommendation, the minimal canonicalization algorithm has been removed because the Working Group could find no implementation. This specification is considered to be very stable. The W3C Namespace Policy requires that if a change in the namespace makes previously valid or compliant instances and implementations invalid, the namespace must also change. Since the clarifications do not substantively affect valid instance syntax or implemented features, the namespace has not been changed.

This phase is scheduled to end May 19, 2001.


The W3C Document Object Model (DOM) Working Group has revised two open working drafts for DOM3:


The W3C Voice Browser Working Group has published the first public working draft of Call Control Requirements in a Voice Browser Framework. According to the document itself, this "describes requirements for mechanisms that enable fine-grained control of speech (signal processing) resources and telephony resources in a VoiceXML telephony platform. The scope of these language features is for controlling resources in a platform on the network edge, not for building network-based call processing applications in a telephone switching system, or for controlling an entire telecom network."


IBM's alphaWorks has updated their P3P Policy Editor to support Java 1.3.0. This release is compatible with the Platform for Privacy Preferences Candidate Recommendation.

Thursday, April 19, 2001

The W3C XSLT Working Group has announced a decision to skip XSLT 1.1 and go straight to 2.0. In my opinion, this is a mistake. Despite some initial controversy over language-specific extension function bindings, XSLT 1.1 offered a great opportunity to pick the low hanging fruit and clean up a few problems with XSLT 1.0 in a fairly quick fashion without compromising future development. XSLT 2.0 seems like a lot bigger project, and overall more risky than XSLT 1.1. I would have preferred to finish the easy improvements first without making them depend on much more complex things like XPath 2.0 and schema awareness.


Ipedo, Inc. is looking for private beta testers for the Ipedo XML Database, a native XML database with XSLT processing. Features in this release include:

  • Support for SOAP and HTTP Servlet
  • Query through XPath's direct access to document
  • Integrated XML Transformation through XSLT
  • Persistent DOM allows access to the document object model after it has been loaded
  • Data pre-indexed for faster queries
  • Schema-based dynamic indexing

If you want to participate in the beta program, send email to Samantha Cichon, samantha@ipedo.com.

Wednesday, April 18, 2001

Adobe's released the final version of the Adobe SVG Viewer 2.0, a plug-in for Netscape and Internet Explorer on both Windows and the Mac that allows you to view Scalable Vector Graphics (SVG) pictures embedded in Web pages. In the near future, the Adobe SVG Viewer will ship with RealNetworks' Real Player and Adobe Acrobat and Acrobat Reader, which should be a big step toward jump-starting SVG adoption.


Opera Software has released version 5.10 of their namesake Web browser for Windows that supports direct display of XML documents with attached CSS style sheets. Opera improves DOM and CSS support. Other new features include Flash, skins, a new progress display and stop button, gesture based mouse navigation, improved window handling, more configurable and accessible privacy preferences, and assorted bug fixes. Opera is $39 payware without ads or free-beer with ads, your choice.


ElCel Technology has released a command line XML Validator for Linux and Windows written in C++. Registration is required.


XTooX is an open source XLink processor that reads out-of-line extended XLinks and folds them into the documents that they are pointing to.


XML::Xalan 0.06, a Perl interface to the Xalan C++ XSLT processor, has been released.

Tuesday, April 17, 2001

I've heard occasional bad things about Sys-Con Media, the publishers of the Java Developer's Journal and the XML Journal among other magazines, for a few years now. For the most part I've ignored the rumors and whispers since I haven't had any particular relationship with or interest in them. However, last year I spoke at a couple of Sys-Con branded conferences, XML DevCon New York and San Jose, though these were actually run by Camelot Communications. At some point after that Sys-Con had a falling out with Camelot. A contract dispute broke out between Camelot and Sys-Con, various lawyers exchanged nasty letters, and legal proceedings were initiated, though these seem to be mandatory arbitration rather than actual lawsuits.

I wouldn't even know any of this was taking place except that in a rather unusual stunt someone decided to be very public about the dispute. Earlier this year an unidentified person sent a series of emails from a Hotmail account to speakers at various Camelot conferences slamming Camelot for alleged non-payment of money owed to Sys-Con. The email address used, camelotcomc@hotmail.com, was designed to make it appear at first glance that the communication came from Camelot Communications itself.

I mostly shrugged off the initial emails, and didn't pay them much attention. I have no idea whether Sys-Con's claims have merit, nor do I much care. Sadly this is often how business is done in the U.S. today. It's not really a surprise, and doesn't have much of an effect on independent contractors like me working for either company, at least not in the normal course of things. However, what did shock me and convince me that Sys-Con had crossed the line was an email I and many other people received last week as part of XML-J April Digital Edition, a free-subscription electronic newsletter about things XML. What specifically bothered me was the following "news item" that ran as the lead story:

TODAY's NEWS: XMLDEVCON 2001 OPENS TODAY IN NEW YORK WITH EMPTY CLASSROOMS AND DISAPPOINTING TURNOUT ON THE EXHIBIT FLOOR!

(April 9, 2001) - XMLDevCon 2001 opened today in New York, with empty classrooms, and a disappointing turnout on the exhibit floor. Organizers of XMLDevCon promoted the expected attendance of the show to be more than 4,000 delegates. However, one of the classrooms that XML-J visited this morning, showed approximately one dozen attendees. (See photo news..)
-> continued: http://www.sys-con.com/xml/

At the time I was on the other side of the country at the SDExpo West show; but when I returned I made some inquiries of a number of people who were at the show, and they all agreed that the story was grossly inaccurate. There were over 2,000 attendees at XML DevCon New York; and although this was fewer than attended last summer, it was still a good-sized crowd. As always happens at such shows some classes were fuller than others, and some were quite packed.

There really wasn't much of a story here in the first place. To the extent that there was a story, it was hardly a significant one. I doubt that the XML-J Digital Edition would even have covered it, much less covered it in the way they did, except for one thing: Sys-Con is currently involved in a dispute with Camelot. I can't help but believe that Sys-Con is deliberately slanting their editorial content to support their position in the dispute. Worse yet, they're doing it without any acknowledgement to the typical reader that they are in such a dispute. Most subscribers would have seen this as merely another news story, without realizing the biases that lay behind it.

I don't mind Sys-Con using their newsletter to get out their message. However, doing so in such a deceptive and misleading fashion really bothers me. It makes me question whether I can trust anything in a Sys-Con publication. If they give a product a good review, is it only because the manufacturer bought an ad? If they give a product a bad review, is it only because they have a dispute with the manufacturer? These are real issues in journalism, but for the most part there's a presumption of innocence. Sys-Con has lost that presumption. This story makes it clear that the business and legal end of the company trump the editorial.

Before this whole mess erupted, I was debating whether to attend or speak at the Sys-Con JavaEdge show in New York and the XMLEdge show in Santa Clara this fall. I was leaning toward going to JavaEdge since it's local, and skipping XMLEdge since it's not. But after this mess, my mind has been made up. I don't want anything to do with Sys-Con, their conferences, or their magazines. I won't speak at their conferences. I won't write for their magazines. Whatever the merits of their original claim against Camelot, the PR campaign they have pursued in support of it is bad enough. The mixing of that PR campaign with their editorial content is indefensible. This is not a publisher or company I want to be involved with in any way.

Monday, April 16, 2001

The W3C Internationalization Working Group has released the Proposed Recommendation of Ruby Annotation. Ruby are short runs of text alongside the base text, typically used in Chinese and Japanese documents to indicate pronunciation or to provide a short annotation. This specification defines markup for ruby as an XHTML module.

Sunday, April 15, 2001

ElfData's released XML Editor, a $70 payware XML editor (what else?) for the Mac that supports DTD validation. From the screenshots it looks like another tree-based editor. (When are developers going to realize that this is not the right user interface for an XML editor? Have any of them done any user testing? They all seem to be competing with each other based on how fast they can race in exactly the wrong direction. A good XML editor should hide the structure of XML documents, not expose it.) XML Editor is $50 through April 25.

Saturday, April 14, 2001

Version 0.2.1 of the open source XSLT Standard Library has been released. This is a collection of commonly-used templates written in pure XSLT 1.0. Changes since v0.1 include:

  • In the String module, capitalise template now capitalises all words in a string. There's also a new substring-before-first template.
  • A Node module with xpath and type templates
  • An Example module that acts as a template for new modules.
  • More documentation

dom4j is yet another open source Java library for working with XML. The current version is 0.2. The main new features of this release are:

  • Improved XPath integration
  • Full JAXP, DOM, SAX and XSLT integration
  • Better I/O support
  • A SAXValidator for validating documents
  • A cleaner API

It looks interesting. The obvious comparison is with JDOM. It uses interfaces rather than classes (a minus for most developers) but does have Node and Text interfaces which JDOM does not, and which I need in my work. However, the parent relationship for nodes is only optional, which strikes me as a big minus. On the other hand, there's integrated XPath support which I consider a big plus. JDOM may get this in version 1.1. These are just the results of a cursory first glance at the API docs. A more detailed look will have to wait till a little time opens up. Still, open source competition is a good thing.


Revision 0.2 Beta 3 of xslide, an Emacs major mode for editing XSL stylesheets, has been released. Features include:

  • XSL customization group for setting some variables
  • Initial stylesheet inserted into empty XSL buffers
  • "Template" menu for jumping to template rules, named templates, key declarations, and attribute-set declarations in the buffer
  • `xsl-process' function that runs an XSL processor and collects the output
  • Predefined command line templates and error regexps for Java and Windows executable versions of both XT and Saxon
  • Font lock highlighting so that the important information stands out
  • `xsl-complete' function for inserting element and attribute names
  • `xsl-insert-tag' function for inserting matching start- and end-tags
  • Automatic completion of end-tags
  • Automatic indenting of elements with user-definable indentation step
  • Comprehensive abbreviations table to further ease typing.

Friday, April 13, 2001

The W3C DOM Working Group has posted a new public working draft of Document Object Model (DOM) Level 3 Events Specification. This specification defines a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. DOM Events Level 3 builds on DOM Events Level 2. It's defined in IDL with bindings for Java and ECMAScript.

Thursday, April 12, 2001

I have a sudden need for a vector format (e.g. EPS) drawing of the Mozilla dinosaur. If anyone knows where I can find such a thing please let me know. Thanks!


The W3C HTML Working Group has released the official recommendation of Modularization of XHTML. This document "specifies an abstract modularization of XHTML and an implementation of the abstraction using XML Document Type Definitions (DTDs). This modularization provides a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms." In other words, it lets you create subsets of XHTML that leave out pieces you don't like such as tables or frames while adding new pieces you do need such as Scalable Vector Graphics (SVG) or MathML.

Wednesday, April 11, 2001

I've posted the notes from my talks at Software Development 2001 West this week, including:

I've given variations of these at many other conferences over the last year. The biggest change at this show was updating the schemas talk to cover the March 30 Proposed Recommendation of XML Schema.

Saturday, April 7, 2001

Sun's released version 1.1 of the Java API for XML Processing (JAXP) specification. This essentially describes DOM2 core, SAX2, TRAX, and a few factory classes for finding a parser and bootstrapping new documents. This may become a standard part of the Java class library in the near future.


The W3C HTML Working Group has advanced XHTML 1.1 - Module-based XHTML to proposed recommendation. This specification defines a new XHTML document type based on the module framework and modules defined in Modularization of XHTML. XHTML 1.1 is intended as a "forward-looking document type cleanly separated from the deprecated, legacy functionality of HTML 4 [HTML4] that was brought forward into the XHTML 1.0 [XHTML1] document types." Changes since XHTML 1.0 strict include:

  • The xml:lang attribute replaces the lang attribute.
  • The name attribute of the a and map elements has been replaced by the id attribute.
  • The ruby collection of elements has been added.

The installed base of web browsers is not ready for these changes. HTML 3.2 and earlier is going to be with us for a long time to come.

Friday, April 6, 2001

I'm leaving tomorrow (Saturday) for the Software Development West show in San Jose. I should have at least occasional Internet access while I'm at the show, but updates may still be a little slow for the next week.


Mozilla 0.8.1 has been released for the Mac, Linux, and Windows. This browser supports XML, CSS, and simple XLinks. I plan to use it for most of my presentations next week at the SD West show. Mozilla is open source.


IBM has ported Mozilla to OS/2 under the name "IBM Web Browser". This port includes a spell checker and Flash plugin. While the browser is a fee based product, IBM has also released an open source version which is available at mozilla.org.


Windows and Linux betas of DocZilla have also been released. DocZilla is a component add-on for Mozilla based browsers which includes an XML and SGML Parser, a DTD Parser, and support for HyTime links and CALS tables.


Also this week, Netscape released Communicator 4.7.7; but this release has no real XML support, just some assorted bug fixes.

Thursday, April 5, 2001

The W3C CSS Working Group has posted the first public working draft of Media Queries. According to the draft:

HTML4 and CSS2 currently support media-dependent style sheets tailored for different media types. For example, a document may use sans-serif fonts when displayed on a screen and serif fonts when printed. "Screen" and "print" are two of the media types that have been defined. To describe in more detail what type of devices a style sheet applies to, this document proposes media queries.

A media query consists of a media type and one or more expressions to limit the scope of a certain style sheet. Among the proposed media features that can be used in expressions are "width", "height", and "color". By using media queries, content presentations can be tailored to a range of devices without changing the content itself.


The W3C XForms Working Group has posted a new draft of the XForms Requirements. This specification describes requirements for the next generation of Web forms to replace today's HTML forms.


Henry S. Thompson of the University of Edinburgh has updated the XSV schema validator to support the latest Proposed Recommendation version of the W3C XML Schema Language.

Thompson has also released XSU, an online XSLT-based tool which upgrades XML Schema documents from the 2000/10/XMLSchema to the 2001/XMLSchema namespace, implementing all the changes from the one to the other.


Jeni Tennison's announced a new website and mailing list for the EXSLT initiative. EXSLT is an open community initiative to standardise and document XSLT extension functions and elements. The extensions are broken down into a number of modules including Common, Math, Sets and Functions. One aim of EXSLT is to get the implementers of XSLT processors to standardise the functions that they make available, so that stylesheets can be more portable.

Wednesday, April 4, 2001

Alex "Achtung" has localized the FOP XSL-FO-to-PDF converter to support Russian. This is actually quite a big job, involving fixing some bugs with Cyrillic encodings and adding a dozen embedded Cyrillic fonts and a Russian hyphenation engine.


Version 0.1 of the XSLT Standard Library has been released. This is an open source (LGPL) collection of commonly-used templates written in pure XSLT 1.0. This initial release seeks "to promote the library, establish the engineering standards for the library and also acts as a Call For Participation. Anyone who has useful XSLT templates and feels that they may be of use to a wide range of XSLT developers and applications is invited to submit their templates for inclusion in the library." There are three mailing lists for the project:

  • xsltsl-users@lists.sourceforge.net: Discussion of the use of xsltsl.
  • xsltsl-devel@lists.sourceforge.net: Discussion of the development of xsltsl.
  • xsltsl-announce@lists.sourceforge.net: Project announcements.

XML Cooktop 2.200 is a free XML development environment for Windows that allows you to write and test style sheets, XML documents, DTDs, and XPATHs. Version 2.200 improves the installer, asks before overwriting file associations, and supports more XSLT processors.

Tuesday, April 3, 2001

Microsoft has released a service pack for their MSXML 3.0 XML parser/XSLT processor that fixes assorted bugs.


The W3C Schema Working Group has revised the XML Schema Language Proposed Recommendations. The only notable change in this draft is that the number type has reverted to its original name of decimal.


The Apache XML project has released version 0.18.1 of FOP, the open source XSL-FO-to-PDF translator written in Java. This release adds support for the start and end regions, fo:list-item, fo:table-header,

Monday, April 2, 2001

Microsoft and Verisign, two companies with deservedly poor reputations for designing and managing secure systems, as well as webMethods, have submitted a note to the W3C on a proposed XML Key Management Specification (XKMS). According to the abstract,

This document specifies protocols for distributing and registering public keys, suitable for use in conjunction with the proposed standard for XML Signature [XML-SIG] developed by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) and an anticipated companion standard for XML encryption. The XML Key Management Specification (XKMS) comprises two parts -- the XML Key Information Service Specification (X-KISS) and the XML Key Registration Service Specification (X-KRSS).

I haven't had time to read through the entire specification, but technical merit aside, I think it should be rejected on intellectual property (IP) grounds alone. The submitters seem to want to make some unspecified IP claims on the technologies described in the spec, and are not willing to open up those technologies for free and unconstrained use by anyone who wants to use them.


James Tauber's released PyTREX 0.7.0, a clean-room implementation of Tree Regular Expressions for XML (TREX) written in Python. This is the first feature complete release.

Sunday, April 1, 2001

webMethods has written a schema for XSLT 1.0 in the W3C XML Schema Language Candidate Recommendation syntax.


IBM's alphaWorks has updated their XML Diff and Merge Tool with bug fixes and a more customizable comparison function. Element comparison works better when sibling nodes have different positions under their parent. Also, one can specify which attributes to look at when determining matching nodes.

Saturday, March 31, 2001

Unicode 3.1 has been released. The primary new feature of Unicode 3.1 is the addition of 44,946 new encoded characters. Together with the 49,194 already existing characters in Unicode 3.0, that comes to a grand total of 94,140 encoded characters in Unicode 3.1. The new characters cover several historic scripts, several sets of symbols, and a very large collection of additional CJK ideographs. Unicode 3.1 also features new character properties, and assignments of property values for the much expanded repertoire of characters.

The XML 1.0 specification anticipated this development and is ready to handle these characters. However many parsers, including all those written in Java, depend on APIs that are not Unicode 3.1 ready. For instance, any parser or API that uses a Java String to store XML text will fail when confronted with the new characters.
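
To make the problem concrete, here is a minimal Python sketch (not part of the original announcement) of how one of the new Plane 1 characters looks in UTF-16. The choice of U+10400, a Deseret letter, is only an illustration. The point is that it occupies two 16-bit code units, so a type that holds exactly one 16-bit unit, as Java's char does, cannot hold it alone.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    raw = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


bmp_char = "\u05D9"          # HEBREW LETTER YOD, fits in one 16-bit unit
plane1_char = "\U00010400"   # DESERET CAPITAL LETTER LONG I, new in Unicode 3.1

print([hex(u) for u in utf16_code_units(bmp_char)])     # one code unit
print([hex(u) for u in utf16_code_units(plane1_char)])  # two code units (a surrogate pair)
```

Code that counts or indexes characters by 16-bit unit will see the second character as two items, which is where the trouble starts.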

James Kass has posted Code2001, a freeware TrueType font covering some of the scripts in the new Plane 1, including Old Persian Cuneiform, Deseret, Tengwar, Cirth, Old Italic, and Gothic. It works in WordPad in Windows 2000, but apparently not yet in IE.


There's an interesting article from Clay Shirky in the September Business 2.0, entitled XML: No Magic Problem Solver that I missed when it was originally published, so I wanted to comment on it now. In essence, the article points out something I've been telling people for a long time: XML does not relieve you of the necessity to talk to the people you want to exchange data with to agree on the format you'll use to exchange it. However, Shirky gets one thing in this article very, very wrong. He actually claims that XML makes it harder to do this, not easier, and that is completely wrong. Specifically what he says is,

Sad XML Truth No. 1: Designing a good format using XML still requires human intelligence. The people selling XML as a tool that makes life easy are deluding their customers--good XML takes more work because it requires a rigorous description of the problem to be solved, and its much vaunted extensibility only works if the basic framework is sound.
 
Sad XML Truth No. 2: XML does not mean less pain. It does not remove the pain of having to describe your data; it simply front-loads the pain where it's easier to see and deal with. The payoff only comes if XML is rolled out carefully enough at the start to lessen day-to-day difficulties once the system is up and running. Businesses that use XML thoughtlessly will face all of the upfront trouble of implementing XML, plus all of the day-to-day annoyances that result from improperly described data.

In fact, XML does mean less pain and it does require less work. Shirky's mistake is assuming that we could somehow live without rigorous descriptions and sound frameworks before XML existed. In fact, they were just as necessary then as now. It's just that XML makes it a lot easier to build rigorous descriptions and sound frameworks than when you had to start from scratch with ASCII or (worse yet) raw bytes.

The reason XML is so widely used is that the designers of XML did a wonderful job of solving exactly those data exchange problems that nobody cared about and only those problems that nobody cared about. Let me explain. No one really cares whether lines end with a carriage return or a line feed or both. No one really cares whether fields are separated by a tab or a comma. No one really cares whether Unicode is encoded in UTF-8 or UTF-16. All the common alternatives are equally good. Prior to XML, however, before you could exchange data with someone, you needed to agree on answers to these and many similar questions, none of which had anything to do with the business case for exchanging the data. Furthermore, whatever underlying format you eventually agreed on (tab-delimited UTF-16 with carriage return line feed pairs separating lines or something else), you then had to write your own tools to process that data.

Once XML enters the picture, however, all these questions are answered. You don't care what line separator is used. You don't care what encoding is used. You agree that fields are separated by tags. And you use any of a variety of free tools to parse the data. You still need to spend time talking to the organizations you're exchanging data with to agree on what goes in the fields, what the fields are named, and what the data means; but you had to do all that anyway before XML, and XML doesn't make doing it any harder. In fact XML makes it all considerably easier. As soon as you've said, "The underlying format of our application is XML," you've answered a lot of questions that must be answered although no one really cares what the answer is. You can then move on to questions like "What is the valid range of prices?" which are much more important to the business. XML isn't a magic problem solver for all the problems of information exchange, but it pretty magically solves 10-20% of any given problem, and that's a significant improvement.
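
As a rough illustration of that point, the Python sketch below parses the same made-up order record twice with the standard library's xml.etree.ElementTree: once as UTF-8 with line feeds, and once as UTF-16 with carriage return/line feed pairs. The element names and values are invented for the example. Neither the encoding nor the line ending has to be negotiated in advance; the parser reports the same fields either way.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the vocabulary is made up for this example.
RECORD = (
    '<?xml version="1.0" encoding="{enc}"?>\n'
    "<order>\n"
    "  <sku>A-100</sku>\n"
    "  <price>19.95</price>\n"
    "</order>\n"
)


def fields(document_bytes):
    """Parse an order document and return its (name, value) pairs."""
    root = ET.fromstring(document_bytes)
    return [(child.tag, child.text) for child in root]


# Same data, two different low-level representations.
utf8_lf = RECORD.format(enc="UTF-8").encode("utf-8")
utf16_crlf = RECORD.format(enc="UTF-16").replace("\n", "\r\n").encode("utf-16")

print(fields(utf8_lf))
print(fields(utf16_crlf))
```

What the sketch cannot do is decide what a sku is or which prices are valid. That part still takes a conversation.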

Friday, March 30, 2001

The IETF XML Digital Signatures Working Group has released RFC 3075 on XML-Signature Syntax and Processing. XML Signatures provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere. This is now a Proposed Standard Protocol. It is being jointly developed by the IETF and the W3C. Presumably the W3C version of this draft will be released soon. (The last W3C draft was a candidate recommendation in October of last year.)

Thursday, March 29, 2001

The W3C XML Protocol Working Group has posted the second public working draft of XML Protocol (XMLP) Requirements. For those who haven't been following this, XML-RPC begat SOAP 1.0 which begat SOAP 1.1 which is currently pregnant with XML Protocol. Each successive generation adds some complexity while filling numerous holes and gaps in earlier generations. However, they all do more or less the same thing: allow you to make a method call to a remote system by sending an XML document over HTTP.

I'm at the O'Reilly Java conference right now, and XML Protocol seems to be the sleeper subject; that is, the topic that's not actually discussed in any sessions but is whispered about a lot in the hallways and at lunch, by people who aren't quite sure what to make of it, but suspect it's going to be very important.

During the course of this discussion, I had a revelation about XML-RPC/SOAP/XML Protocol: These are all nothing like remote procedure calls. These technologies have very little if anything to do with RPC/RMI/CORBA/DCOM/etc. They are neither players nor competitors in that space. The name of the first iteration of this technology, XML-RPC, was misleading in that respect. Instead, what this really is, is CGI POST. However, instead of sending the web server an x-www-form-urlencoded query string, you send the server an XML document. This is not a simplification of RPC. It's a complexification of CGI! I even hinted at this in Java Network Programming when I noted that there was no particular reason the body of a POST request had to contain an x-www-form-urlencoded query string, and that it indeed could contain something else as long as you controlled both the client and the server. I didn't follow up on that then or imagine that the body might be an XML document; but now that I have realized that, SOAP et al. make a lot more sense, and seem a lot easier to understand and explain.
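The "CGI POST with an XML body" view can be sketched in a few lines of Python. The URL, the order document, and the helper name build_xml_post are all hypothetical; the sketch only constructs the request object and does not contact any server.

```python
import urllib.request


def build_xml_post(url: str, xml_document: str) -> urllib.request.Request:
    """Build an HTTP POST whose body is an XML document, not a query string."""
    body = xml_document.encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # An ordinary form would use application/x-www-form-urlencoded here.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_xml_post(
    "http://www.example.com/cgi-bin/order",
    "<order><item sku='X1' quantity='2'/></order>",
)
```

Passing the request to urllib.request.urlopen would send it. The only difference from a form submission is the content type and what the server-side program expects to find in the body.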


The IETF/W3C XML Signature Working Group has released Canonical XML 1.0. Canonical XML defines an algorithm for converting an XML document to a sequence of bytes in such a fashion that two documents with the same byte-for-byte canonical form may be considered to be in some sense the same. The canonical form normalizes a number of insignificant details like attribute order, whether single or double quotes are used to surround attribute values, white space inside tags, and so forth. Changes in the final release of this specification are primarily editorial. There's only one very minor substantive change dealing with the Unicode decomposition of the Hebrew letter YOD with HIRIQ.
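The idea is easy to try out. The sketch below uses the canonicalize function in Python's xml.etree.ElementTree (available from Python 3.8). Note the assumption: that function implements the later C14N 2.0 algorithm rather than Canonical XML 1.0, but for simple documents like these the normalizations are the same kind. The price documents are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Two spellings of the same element: different attribute order, different
# quote characters, extra white space in the tag, empty-element syntax.
doc_a = "<price currency='USD' amount=\"5\"/>"
doc_b = '<price amount="5"   currency="USD"></price>'

# A document that really is different (the amount has changed).
doc_c = '<price amount="6" currency="USD"/>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
canon_c = canonicalize(doc_c)

print(canon_a == canon_b)  # insignificant differences disappear
print(canon_a == canon_c)  # a real difference survives
```

This is what makes canonicalization useful for digital signatures: the signature is computed over the canonical bytes, so harmless reserialization does not invalidate it.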


The W3C XML Core Working Group has published a second last call working draft of the XML Infoset. The Infoset tries to "provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document". The big changes in this draft are the elimination of

  • Internal Entity Information Items
  • External Entity Information Items
  • Entity Start Marker Information Items
  • Entity End Marker Information Items
  • CDATA Start Marker Information Items
  • CDATA End Marker Information Items

You now just see the post-parsed character data with no indication of how any individual character was encoded in the document.


The W3C HTML Working Group has published the first public working draft of Modularization of XHTML in XML Schema. This provides a schema for XHTML 1.1.

Wednesday, March 28, 2001

IBM developerWorks has posted an MP3 of an XML and Web Services panel I participated in at the International Conference for Java Development in New York a few weeks ago. You can listen to me dis WML. Other panelists included Alex Chaffee, David Megginson, Gerry Seidman, and Jay Walters.


Wolfgang Meier's released eXist, a "repository and retrieval engine for XML documents build on top of an relational database". Currently eXist supports MySQL and Oracle. eXist provides structured retrieval of arbitrary XML documents with support for fulltext search. This release (0.3) adds Oracle support, speed-ups, a redesigned XPath parser, increased thread safety, and connection-pooling.

Tuesday, March 27, 2001

I've posted the updated notes from my tutorials yesterday at the O'Reilly Conference on Java:

These are essentially the same talks I gave a couple of weeks ago at XMLOne in Austin.


The W3C XML Schema Working Group has published the first public working draft of XML Schema: Formal Description. According to their draft,

This is a formal, declarative system for describing and naming XML Schema information, specifying XML instance type information, and validating instances against schemas. The goals of the formalization are to:

  • Provide a semantic framework for software systems that use the W3C XML Schema specification, such as the W3C XML Query Algebra.
  • Specify names for all components of an XML Schema, so that they can be uniquely identified by URIs. Such unique identifiers may be useful to XML Query, RDF, and topic maps, among others.
  • Formally define validation at a declarative level.
  • Define the mapping from the current XML Schema syntax onto the structures described here, as well as the mapping between the XML Schema component model and our component model.

Version 0.6.5 of the Python/XML distribution has been released. This is more or less a beta release. The Python/XML distribution contains the basic tools required for processing XML data using Python. The distribution includes parsers and standard interfaces such as SAX and DOM, along with various other useful modules. Python/XML currently contains:

  • Jack Jansen's Pyexpat XML parser
  • Lars Marius Garshol's xmlproc XML parser
  • Fredrik Lundh's sgmlop XML parser
  • Lars Marius Garshol's SAX interface
  • Paul Prescod's minidom DOM implementation
  • Fourthought's 4DOM
  • Various utility modules and functions
  • Documentation and example programs

Changes in this version include:

  • setup supports two command line options, --with-libexpat and --ldflags to specify an alternative expat installation
  • A new xml.utils.boolean type distinguishes boolean from integer values.
  • xmlproc_parse and xmlproc_val scripts provide a command-line interface for xmlproc
  • WDDX marshalling now supports "strict" and "loose" modes of operation.
  • minidom supports the DocumentFragment interface, and correctly sets the ownerDocument property.
  • A SAX exception now retrieves line number information when it is created, not when it is printed.
  • Invoking sax2lib.ValidatingReaderFactory.make_parser creates a reader object that is already set to validating mode.
  • A number of callback errors in the SAX2 xmlproc driver have been corrected.

James Tauber's released PyTREX 0.6.0, a clean-room implementation of Tree Regular Expressions for XML (TREX) written in Python.


Robert C. Lyons has invented the Turing Machine Markup Language (TMML), an XML application for describing programs for Turing machines. He's also written a TMML interpreter in XSLT that executes Turing machine programs described in TMML documents, thus proving that XSLT is indeed Turing complete.

Sunday, March 25, 2001

The Apache XML Project has released Xerces-J 1.3.1, an open source XML parser written in Java that supports DOM2, SAX2, and JAXP. This release supports more of the Candidate Recommendation of the W3C XML Schema Language, but does not yet support the more recent Proposed Recommendation. A few bugs were also fixed.

The Apache XML Project has also released Xalan-Java 2.0.1, an open source XSLT processor written in Java. This adds several new command-line parameters, two new streamlined sample servlets, and fixes assorted bugs. It supports XSLT 1.0.

Saturday, March 24, 2001

I'm back from XMLOne London. While there I talked about schemas and XLinks and XPointers. I'll have the revised notes up a little later today. The schemas talk in particular featured a lot of new material I hadn't previously talked about including facets and the changes in the March 16 Proposed Recommendations.

I apologize to those of you who wrote in to say you were worried about my absence. I expected the conference to have much better Internet connectivity than it actually did. In fact, without a local London dialup there was effectively no way for me to update my web sites. I could check my email, but only barely. I was very disappointed that the conference did not provide a place for me to plug my laptop into so I could work while at the conference. In the future, I'm going to be cutting down on the conference appearances I make, and one criterion for choosing which ones to attend is going to be the Internet access they provide.

Tomorrow I leave for the O'Reilly Java Conference in Santa Clara. Last year the O'Reilly conference had much better Internet connectivity than any other conference I've ever been at, so I should be able to maintain Cafe con Leche and Cafe au Lait while there.


IBM's alphaWorks has released XSLerator, a tool written in Java to generate XSLT style sheets from mappings defined using a visual interface. XSLerator supports mappings with extended conversion functions including iterations, conditions, joins, variables, and XPath functions. Some knowledge of XSLT is required.

Wednesday, March 21, 2001

The W3C Schema Working Group has elevated their XML Schema Language to Proposed Recommendation. Changes since the candidate recommendation are fairly minor and include:

  • uriReference is now anyURI
  • binary has been replaced by hexBinary and base64Binary
  • Most of the date types like year are now prefixed with a g as in gYear
  • The namespace URIs are now http://www.w3.org/2001/XMLSchema and http://www.w3.org/2001/XMLSchema-instance
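The split of binary into hexBinary and base64Binary just names the two usual text encodings of raw bytes. As a rough illustration (the sample bytes are arbitrary), here is how the same three bytes look in each lexical form, using only the Python standard library:

```python
import base64
import binascii

raw = b"XML"  # three arbitrary bytes: 0x58 0x4D 0x4C

# hexBinary: two hexadecimal digits per byte.
hex_form = binascii.hexlify(raw).decode("ascii").upper()

# base64Binary: four characters for every three bytes.
b64_form = base64.b64encode(raw).decode("ascii")

print(hex_form)  # 584D4C
print(b64_form)  # WE1M
```

An element declared as hexBinary would carry the first string as its content, and one declared as base64Binary the second. The underlying value is identical.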
Friday, March 16, 2001

IBM's submitted a note with no official status to the W3C requesting that the Unicode C1 control character #x85 NEL (new line) be redefined as white space and a line ending character in XML documents. It is currently allowed in XML documents but cannot appear in tags and a few other places where carriage returns and line feeds can appear. This is apparently causing problems for XML parsers on OS/390 where ASCII FTP transfers change carriage returns and linefeeds to a NEL.

This is a horrible idea, and I hope the W3C rejects it firmly. Adopting it would break essentially every XML parser on the planet. The proper fix for this problem is for IBM to start using real UTF-8 and ASCII on their systems, rather than asking the rest of the world to adapt to their brain damaged, non-standard software. Software that transfers XML files to OS/390 (or any other) systems should not be rewriting it without an understanding of the syntax of XML documents. Translations that treat XML documents as plain text files are bound to cause problems. Any software that's not XML-aware should not attempt to change XML documents.
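The current rule is easy to check with a small experiment. The following sketch uses the expat-based parser in Python's xml.etree.ElementTree; the helper name is_well_formed and the sample documents are made up for illustration. It shows NEL accepted as ordinary character data but rejected where XML requires white space inside a tag.

```python
import xml.etree.ElementTree as ET

NEL = "\u0085"  # Unicode C1 control character NEXT LINE


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# NEL inside element content is a legal character.
in_content = f"<line>first{NEL}second</line>"

# NEL used in place of white space between the name and an attribute.
in_tag = f"<line{NEL}id='1'>text</line>"

# The same tag with a real line feed, for comparison.
with_linefeed = "<line\nid='1'>text</line>"

print(is_well_formed(in_content))     # True
print(is_well_formed(in_tag))         # False
print(is_well_formed(with_linefeed))  # True
```

A file whose line feeds inside tags have been rewritten to NEL in transit is therefore no longer well-formed, which is the breakage IBM's note is trying to paper over.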

Thursday, March 15, 2001

The W3C Device Independence Working Group has published a last call working draft of Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies. This is an RDF vocabulary for describing user agent (browser) and proxy capabilities and preferences. Topics include:

  • The structure of client capability and preference descriptions
  • The structure of proxy behavior description
  • The use of RDF classes to distinguish different elements of a profile, so that a schema-aware RDF processor can handle CC/PP profiles embedded in other XML document types.

The CC/PP vocabulary uses URIs to refer to specific capabilities and preferences. It covers:

  • The types of values to which CC/PP attributes may refer
  • How to introduce new vocabularies
  • A client vocabulary covering print and display capabilities
  • A survey of existing work from which new vocabularies may be derived.
Wednesday, March 14, 2001

IBM's alphaWorks has updated their XML Diff and Merge Tool to support Java 1.3, fix a few bugs, and add support for id and name attributes in merely well-formed documents that don't have DTDs and thus can't declare attributes to have ID type.

Tuesday, March 13, 2001

I've posted the complete examples from the upcoming second edition of the XML Bible. As well as updating all the old materials, new chapters in this edition cover schemas, SVG, XHTML, and WML. The finished book is in page proofs now, and should be on store shelves in a couple of months.

Monday, March 12, 2001

One of the problems with attending six conferences in a little over six weeks is that it becomes very difficult to keep up with the more complex developments in the XML world. While I was busy poring over the XPath 2.0, XSLT 2.0, and XQuery working drafts so I could discuss them in last week's Cutting Edge XML Programming session, several other people were busy trying to extend XSLT 1.0 in a different direction to allow extension functions to be written in pure XSLT. I haven't fully digested all this work yet, but it looks interesting.

The first such effort is Jeni Tennison's EXSL, User-Defined Extension Functions in XSLT. EXSL defines extension elements and functions to support user definition of extension functions using XSLT. Uche Ogbuji's written an implementation for 4XSLT. The current version is 0.1.

David Rosenborg of Pantor Engineering AB has counter-proposed Functional XPath (FXPath). He's already produced a sample implementation for Michael Kay's SAXON. The current version is 0.3.

Sunday, March 11, 2001

The W3C CSS Working Group has published the first public working draft of CSS3 module: Color. This draft brings together in one place the color parts of previous W3C Recommendations including HTML 4 and Cascading Style Sheets (CSS) levels 1 and 2. It also adds several new CSS3 properties for opacity, ICC color profiles, and rendering intent of image content.


The same working group has also updated the Syntax of CSS rules in HTML's 'style' attribute working draft. This document describes the history, grammar, cascading order and profiles for CSS fragments in the style attribute.

Saturday, March 10, 2001

I've posted the updated notes from all my talks at XMLOne in Austin last week. These include:

  • XML Fundamentals, a three hour introduction to XML and related technologies

  • Processing XML with SAX and DOM, a three hour introduction to writing Java programs to manipulate XML documents

  • XML Hypertext, a two hour discussion of XLinks, XPointers, XML base, and XInclude

  • XSLT, a 90-minute introduction to writing Java programs to manipulate XML documents

  • Cutting Edge XML Programming, a three hour seminar covering more or less whatever seems to be newest and hottest on the day I give it. This time I covered the XML Infoset, DOM Level 3, and XQuery.

Friday, March 9, 2001

O'Reilly's stopping development of the WebSite web server for Windows, as well as the companion WebBoard product. They're looking for buyers for these products, but personally I don't see that they have much place in a world dominated by the open source Apache web server, which O'Reilly's done not a little to promote and even uses on many of its own web sites. Next target for Apache: Microsoft's IIS.

Meanwhile, here's a suggestion for Tim: If you can't make a go of WebSite and WebBoard with a payware business model, why not open source them? I doubt WebSite could compete effectively with Apache, even as open source, but there are a lot of sites (like mine) that would be very interested in an open source WebBoard.


Johannes Döbler has released jd.xslt, an open source XSLT 1.1 processor written in Java.

Wednesday, March 7, 2001

James Tauber's posted the first alpha of PyTREX, a Python implementation of James Clark's Tree Regular Expressions for XML (TREX) schema language. This release is numbered 0.5.0.

Monday, March 5, 2001

Jonathan Borden's updated the draft specification for the Resource Directory Description Language (RDDL, pronounced "riddle"). According to Borden,

This version contains only relatively minor edits and clarifies the URL persistence policy. The URLs for the well known natures and purposes were a mix of http://www.rddl.org/natures and http://www.rddl.org/natures.html. Now only http://www.rddl.org/natures and http://www.rddl.org/purposes are specified and should be used as these will be subdivided in subdirectories in the future.

He's also posted an RDDL directory of other RDDL documents. Finally, Borden's also begun work on a RDDLClassLoader for Java. This extends java.net.URLClassLoader. The constructor reads an RDDL directory to get a list of Java CLASSPATH URLs having a specified purpose such as "xslt-extension".

Sunday, March 4, 2001

The W3C Synchronized Multimedia Working Group has posted a new working draft of the Synchronized Multimedia Integration Language (SMIL 2.0) Specification. SMIL 2.0 has two goals:

  • Define an XML-based language that allows authors to write interactive multimedia presentations. Using SMIL 2.0, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen.
  • Allow SMIL to be used in other XML-based languages such as XHTML and SVG.

SMIL 2.0 is modularized, much as XHTML 1.1 is. One of the big additions to this release of the spec is a full complement of modules defined in XML Schemas, rather than just the DTD modules of previous working drafts. To the best of my knowledge, this is the first non-schema spec to make real use of the W3C XML Schema Language.


I'm leaving today for the XMLOne show in Austin. Consequently, updates may be a little spotty here until I return on Friday. However, assuming I can get reasonable net access at the show, I'll try to keep things fresh here.

Saturday, March 3, 2001

The Apache XML Project has released version 1.1 of Xalan-C++, an XSLT processor written in C++. This release offers a "greatly simplified C++ API and C API for performing standard XSL transformations" as well as fixing bugs and upgrading performance. Binaries are available for Windows, Red Hat Linux 6.1, AIX 4.3, HP-UX 11, and Solaris 2.6.

Friday, March 2, 2001

The Unicode Consortium has revised the draft technical report covering Unicode 3.1. The main change in Unicode 3.1 is the addition of many new characters, including characters with code points greater than 65,535 for the first time. Unicode will no longer effectively be a two-byte character set.
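What "no longer a two-byte character set" means in practice can be seen with a single character. This sketch picks U+1D11E MUSICAL SYMBOL G CLEF, one of the characters beyond 65,535, and shows how it is encoded; the choice of character is just an example.

```python
g_clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, code point above 0xFFFF

# In UTF-16 the character no longer fits in one 16-bit unit and is written
# as a surrogate pair (two 16-bit units, four bytes).
utf16 = g_clef.encode("utf-16-be")

# In UTF-8 it takes four bytes.
utf8 = g_clef.encode("utf-8")

print(hex(ord(g_clef)))  # 0x1d11e
print(utf16.hex())       # d834dd1e
print(utf8.hex())        # f09d849e
```

Software that assumes one character always equals one 16-bit unit will miscount or split such characters, so parsers and APIs that work in UTF-16 need to be aware of surrogate pairs.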

Thursday, March 1, 2001

Michael Kay's released version 6.2.1 of the open source SAXON XSLT processor written in Java. SAXON supports all of XSLT 1.0 and features preliminary support for parts of XSLT 1.1 as well as numerous extension functions. This release fixes a number of bugs.


RenderX has released version 2.2.1 of the payware XEP XSL-FO-to-PDF converter. New features in this release include:

  • Running headers via fo:marker and fo:retrieve-marker
  • Top-floats (fo:float[@float="before"])
  • xsl-footnote-separator
  • Backgrounds in tables
  • Improved support for fo:leader
  • Hyphenation
  • Less memory needed
  • Assorted bug fixes

An evaluation version with a 10-page limit is available.

Wednesday, February 28, 2001

The W3C Platform for Privacy Preferences (P3P) Preference Interchange Language Working Group has published a new working draft of A P3P Preference Exchange Language 1.0 (APPEL1.0). APPEL is "a language for describing collections of preferences regarding P3P policies between P3P agents. Using this language, a user can express her preferences in a set of preference-rules (called a ruleset), which can then be used by her user agent to make automated or semi-automated decisions regarding the acceptability of machine-readable privacy policies from P3P enabled Web sites." APPEL is expressed in XML.


The W3C CC/PP Working Group has published a revised working draft of Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies. A CC/PP profile is a description of device capabilities and user preferences that can be used to tailor content for that device. CC/PP documents are written in RDF.

Tuesday, February 27, 2001

Fourthought, Inc. has released version 0.10.2 of 4Suite, a collection of open source tools for processing XML in Python. It provides support for XML parsing, several transient and persistent DOM implementations, XPath expressions, XPointer, XSLT transforms, XLink, RDF and ODMG object databases. This is mostly a bug-fix and general clean-up release. There are new search-re() and base-uri() XSLT extension functions.

Fourthought, Inc. has also released version 0.10.2 of 4Suite Server, an open source XML data server. It features an XML data repository, a rules-based engine, XSLT transforms, XPath and RDF-based indexing and query, and XLink resolution. Again this release focuses mostly on bug fixes and code and documentation clean-up as well as usability improvements and more test cases.


The O'Reilly Network's published my latest article, RDDL Me This: What Does a Namespace URL Locate?. This article explores the Resource Directory Description Language.

Monday, February 26, 2001

The W3C has released Mathematical Markup Language 2.0 (MathML) as an official recommendation. Changes since MathML 1.0.1 include:

  • XML namespace support

  • New mathematics style attributes on token elements: mathvariant, mathsize, mathweight, and mathcolor

  • New presentation elements: mglyph, menclose and mlabeledtr

  • New content elements: domain, codomain, image, domainofapplication, arg, real, imaginary, lcm, floor, ceiling, equivalent, approx, divergence, grad, curl, laplacian, card, cartesianproduct, momentabout, vectorproduct, scalarproduct, outerproduct, integers, reals, rationals, naturalnumbers, complexes, primes, exponentiale, imaginaryi, notanumber, true, false, emptyset, pi, eulergamma, infinity, piecewise, piece and otherwise

Browser support is still lacking.


The W3C HTML Working Group has published the Proposed Recommendation of Modularization of XHTML. In my opinion, this is simply far too early for a proposed recommendation because a very key component, the schema implementations, is still completely omitted. This specification needs to be held back until schemas are finished. Review ends March 22, 2001.


During last week's XLinks and Namespaces seminars at XML DevCon, I gave my usual spiel about how "All the specifications are defined in terms of URIs, but only URLs are used in practice." Eventually, however, this will change with the advent of Uniform Resource Names (URNs). URNs depend on the concept of individual URN namespaces (not the same as XML namespaces!). The IETF URN Working Group has posted a new Internet draft on URN Namespace Definition Mechanisms defining URN namespaces and the mechanisms for establishing new ones.


IBM's alphaWorks has released XML for C++ 3.3.1, which is based on the Apache Xerces XML C++ Parser v1.3.0. This parser supports DOM2 and SAX2. This release adds bug fixes, speed ups, and AS400 iSeries binaries.

Sunday, February 25, 2001

I'm back from XMLDevCon London 2001, where a good time was had by all. I've posted the notes from my talks on XLinks and Namespaces. Tomorrow I'll be catching up with all the news that piled up while I was away.

Monday, February 19, 2001

XQuery, the W3C XML Query Language, is a functional language for extracting information from XML documents and databases. Think of it as XPath crossed with SQL. The W3C XML Query Working Group has just published five documents describing XQuery:

According to Jonathan Robie, "The first three documents should be reasonably accessible. The last two are more mathematical." I hope to talk about all of these in my Cutting Edge XML Programming talk at XMLOne Austin next month. I'll certainly cover them in my Advanced XML tutorial at SDExpo West in April.

Sunday, February 18, 2001

The Gnome Project has posted libxslt-0.2.0, the second beta release of this XSLT processor library for Linux written in C. It supports most of XSLT 1.0 except for

  • Extension elements and functions
  • Embedded stylesheets
  • document()
  • key()

XSLT-process 1.2 has been released. XSLT-process is an Emacs minor mode that allows you to invoke an XSLT processor of choice on a buffer, displaying the result in an additional buffer. Currently supported XSLT processors include Xalan 1.x, and Saxon 5.x and 6.x. Changes since 1.1 include:

  • Support for the TrAX interface with stylesheet caching
  • Compatible with GNU Emacs on Windows NT/2000
  • Changed the keyboard binding to "C-c C-x C-v"
Saturday, February 17, 2001

The W3C Forms Working Group has published a new working draft of XForms 1.0. XForms are a proposal for the next generation of HTML forms, with much broader support for platforms of varying capabilities such as desktop computers, television sets, personal digital assistants, cell phones, computer peripherals and even paper. XForms can collect information that meets the constraints of various schema data types including:

  • String
  • Boolean
  • Number
  • Date
  • Time
  • Duration
  • URI
  • Binary

In addition, it derives Currency and Monetary types from String. These can be further constrained with facets as in schemas. Another interesting and useful feature of XForms is a dynamic constraints language that enables you to define integrity constraints that act over multiple fields. For instance, "the total value of an order can be defined in terms of a computation over other values such as unit prices, quantities, discounts, and tax and shipping costs." The language used to do this is XPath. Time permitting, I'm going to try to whip up a few notes on this for my bleeding edge presentations at either XMLOne in Austin or SDExpo West in San Jose.


In a very unusual move, Sun Microsystems is seeking a patent office reexamination of one of their own patents, U.S. Patent No. 5,659,729, that may (or may not) affect XPointer. Prior art should be sent to 729-patent@east.sun.com for use in the reexamination and possible revocation of the patent. They are also seeking suggested changes to the licensing terms they've proposed for this patent. It sounds like Sun really wants to do the right thing here, so if you have any prior art, please help them out.

Friday, February 16, 2001

The W3C XSL Working Group has published working drafts of the requirements for XPath 2.0 and XSLT 2.0. Suggested requirements for XPath 2.0 include:

  • Must express data model in terms of the Infoset

  • Must provide common core syntax and semantics for XSLT 2.0 and XML Query 1.0

  • Must support explicit "for any" or "for all" comparison and equality semantics

  • Must add min() and max() functions

  • Any valid XPath 1.0 expression should also be a valid XPath 2.0 expression when operating in the absence of XML Schema type information

  • Should provide intersection and difference functions

  • Must loosen restrictions on location steps

  • Must provide a conditional expression (e.g. ternary ?: operator in Java and C)

  • Should support additional string functions, possibly including space padding, string replacement and conversion to upper or lower case

  • Must support string matching using regular expressions using the regexp syntax from schemas

  • Must add support for XML Schema primitive datatypes

  • Should add support for XML Schema: Structures

Suggested requirements for XSLT 2.0 include:

  • Must maintain backwards compatibility with XSLT 1.1

  • Should be able to match elements and attributes whose value is explicitly null

  • Should allow included documents to "Encapsulate" local stylesheets

  • Could support accessing infoset items for XML declaration

  • Could provide qualified name aware string functions

  • Could enable constructing a namespace with computed name

  • Could simplify resolving prefix conflicts in qname-valued attributes

  • Could support XHTML output method

  • Must allow matching on default namespace without explicit prefix

  • Must add date formatting functions

  • Must simplify accessing IDs and keys in other documents

  • Should provide function to absolutize relative URIs

  • Should include unparsed text from an external resource

  • Should allow authoring extension functions in XSLT

  • Should output character entity references instead of numeric character entities

  • Should construct entity reference by name

  • Should support Unicode string normalization

  • Should standardize extension element language bindings

  • Could improve efficiency of transformations on large documents

  • Could support reverse IDREF attributes

  • Could support case-insensitive comparisons

  • Could support lexicographic string comparisons

  • Could allow comparing nodes based on document order

  • Could improve support for unparsed entities

  • Could allow processing a node with the "next best matching" template

  • Could make coercions symmetric by allowing scalar to nodeset conversion

  • Must support XML schema

  • Must simplify constructing and copying typed content

  • Must support sorting nodes based on XML schema type

  • Could support scientific notation in number formatting

  • Could provide ability to detect whether "rich" schema information is available

  • Must simplify grouping

In addition, the following two are explicitly not goals:

  • Simplifying the ability to parse unstructured information to produce structured results.

  • Turning XSLT into a general-purpose programming language

I plan to talk about some of these proposals in my Cutting Edge XML Programming post-conference tutorial at the XMLOne conference in Austin next month.


Sun's released version 1.1 of JAXP, the Java API for XML Processing. This is essentially SAX2, DOM2 Core, TrAX, and a few factory classes for finding a parser.


Mozilla 0.8 has been released for the usual batch of platforms. Mozilla supports XML, CSS, and simple XLinks, among many other features. New features in this release include Find and Replace.

Thursday, February 15, 2001

Robert X. Cringely has retracted last week's claim, which I reported here, that Adobe has laid off the FrameMaker team. Apparently Adobe was quite demonstrative in their personal response to him that there was nothing to the story. It may have been based on an old Internet rumor. The furor over the alleged layoffs did bring one interesting fact to light that I was not aware of before. According to Bob Ducharme, FrameMaker is much better at exporting XML than at importing it. Maybe now the still-hard-at-work FrameMaker team can add this for the next release?

Wednesday, February 14, 2001

Bill LaForge has released version 4 of Quick. Quick is a data modeling system for Java that can generate and process XML by converting arbitrary object structures into trees of XML elements. Quick works with Java Beans and Bean Property Editors. Changes in this release include:

  • QJML now serves as both a Java Data Modeling language and as an XML binding schema.
  • Full Java inheritance is supported in the XML schema.
  • OCM, a component system for performing complex XML transformations.

Mike Brown and Jeni Tennison have released the ASCII XML Tree Viewer, an XSLT style sheet that shows the node structure of an XML document in the form of plain text "ASCII art".

Tuesday, February 13, 2001

The W3C XML Core Working Group has published the XML Fragment Interchange Candidate Recommendation. This spec attempts to address the question of what to do with pieces of XML documents that are not themselves complete well-formed XML documents and may be missing key components like entity declarations, encoding information, default attribute values, and namespace mappings. The Candidate Recommendation Phase is scheduled to end April 30, 2001.


The W3C DOM Working Group has published a new working draft of the Document Object Model (DOM) Level 3 Content Models and Load and Save Specification. The first part of this spec uses IDL to define interfaces for accessing grammars (DTDs and schemas) as object trees. The second part defines a platform independent means to parse and serialize XML documents. As far as I know, no parsers have really tried to implement this yet, though the Load and Save parts are partially derived from ideas in JAXP.

I'll be talking about new features in DOM3, including Content Models and Load and Save, on the last day of the XML One Conference in Austin next month. Time permitting, I may try to work some XML Fragment material into the session as well.

Monday, February 12, 2001

Netscape's released version 6.0.1 of their namesake web browser for Mac, Windows, and Unix. This is a bug fix release to improve stability and address a few user interface issues. XML and CSS support is pretty much the same as in version 6.0.


The Apache XML Project has released FOP 0.17, an XSL-FO-to-PDF converter written in Java. This release features numerous bug fixes, and much tighter conformance to the XSL-FO Candidate Recommendation. I haven't had time to test out all the new features, and the web site isn't updated yet, but I do remember that footnotes were promised for this release.

Sunday, February 11, 2001

Opera Software has released Opera 5.0 Beta 6 for Linux. This is an ad-supported web browser with support for XML+CSS. They've also updated the Windows version to 5.0.2. Opera is $39 without the ads.

Saturday, February 10, 2001

The Gnome Project has posted the first public beta of libxslt, an XSLT processor shared library for Linux. At this time libxslt is incomplete and undoubtedly buggy, but it's got enough to be useful. Extension functions and elements aren't really supported yet, but most standard XSLT elements work. A few attributes are missing here and there.

The Gnome Project has also updated the libxml XML parser shared library for Linux to version 2.3.0. This is a bug fix release.

Friday, February 9, 2001

The W3C has published an interesting note on Common User Agent Problems. While Microsoft and Netscape compete with each other in features, they've more or less ignored a number of glaring problems in their browsers. This document identifies myriad user-interface deficiencies in common web browsers, most of which have been problems for years. Examples include:

  • When the user follows a link to a target anchor, highlight the target location.
  • When the user requests to print a frameset, allow the user to select to print an individual frame or the frameset.
  • Allow the user to add new URI schemes in a straightforward way. (Protocol handlers anyone?)
  • When a Web resource includes metadata that may be recognized by the user agent, allow the user to view that metadata.
  • Allow the user to bookmark negotiated resources.
  • Respect the character set of a resource when one is explicitly given.

Stated this bluntly, these are really obvious problems. Perhaps this note will nudge vendors to address a few.


Several people wrote in to tell me that Adobe had responded to Cringely's article. It's important to note, however, that nothing in Adobe's press release actually says that Cringely was wrong. In particular, it does not deny that Adobe laid off the FrameMaker team. In my opinion, that missing denial is far more suggestive than the original report. I suppose it's possible Adobe laid off all or most of the FrameMaker team for reasons unrelated to a lack of commitment to the product, but that seems unlikely to me.

Thursday, February 8, 2001

I've posted a minor update to Chapter 16 of the XML Bible, XLinks. Mostly this version improves the formatting of the XML examples, but a few bugs and mistakes were fixed as well. The chapter is now current with the December 20, 2000 Proposed Recommendation of XLink.


IBM's posted refreshed versions of several products on alphaWorks including:

  • A couple of bugs have been fixed in the XML Lightweight Extractor (XLE). XLE allows a user to annotate a DTD to associate its various components with underlying data sources. Then it can extract data from the data sources and assemble the data into XML documents conforming to that DTD.

  • Version 2.0.0 of the LotusXSL XSLT transformation engine is based on Xalan-J 2.0.0

  • The DirectDOM Development Kit for Netscape 6 has been released, and the Development Kit for IE was updated. DirectDOM technology allows a Java developer to build applet GUIs using the browser UI instead of Swing or the AWT. Only Windows is supported.

Wednesday, February 7, 2001

I've posted a minor update to Chapter 15 of the XML Bible, XSL Formatting Objects. This just corrects a few minor errors and misconceptions. Mostly it addresses changes in the Candidate Recommendation draft that I missed on my first read-through a couple of months ago. Most significantly, boolean properties that used to have the values yes and no, now have the values true and false.


Nokia's released version 2.1 of the Nokia WAP Toolkit, a free-beer PC-based environment in which developers can write, test and debug WAP applications. This release adds a phone simulator for the Nokia 6200 and 7100 Series phones. Registration is required.


Robert X. Cringely's reporting that Adobe has laid off the FrameMaker team. That doesn't bode well for future development and support of the XML-enabled word-processor/page layout program. FrameMaker has filled a unique niche in technical documents over the years. If it goes, we'll be down to TeX and XML.

Tuesday, February 6, 2001

Matt Sergeant's released AxKit 1.2, an XML application server based on mod_perl. You can find all the details you need to know about AxKit at It's also available on CPAN in the Apache modules directory.


The W3C Working Group has published the first public working draft of CSS Mobile Profile 1.0. This defines a subset of CSS2 tailored to the needs and constraints of memory, display, and bandwidth challenged devices like cell phones and Palm Pilots.


The W3C has launched a new mailing list to discuss XSL Formatting Objects. To subscribe, send email to XSL-FO-request@w3.org with the word "subscribe" in the subject.

Monday, February 5, 2001

The Apache XML Project has released Xalan-Java 2.0, an open source XSLT processor written in Java. This release supports XSLT 1.0 as well as assorted extension functions including SQL access to databases via JDBC, redirection of output, conversion of result-tree fragments to node-sets, set operations on node-sets, tokenizing strings, and more. The major change in this release is support for TrAX, the Transformation API for XML. This API allows API-level users to code XML applications without reference to the internal details of a particular processor or XML parser.

Saturday, February 3, 2001

The W3C SVG Working Group has posted the second public release of the SVG test suite. The results seem to say that the latest CVS version of Batik is the most compliant for static SVG, and that the Adobe SVG Plug-in 2.0 is most compliant for animations. Of course, all the products tested are in beta, so everyone still has a lot of work to do.


Sun's published a draft of the XML file formats for OpenOffice, a.k.a. the open source version of Star Office 6.

Friday, February 2, 2001

The W3C Core XML working group has published the last call working draft of the XML InfoSet. Last Call Ends February 23.

Thursday, February 1, 2001

The XML Apache Project has released Xerces-J 1.3.0. The big new feature in this release is upgrading schema support to partial compliance with the W3C Schema Candidate Recommendation of October 24, 2000. There are a few other bug fixes and optimizations as well.


The XML Apache Project has also released Xerces-C 1.4.0. New features of this release include a SAX2 LexicalHandler, an optimized DOM implementation, and many bug fixes. However, Xerces-C does not support schemas yet.

Wednesday, January 31, 2001

IBM's alphaWorks has refreshed their XML Security Suite. The XML Security Suite provides digital signatures, element-wise encryption, and access control. It is written in Java, and should run on most Java compatible platforms. This release adds an improved API document for XACL, along with better KeyInfo handling of XML-Signature, a programming how-to guide, and some bug fixes.

Tuesday, January 30, 2001

The W3C DOM Working Group has published the first public working draft of Document Object Model (DOM) Level 3 Core Specification Version 1.0. The current document is quite incomplete, but eventually it will define the DOM3 versions of the basic DOM classes that represent the parts of an XML document (e.g. Element, Attr, Text, etc.). New features in DOM3 core include:

  • DOMKeys, unique keys generated by the DOM implementation for each DOM node

  • The version and encoding of an external parsed entity, as provided in text declarations

  • The version, encoding, and standalone status of a document, as specified in the XML declarations

  • An adoptNode() method to move a node from one document to another.

  • Several new methods in the Node interface including:

    public TreePosition compareTreePosition(Node other) throws DOMException
    public boolean isSameNode(Node other)
    public String lookupNamespacePrefix(String namespaceURI)
    public String lookupNamespaceURI(String prefix)
    public void normalizeNS()
    public boolean equalsNode(Node n, boolean deep)
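The difference between isSameNode() and equalsNode() is easier to see in running code. The following is only a rough sketch in Python using the standard library's xml.dom.minidom, which already provides isSameNode(). The equals_node() function is a hypothetical helper written for this illustration; it is not taken from the DOM3 draft, and the draft's exact equality rules may well differ.

```python
from xml.dom import minidom


def equals_node(a, b, deep=True):
    """Hypothetical structural comparison, loosely modelled on equalsNode()."""
    if (a.nodeType, a.nodeName, a.nodeValue) != (b.nodeType, b.nodeName, b.nodeValue):
        return False
    # In minidom only element nodes carry attributes; other nodes report None.
    attrs_a = dict(a.attributes.items()) if a.attributes is not None else {}
    attrs_b = dict(b.attributes.items()) if b.attributes is not None else {}
    if attrs_a != attrs_b:
        return False
    if not deep:
        return True
    if len(a.childNodes) != len(b.childNodes):
        return False
    return all(equals_node(x, y, True) for x, y in zip(a.childNodes, b.childNodes))


doc = minidom.parseString('<list><item id="1">tea</item><item id="1">tea</item></list>')
first, second = doc.getElementsByTagName("item")

# Identity: two distinct nodes are never "the same", however alike they look.
same = first.isSameNode(second)
# Equality: the two items have the same name, attributes, and children.
equal = equals_node(first, second)
```

Here same comes out false while equal comes out true: the two item elements are separate objects in the tree that happen to have identical content.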
    

The W3C WCAG Working Group has published the first public working draft of the Web Content Accessibility Guidelines 2.0.


The W3C Internationalization Working Group and the IETF Internationalization Interest Group have published the last call working draft of Character Model for the World Wide Web 1.0. This document provides authors of specifications, software developers, and content developers with a common reference for consistent, interoperable text manipulation on the Web. Topics addressed include encoding identification, early uniform normalization, string identity matching, string indexing, and URI conventions. Some introductory material on characters and character encodings is also included. Last Call ends February 23, 2001.

Monday, January 29, 2001

Unicorn Enterprises SA has released the Unicorn XML Toolkit 1.00.00 for C++ and ECMAScript on Windows. This is a collection of XML libraries and utilities including:

  • Non-validating XML parser
  • XML writers supporting XML, HTML and text output
  • Document Object Model (DOM) Core Level 1
  • SAX2
  • XPath version 1.0
  • XSLT version 1.0
  • XSLT extensions for:
    • Advanced grouping
    • Import of plain data (separated and fixed-sized records)
    • Database connectivity
    • Interface with ECMAScript
    • Support for ActiveX automation
  • A TeX-based implementation of XSL Formatting Objects
  • An ECMAScript interpreter
  • XML ECMAScript extensions including:
    • DOM support
    • XPath support
    • XSLT support
    • Support for ActiveX automation
    • I/O support for Unicode files
  • Regular expressions
Saturday, January 27, 2001

The XML Apache Project has posted version 1.8.2 of the Cocoon application server. This is a bug fix release. Among other things, Cocoon works on Java 1.1 again, and the XInclude processor should work correctly at any stage in the pipeline. The optional connectors for XT and JNDI have been restored.


The W3C CSS working group has published the last call working draft of CSS3 module: W3C selectors. This is a major upgrade from CSS2 selectors with a lot of new functionality. It takes much more explicit account of XML support, including namespaces and case-sensitivity. Comments are due by March 1, 2001.

Friday, January 26, 2001

Amazon.com got XML in a Nutshell in stock today, and promptly ran out while filling pre-orders. It is now listed on 4-6 week availability, but they'll probably get more much sooner than that, if you want to preorder it. The following bookstores also have it in stock for shipment in 2-4 days:

In the meantime I've begun posting various material from the book for your perusal, including:

I still have to post the examples from a few of the later chapters, but most of them are available now.


Thursday, January 25, 2001

Mike Olson has released 4XDebug, an interactive XSLT debugger, and 4XProf, an XSLT profiler. The debugger has all the typical debugger features: stepping through XSLT instructions, setting breakpoints, displaying arbitrary expressions against the current context, etc. The profiler breaks down performance by patterns, templates, paths, portions of paths, etc. (How long are my descendant-or-self axis specifiers taking?)


The XML Authoring Environment for Emacs (XAE) is a free software package that allows you to use Emacs and your system's HTML browser to create, transform, and display XML documents. The XAE includes:

  • structured document editing mode (psgml) for Emacs
  • technical book and article DTD (Docbook)
  • Docbook stylesheets
  • XSLT processor (Saxon)
  • XAE and Docbook user's guides
Wednesday, January 24, 2001

SWAG, the Semantic Web Agreement Group, is self-organizing with a goal of "creating a strong infrastructure for the Semantic Web, whilst working with various members of the Web community to ensure that data remains interoperable." SWAG's current focus is the compilation of the SWAG Dictionary, a database of terms for the Semantic Web, and the creation of new vocabularies. You can subscribe to the SWAG development list by sending a blank email to swag-dev-subscribe@egroups.com.

Tuesday, January 23, 2001

The Unicode Consortium has published the first public working draft of Unicode 3.1. The primary new feature of Unicode 3.1 is the addition of 44,946 new characters covering several historic scripts, several sets of symbols, and many additional CJK ideographs. For the first time, characters are encoded beyond the original 16-bit codespace or Basic Multilingual Plane (BMP or Plane 0) in code positions of U+10000 and higher. In particular, there are three new supplementary planes:

Supplementary Multilingual Plane (SMP) U+10000..U+1FFFF
  Historic scripts and symbols: Old Italic, Gothic, Deseret, Byzantine Musical Symbols, (Western) Musical Symbols, and Mathematical Alphanumeric Symbols
Supplementary Ideographic Plane (SIP) U+20000..U+2FFFF
  Additional unified Han ideographs known as Vertical Extension B, comprising 42,711 characters, as well as 542 additional CJK Compatibility ideographs
Supplementary Special-purpose Plane (SSP) U+E0000..U+EFFFF
  97 tag characters. These duplicate the ASCII graphic characters at different code points. However, they indicate markup rather than text. In other words, instead of writing <P> to indicate a paragraph, you might just write P. However, you'd use the P at U+E0050 instead of the P at 0x50. (Note: neither XML nor HTML uses this. It's intended to be used for future standards.)

The creators of the XML specification foresaw these developments, and the XML specification is fairly compliant with these new characters. However, not all parsers may be compliant yet.
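For anyone wondering what "beyond the 16-bit codespace" means in practice, here is a small Python sketch. The function names are made up for this illustration. It shows which plane a character falls in, how a supplementary character splits into a UTF-16 surrogate pair (standard UTF-16 arithmetic, not something new in this draft), and how a tag character sits at 0xE0000 plus the code of its ASCII counterpart.

```python
def plane(ch):
    """Return the plane number of a character: 0 is the BMP, 1 the SMP, and so on."""
    return ord(ch) >> 16


def utf16_surrogates(ch):
    """Split a character above U+FFFF into its UTF-16 high and low surrogates."""
    code = ord(ch)
    if code < 0x10000:
        raise ValueError("BMP characters need no surrogate pair")
    offset = code - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def to_tag_character(ch):
    """Map an ASCII graphic character to the tag character that mirrors it."""
    code = ord(ch)
    if not 0x20 <= code <= 0x7E:
        raise ValueError("only ASCII graphic characters have tag counterparts")
    return chr(0xE0000 + code)


gothic_ahsa = "\U00010330"           # a Gothic letter in the SMP
tag_p = to_tag_character("P")        # the tag counterpart of capital P
high, low = utf16_surrogates(gothic_ahsa)
```

A parser that stores text as 16-bit units has to handle such pairs as a single character, which is where implementations of that era could fall short.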


Dave Beckett's posted an alpha release of the open source Rapier RDF Parser 0.9.0 written in C.


Jim Fuller's written an amusing XPath cheat sheet Flash applet.

Monday, January 22, 2001

I'm pleased to announce the imminent release of XML in a Nutshell by W. Scott Means and Elliotte Rusty Harold (i.e. me). This is a book a lot of developers have wanted for a long time now. Over a year before the contract was signed, readers were showing up at the O'Reilly booth at Internet World asking when XML in a Nutshell was going to be released. There were a few sputters and starts along the way, and like all such books multiple deadlines were missed; but I'm pleased to announce that thousands of copies of XML in a Nutshell have rolled off the printing press, been perfect bound, packed in cardboard boxes, and shipped to bookstores around the world, where they should begin arriving tomorrow. I think you'll like it.

One of my favorite comments about The XML Bible came from a reader in Norwich England who wrote, "It would seem to me that if you asked the author to write 10,000 words about the colour blue, he would be able to do it without breaking into a sweat." You know, I probably could write 10,000 words about blue, but I can write short books too, and XML in a Nutshell is the book that proves it. I'd estimate that it covers over twice the material that the XML Bible does in less than half the space and at just about half the price. In fact, XML in a Nutshell even weighs less than half what the XML Bible weighs, so not only will it not break your budget; it won't break your back either. (Whether I can write this concisely without the able aid of my coauthor W. Scott Means is still an open question.) I still like the XML Bible. I think it's a good book, but even I have to admit that I think twice before packing it in my carry-on luggage.

XML in a Nutshell is a complete introduction to the state of the art in XML as of early 2001 including well-formedness, DTDs, namespaces, XLinks, XPointers, XPath, XHTML, XSLT, XSL-FO, SAX2, DOM2, Unicode, and more. Very few XML books even attempt to cover this much material, and I guarantee you that none of them do it in this few pages. There is simply no quicker way to learn everything you need to know about XML than by reading this book. It is the most concentrated, cost-effective way to educate yourself about XML.

For those readers who've already learned everything you need to know about XML, I know of no better reference to remind you of the things you've forgotten. Part IV contains detailed references for XML, XSLT, SAX2, DOM2, XPath, and Unicode; all carefully designed to facilitate fast look-up when you just can't quite remember the name of that XSLT element or the exact signature of that SAX method. Before Scott and I wrote this book, I wasted way too much time searching the specifications of XML, XSLT, DOM and more for little details like the proper namespace for SVG. Now I just flip open XML in a Nutshell, and the answers I need are right there. We wrote the reference work I always wanted to have.

Now for the bad news: bookstores have only preordered 17,000 of these. While that sounds like a lot, more than 20,000 people a day read Cafe con Leche, which means bookstores are going to be caught at least a few thousand copies short, and probably more. Amazon, in particular, always manages to underestimate the demand for my books and sells out their initial allotment within an hour or two of me announcing one here. If you see that Amazon is listing it on "4-6 weeks availability" don't worry. O'Reilly can get them more a lot faster than that. Still, if you know you want this book, don't wait. Order it now. It will be available very soon from any bookstore that carries computer books including:

If you need to special order it, the ISBN number is 0-596-00058-8. It's $29.95, published by O'Reilly, and written by Elliotte Rusty Harold and W. Scott Means.

Sunday, January 21, 2001

The W3C XML Core Working Group has promoted Canonical XML to a Proposed Recommendation. This specification describes an algorithm for generating a byte sequence from an XML document called the canonical form. The canonical form normalizes a number of insignificant details like attribute order, whether single or double quotes surround attribute values, white space inside tags, and so forth. Two XML documents with the same canonical form can be considered equal for all intents and purposes. This version is pretty much the same as the Candidate Recommendation. Review ends February 16.
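As a rough illustration of comparing documents by canonical form, here is a sketch using canonicalize() from Python's standard xml.etree.ElementTree module (Python 3.8 or later). Note the assumption: that function implements the later C14N 2.0 specification, not the Proposed Recommendation described here, so treat it as a stand-in for the idea rather than for this exact algorithm. The sample documents and the same_canonical_form() helper are invented for the example.

```python
from xml.etree.ElementTree import canonicalize

# Same information, different surface details: attribute order,
# quote style, white space inside the tag, and empty-element syntax.
first = """<doc b='2'   a="1" />"""
second = '<doc a="1" b="2"></doc>'
# A genuinely different document: one attribute value has changed.
third = '<doc a="1" b="3"/>'


def same_canonical_form(x, y):
    """Return True if two XML strings canonicalize to the same text."""
    return canonicalize(x) == canonicalize(y)


equivalent = same_canonical_form(first, second)
different = same_canonical_form(first, third)
```

The first two documents compare equal once canonicalized, while the third does not, since a changed attribute value is a significant difference.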

Saturday, January 20, 2001

Sun's posted a Java Specification Request (JSR) for a Java API for XML Registries 1.0 (JAXR). According to the JSR abstract, "JAXR provides an API for a set of distributed Registry Services that enables business-to-business integration between business enterprises, using the protocols being defined by ebXML.org, Oasis, ISO 11179." The JSR further claims that, "JAXR may be viewed as analogous to Java Naming and Directory Interface (JNDI) but designed specifically for internet sharing of XML-related business information." Review closes on Monday, January 22, 2001.

Friday, January 19, 2001

The W3C CSS&FP working group has published a new working draft of CSS3 introduction. This provides short descriptions of each of the planned modules of CSS3 including:

  • Syntax / grammar
  • Selectors
  • Values & units
  • Value assignment / cascade / inheritance
  • Box model / vertical
  • Positioning
  • Color / gamma / color profiles
  • Colors and Backgrounds
  • Line box model
  • Text / bidi / vertical alignment
  • Fonts
  • Ruby
  • Generated content / markers
  • Replaced content
  • Paged media
  • User interface
  • WebFonts
  • Aural Style Sheets
  • SMIL
  • Tables
  • Columns
  • SVG
  • Math
  • BECSS
  • Test Suite

The Apache Cocoon project has released version 1.8.1 of the Cocoon application server. This is mostly a bugfix release. New features include:

  • First official release of the esql logicsheet for database access in Cocoon.
  • Updated installation instructions for a number of servlet engines, including Tomcat.
  • Support for FOP 0.15 (though not later versions).
  • A new LinkEncodingProcessor to encode all links on a page

IBM's alphaWorks has released version 3.1.1 of their XML Parser for Java (which is based on Xerces-J). The main changes are in performance and thread safety.


Jasc Software has posted the fourth beta of WebDraw (formerly known as Trajectory Pro), an SVG authoring program for Windows 98/NT4/2000/ME. New features in this release include:

  • Integrated source code editor for text-based editing of SVG files
  • Customizable display of SVG tags and attributes in Source view
  • Filter effects editor for visual editing of SVG filters and filter primitive properties
  • Enhanced Select tool for easier object selection and deformations
  • Support for displaying grids and guides
  • Interface customization options, including user-defined keyboard shortcuts and menus
  • Significant program performance enhancements
  • Support for the November 2, 2000 SVG 1.0 Candidate Recommendation specification
Thursday, January 18, 2001

How you phrase requests is sometimes important. I once heard a story about a woman who put up some flyers in her neighborhood advertising, "For Adoption: eight cute puppies and one ugly one." That one ugly puppy got adopted nine times! After yesterday's claim that WML did not actually work, I finally got multiple submissions of photos of actual WML phones displaying my Hello WML example. Thanks to everyone who submitted. The best photo was probably this one from Reggie Dablo:

A Sprint cell phone showing the words "Hello WML"

I'm still not convinced that WML is useful or usable, or that anyone who's not developing for it is actually using it on an even semi-regular basis, but it is at least technologically possible. What's worse is that the WAP community seems to have its head in the sand about usability issues. The Nielsen-Norman group recently did a major study pointing out the problems with WAP. The WAP Forum's response sounds to me like a desperate effort to spin the press and investors rather than a serious acknowledgement of the very real problems that exist.

There are a number of ways wireless Internet access could be fixed, though it's not clear whether WAP/WML can be. In rough order of short-term feasibility they are:

  1. A device with a small keypad, and a slightly larger, higher resolution screen; e.g. BlackBerry. This would eliminate the existing problems with typing letters on a phone keypad. Typing in URLs like http://www.ibiblio.org/xml/wml/24-1.wml can be extremely difficult on a phone. In fact, I'd say anything beyond a three-letter airport code or a four-letter stock symbol is too much to ask of users on existing phones.

  2. A device with a much larger, higher resolution screen and pen based input; e.g. a Palm Pilot. You can get real work done on a Palm Pilot. You can't on a cell phone.

  3. A browser that uses the one media type that's actually appropriate for cell phones: audio. All commands should be input via voice. All pages should be read to the users. This would require more CPU horsepower and memory, but that's coming.

  4. Wearable computers with gesture input and phenomenally large heads-up displays built into glasses or contact lenses. (Eye cancer might be a bit of a problem, though.)

ZDNet is running an interesting story about future cell phone developments.


The W3C has published the first public working draft of Multi-column layout in CSS. This proposal defines three groups of new properties to support multi-column layouts in CSS3. The first group sets the number and width of the columns:

  • column-count
  • column-width
  • column-min-width
  • column-width-policy

Properties in the second group specify the amount of space and rules between columns:

  • column-gap
  • column-rule
  • column-rule-color
  • column-rule-style
  • column-rule-width

The third group contains a single property that lets an element span multiple columns:

  • column-span

The IETF has published RFC 3023, XML Media Types, by Makoto Murata, Simon St.Laurent, and Daniel Kohn. This document standardizes five new MIME media types:

  • text/xml
  • application/xml
  • text/xml-external-parsed-entity
  • application/xml-external-parsed-entity
  • application/xml-dtd

This document also standardizes the convention of using the suffix +xml for naming media types for specific XML applications. For instance, RDDL documents would have the type text/rddl+xml.
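The +xml convention makes it easy for generic software to recognize XML even when it has never heard of the specific application. A minimal sketch in Python, assuming nothing beyond the five types listed above and the suffix rule; the function and constant names are made up for this example:

```python
# The five media types listed above.
GENERIC_XML_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
    "application/xml-dtd",
}


def is_xml_related_media_type(media_type):
    """Return True for one of the five listed types or any type ending in +xml."""
    # Drop parameters such as "; charset=utf-8" and ignore case.
    essence = media_type.split(";", 1)[0].strip().lower()
    return essence in GENERIC_XML_TYPES or essence.endswith("+xml")


rddl = is_xml_related_media_type("text/rddl+xml")
html = is_xml_related_media_type("text/html")
```

A generic XML tool could use a check like this to decide whether to hand a response to its XML parser.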

Meanwhile M. Baker has submitted an IETF internet draft (the pre-RFC stage) on the 'application/xhtml+xml' Media Type. This specification also introduces a new parameter called schema-location that would identify where the schema for this document can be found. For example, an HTTP server might include this Content-type header when returning an XHTML document:

Content-type: application/xhtml+xml; schema-location="http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"

This approach would seem to have implications beyond XHTML. It could easily be used to locate schemas for any type of XML document returned over HTTP.
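To show how a client might pick that parameter out of the header, here is a small sketch using the email.message module from Python's standard library, which parses MIME-style headers. The schema-location parameter is only proposed in the internet draft, so this is just an illustration of reading an arbitrary parameter; the parse_content_type() helper name is invented for the example.

```python
from email.message import Message


def parse_content_type(header_value):
    """Return (media type, schema-location or None) from a Content-type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_param() returns None when the parameter is absent
    # and strips the surrounding quotes when it is present.
    return msg.get_content_type(), msg.get_param("schema-location")


header = ('application/xhtml+xml; '
          'schema-location="http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"')
media_type, schema_location = parse_content_type(header)
```

With the header above, media_type is application/xhtml+xml and schema_location is the DTD URL, which the client could then fetch to validate the document.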


AlphaXML is sponsoring a contest to write the best Haiku on the topic of XML. The prize is a copy of Turbo XML.


Andrew Watt's launched an XSL Formatting Objects Mailing List. You can subscribe by sending email to XSL-FO-subscribe@egroups.com.

Wednesday, January 17, 2001

I am going to propose a falsifiable hypothesis: The Wireless Web does not exist. There are many developers but no actual users. I'm hypothesizing this because of the 20,000 people a day who read this site, not one seems to have a cell phone capable of viewing this most basic WML page. To falsify my hypothesis and prove me wrong all that's needed is to take a good picture of a real cell phone browsing that page.

After hours of trying to load this page on my own phone, and much conversation with my cell phone company's technical support, VoiceStream finally admitted that, contrary to what I was promised when I initially signed up with them, the phone they sold me did not actually support WAP and WML. Then they told me to trade it in for a new phone, but they couldn't provide me with the model that would actually do what they had promised; and after several weeks of searching, it doesn't seem that anyone else can either. Several people sent me screen shots of a cell phone simulator viewing this page, but so far not one person has actually sent in a real photograph of a real phone displaying this page. One person tells me he tried, but that the flash on his camera washed out the display! I feel like I'm in UFO-land here. Everyone believes WML exists but nobody can actually prove it.


Opera Software has released version 5.0.2 of their namesake Opera Web browser for Windows. Opera supports XML, CSS, and WML. Version 5.0.2 adds numerous minor enhancements and features. Opera 5.0 is adware/payware (your choice).

Tuesday, January 16, 2001

I'm still looking for a photo of this Hello World page on a cell phone. Please note: I need an actual photo of a real phone. A screen capture of an emulator is not enough. (I wonder what it says about the adoption of WML when I'm getting multiple submissions of emulator screenshots, but no pictures of actual phones? I suspect it means this technology is in even worse shape than I thought. Everyone's developing for it, but nobody's using it.)


eXist, an open source repository and search engine for XML documents, is now in public alpha release. It comes with XPath and full-text search. MySQL is used as the backend.


Fourthought, Inc. has posted version 0.10.1 of 4Suite, a collection of open source tools for standards-based XML, DOM, XPath, XSLT, RDF, XPointer, XLink, object-database, and XInclude development in Python. Components include 4DOM, 4XPath and 4XSLT, 4RDF, 4ODS, 4XPointer, 4XLink and DbDOM. New features and fixes in this release include:

  • PyXML (0.6.3 + fixes) is now built in
  • XInclude support
  • More thorough XSLT test harness
  • Support source docs from stdin on 4xslt command line
  • Implement unparsed-entity-uri
  • XSLT: Restricted HTML writer output allowed as security tool
  • New XPath extension functions including evaluate(), distinct(), split(), range(), if(), and find()
  • Update to 2000-11-13 DOM level 2 recommendation
  • Documentation updates and consolidation
  • Domlette reader option to force 8-bit DOM strings even in Python 2.0
  • Organize Reader and URI handler APIs to allow easier customizations
  • Many Python 1.5.2 and 2.0 compatibility fixes
  • Assorted optimizations and bug-fixes

Jeni Tennison's posted a new beta of XSLTDoc, a JavaDoc-like Windows program for XSLT stylesheets. The somewhat improved beta version now requires MSXML3.0 (release version) installed in replace mode and IE5+.

Monday, January 15, 2001

I'm still looking for a photo of this Hello World page on a cell phone.


James Clark has updated his TREX schema language to refine the way it handles merging included grammars. He's also written a complete implementation of XHTML modularization in TREX, something that still doesn't exist for W3C XML Schemas.


Jonathan Borden's posted a new draft of the Resource Description and Discovery Language (RDDL). RDDL is a modular XHTML and XLink based application for documents placed at the end of namespace URIs which allows software to automatically discover and retrieve DTDs, schemas, style sheets, and other resources for particular XML applications. The major change in this draft is that the URL that used to be the value of xlink:arcrole is now the value of xlink:role, and xlink:arcrole can be used to provide an additional URL further refining the role. Furthermore, MIME types like text/css can now be used as roles through their canonical URLs at the IETF such as http://www.isi.edu/in-notes/iana/assignments/media-types/text/css. Finally, RELAX and TREX schemas for RDDL have been added as one more kind of related resource for this document.


Andrew Watt's opened www.SVGspider.com which is perhaps the world's first "all SVG" web site. The Adobe SVG Browser Plug-in is required.

Sunday, January 14, 2001

I have an unusual request today. Would somebody with a WAP-enabled cell phone and a digital camera please take a picture of this Hello World page on their phone and send it to me? The URL is http://www.ibiblio.org/xml/wml/24-1.wml. I'd do it myself, but it seems that VoiceStream's rather clueless representatives misinformed me about which of their phones supported WAP.

This is for a new chapter about the Wireless Markup Language in the second edition of the XML Bible. The first person to send me a usable picture will get a free copy of the second edition of the XML Bible when published later this year. You'll need to sign a not-particularly onerous photo permissions agreement giving IDG non-exclusive rights to publish the picture. The agreement doesn't promise a photo credit, but I'll try and make sure you get one. Please take the picture at maximum resolution and quality, since this is intended for print rather than a web page. Thanks!

Saturday, January 13, 2001

The XML Apache Project has posted Xalan-Java 2.0.D07. Xalan-Java is an XSLT processor written in Java. Along with performance enhancements and bug fixes, this release adds a compatibility layer that lets you rebuild your existing Xalan-Java 1.x applications to take advantage of the performance and conformance enhancements in Xalan-Java 2.


Version 3.1 of the Enhydra application server has been released. This release focuses on XMLC, adding compile time includes, updated XML and HTML parsers (Xerces v1.2 & HTML Tidy), a Lazy DOM implementation, the Xalan XSLT parser, and assorted bug fixes.


The fourth beta of the XML Spy 3.5 tree-based XML editor has been released. New features in this beta include:

  • validation of XML Schemas with integrated error highlighting directly within the Schema design view
  • access and manipulate files in any repository that is accessible through an ftp:, http:, or https: URL
  • browse and manipulate folders directly from the Open/Save URL dialog on any FTP or WebDAV server
  • full support for xsi:type
  • full support for xsd:list
  • full support for whiteSpace facet
  • full support for Unicode character classes and groups in regular expressions
  • new online help with lots of improvements in the existing chapters and new coverage of XML Schema design
  • support for XML Spy add-ins through the component download area
  • improved support for Microsoft Source-Safe (and compatible repositories)
  • improved COM-based API (more functionality)

In addition, a lot of bugs in the schema support have been fixed, and the schema validation has been optimized for speed and memory usage.

Friday, January 12, 2001

Enhydra's released the stand-alone version of XMLC 2.0. XMLC is a web application framework that "provides high level APIs to DOM and servlet 2.2 technologies." This is the version of XMLC that was bundled with Enhydra 3.1. It adds all of the classes necessary to run XMLC in a non-Enhydra environment. New features include:

  • A LazyDOM that avoids instantiating nodes that are not accessed
  • Pre-formatted text on output.
  • Server-side includes
  • XMLC meta-data files for compiler options
  • Optional creation of getTagXXX() methods.
  • Namespaces support
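
The lazy-instantiation idea in the first bullet can be sketched in a few lines. This is a generic Python illustration, not XMLC's LazyDOM (which is Java); the class name LazyNode, the template dictionary, and the instance counter are all hypothetical.

```python
from functools import cached_property

class LazyNode:
    """Node whose children are built only on first access (hypothetical sketch)."""

    instantiated = 0  # counts how many nodes have actually been created

    def __init__(self, name, template):
        self.name = name
        self._template = template  # read-only description: {child_name: sub_template}
        LazyNode.instantiated += 1

    @cached_property
    def children(self):
        # Runs once, the first time .children is read; the result is then cached.
        return [LazyNode(n, t) for n, t in self._template.items()]

# A small read-only "template" tree standing in for a compiled page.
TEMPLATE = {"head": {"title": {}}, "body": {"h1": {}, "p": {}, "table": {"tr": {}}}}

if __name__ == "__main__":
    root = LazyNode("html", TEMPLATE)
    print(LazyNode.instantiated)        # only the root exists so far
    body = root.children[1]             # expands the root's children only
    print(body.name, LazyNode.instantiated)
```

A page that only touches a heading never pays for building the table rows beneath it, which is presumably the point of the feature.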

Thursday, January 11, 2001

Sebastian Rahtz has released PassiveTeX 1.4, his XSL-FO-to-TeX converter.


Zvon's published their first attempt at a W3C XML Schema Reference.


Readers in Silicon Valley may be interested in an upcoming Sun Headquarter Briefing on Java Technology + XML, Wednesday January 31, 2001 in Menlo Park. You can register online or call (800) 795-7578.


Norm Walsh has published a W3C XML schema for DocBook.

Wednesday, January 10, 2001

Mozilla 0.7 has been released. Version 0.7 bundles the Personal Security Manager so secure sites should work now. There are lots of other bug fixes and minor feature improvements.


IBM's alphaWorks has upgraded Data Descriptors by Example (DDbE) to support the October 24 XML Schema Candidate Recommendation. DDbE is a Java program that generates schemas and DTDs from representative sample documents.
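
To give a feel for what inferring a DTD from a sample document involves, here is a deliberately naive Python sketch. It is not DDbE's algorithm. It simply records which child elements and attributes appear under each element name and emits loose declarations, and it ignores namespaces entirely. The function name infer_dtd and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def infer_dtd(xml_text):
    """Emit loose DTD declarations covering everything seen in one sample."""
    root = ET.fromstring(xml_text)
    children, attrs, has_text = {}, {}, {}
    for elem in root.iter():
        kids = children.setdefault(elem.tag, [])
        for child in elem:
            if child.tag not in kids:
                kids.append(child.tag)
        names = attrs.setdefault(elem.tag, [])
        for a in elem.attrib:
            if a not in names:
                names.append(a)
        # Character data may sit before the first child or after any child.
        text_seen = bool((elem.text or "").strip()) or any(
            (c.tail or "").strip() for c in elem)
        has_text[elem.tag] = has_text.get(elem.tag, False) or text_seen
    lines = []
    for tag, kids in children.items():
        if kids and has_text[tag]:
            model = "(#PCDATA | " + " | ".join(kids) + ")*"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        elif has_text[tag]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append("<!ELEMENT %s %s>" % (tag, model))
        for a in attrs[tag]:
            # Every attribute is declared optional CDATA; one sample cannot prove more.
            lines.append("<!ATTLIST %s %s CDATA #IMPLIED>" % (tag, a))
    return "\n".join(lines)

SAMPLE = '<order id="7"><item sku="a1">Widget</item><item sku="b2">Gadget</item><note/></order>'

if __name__ == "__main__":
    print(infer_dtd(SAMPLE))
```

A single sample can only show what is permitted, never what is required, so a real tool has to guess at ordering and cardinality from many representative documents.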

Tuesday, January 9, 2001

The W3C XML Linking Working Group has pushed the XPointer specification back to working draft status. The specific issue that was uncovered during Candidate Recommendation was some confusion over how to integrate XPointers, particularly those in non-XML documents, with namespaces.

It's also come to light in this draft that Sun has claimed a patent on some of the technologies needed to implement XPointer. I think this is particularly offensive because Eve L. Maler, a Sun employee, served as co-chair of the XML Linking Working Group and a co-editor of the XPointer specification. As usual Sun wants to use this as a club to lock implementers and users into a licensing agreement that goes beyond what Sun and the W3C could otherwise demand. The specific patent is United States Patent No. 5,659,729, Method and system for implementing hypertext scroll attributes, issued to Jakob Nielsen in 1997. The patent was filed on February 1, 1996. It claims:

Embodiments of the present invention use a new extension to the HTML language to support remotely specified named anchors. A remotely specified named anchor, when embedded within a source document, instructs a browser program to access a portion of a destination document indicated in the remotely specified named anchor. When the browser program reads a remotely specified named anchor such as <a href=http://foo.com/bar.html/SCROLL="Some Text"> from the source document, the browser program performs the following steps: 1) the browser retrieves the destination file "bar.html" from the server "foo.com", 2) the browser searches the file bar.html for "Some Text", and 3) if the browser finds the character string being searched for, then the browser displays the file bar.html, scrolled to the line containing the first character of the character string being searched for.

It's very questionable whether this is truly an original invention with no prior art. HyperCard and Xanadu both had capabilities like this. It's also questionable whether the patent as written really applies to XPointer. For instance, the patent mandates a certain behavior of browsers. XPointer doesn't. I also think that the proposed "XPointer patent terms and conditions" are unenforceable as currently published. Last Call on this draft ends January 29, 2001. I recommend complete rejection of this specification until such time as Sun's patent can be dealt with more reasonably.


The W3C Math working group has promoted MathML 2.0 to Proposed Recommendation. Changes in this draft are quite minor, and mostly editorial or corrections of minor inconsistencies. Review ends February 5, 2001.

Zvon's updated their MathML reference to reflect the Proposed Recommendation.

Monday, January 8, 2001

James Clark of XSLT fame has thrown his hat into the ring for developers of XML Schema languages. According to Clark, TREX (Tree Regular Expressions for XML) is "basically the type system of XDuce with an XML syntax and with a bunch of additional features (like support for attributes and namespaces) needed to make it a practical language for structure validation. Of existing Schema languages, it's closest to RELAX. It's not tied to any particular datatyping language; rather, the idea is that you can plug whatever datatyping language you want".


RenderX has released version 2.0.1 of the XEP XSL Formatting Objects to PDF converter program. Changes include:

  • supports the XSL FO Candidate Recommendation of November 21, 2000
  • additional XSL FO functionality has been implemented
  • The native PDFlib library has been replaced by pure Java
  • User fonts can be embedded
  • Documentation includes an XSL FO primer and more examples
  • An Ant-based installation script is provided
  • New tests have been added to the test suite including a number of non-English examples

Sunday, January 7, 2001

Brendan Macmillan's Java Serialization to XML, JSX (current version 0.71) generates XML documents that represent Java objects and vice versa.
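
The round trip that JSX performs for Java objects can be illustrated, very loosely, in Python. The XML layout below is invented for this sketch and is not JSX's output format; the Point class and both function names are hypothetical, and only flat objects with simple field types are handled.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

@dataclass
class Point:
    x: int
    y: int
    label: str

def to_xml(obj):
    """Serialize a flat dataclass instance: one child element per field."""
    root = ET.Element(type(obj).__name__)
    for f in fields(obj):
        child = ET.SubElement(root, f.name)
        child.text = str(getattr(obj, f.name))
    return ET.tostring(root, encoding="unicode")

def from_xml(cls, xml_text):
    """Rebuild an instance of cls, converting each field with its declared type."""
    root = ET.fromstring(xml_text)
    kwargs = {f.name: f.type(root.findtext(f.name)) for f in fields(cls)}
    return cls(**kwargs)

if __name__ == "__main__":
    original = Point(3, 4, "corner")
    text = to_xml(original)
    print(text)
    print(from_xml(Point, text) == original)
```

A real serializer also has to cope with nested objects, shared references, and cycles, none of which this sketch attempts.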

Saturday, January 6, 2001

The XML Apache Project has posted Xalan-Java 2.0.D06. This is a beta developer's release of the popular open source XSLT processor. Xalan-Java 2.0.D06 supports XSLT 1.0, XPath 1.0, and the Java Transformation API for XML (TrAX). Version 2.0.D06 adds some performance and error-handling enhancements along with bug fixes.


Rick Maddy's released version 0.92 of XSLDoc, a JavaDoc-like program for documenting XSLT style sheets. This release enables:

  • Display of stylesheet parameters in the stylesheet description area.
  • Display of the XSLT specification version in the stylesheet description area.
  • Display of stylesheet output method in the stylesheet description area.
  • Support for files with the .xslt file name extension

The third beta of the XML Spy 3.5 tree-based XML editor has been released. New features in this beta include:

  • Microsoft Source-Safe support
  • New Scripting Environment and Form editor
  • More COM functionality and conformance with COM naming conventions
  • Many bug fixes and improvements to the XML Schema support
  • Option to indent with spaces rather than tabs

The download is several megabytes. XML Spy is $199 payware. The beta is free, but beta testers will likely not receive the customary free copy of the release version for doing unpaid work for the software vendor.

Friday, January 5, 2001

The W3C Speech Interface Framework Working Group has published three new working drafts, including two in last call and one new one:

Thursday, January 4, 2001

Adobe's posted the second beta of the Adobe SVG Viewer 2.0, a plug-in for Netscape and IE on both Mac and Windows that supports much of the candidate recommendation of Scalable Vector Graphics.

Wednesday, January 3, 2001

Multiple competing proposals for an XML Catalog syntax are being posted on xml-dev every day. The goal is to come up with something reasonably extensible and indirect to place on the other end of namespace URIs. Sean Palmer's posted a proposal for XML Namespace Gloss, XNGloss. Most recently Jonathan Borden and Tim Bray have combined forces for the Resource Directory Description Language (RDDL). According to Borden,

A Resource Directory provides a text description of some class of resources and of other resources related to that class. It also contains a directory of links to these related resources. An example of a class of resources is that defined by an XML Namespace. Examples of such related resources include schemas, stylesheets, and executable code. A Resource Directory Description is designed to be suitable for service as the body of a resource returned by dereferencing a URI serving as an XML Namespace name.

The Resource Directory Description Language is an extension of XHTML Basic 1.0 with a new element named resource. This element serves as an XLink to the referenced resource.

Everybody's copying the ideas they like from everybody else, so it's hard to keep track of who's doing what. However, there does seem to be some real disagreement about both functionality and syntax, so there are some real differences between the various proposals.


Howard Katz's XML Query Engine v0.92 is a JavaBean that lets you index and search XML documents for element, attribute, and full-text content. The query language is a draft form of XQL, which is close to a subset of XPath. XML Query Engine extends XQL's syntax to provide a full-text search. Licensing remains to be worked out, but it's not open source. At least for the moment, it's free-beer.
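
The three kinds of lookup mentioned here (element, attribute, and full-text) can be approximated with the XPath subset in Python's standard library. This is only an analogy: it does not use XQL, builds no index, and has nothing to do with XML Query Engine's actual API. The sample catalog and the full_text helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample data for the sketch.
CATALOG = """
<catalog>
  <book id="b1"><title>XML Bible</title><author>Harold</author></book>
  <book id="b2"><title>Java Secrets</title><author>Harold</author></book>
  <book id="b3"><title>Notes on SVG</title><author>Example</author></book>
</catalog>
"""

def full_text(root, needle):
    """Return elements whose own text contains needle (case-insensitive)."""
    needle = needle.lower()
    return [e for e in root.iter() if e.text and needle in e.text.lower()]

if __name__ == "__main__":
    root = ET.fromstring(CATALOG)
    # Element query: every title anywhere in the document.
    print([t.text for t in root.findall(".//title")])
    # Attribute query: the book whose id attribute is "b2".
    print(root.find(".//book[@id='b2']/title").text)
    # Full-text query: elements mentioning "xml".
    print([e.text for e in full_text(root, "xml")])
```

Each query here walks the whole tree; an engine that indexes documents ahead of time can answer the same questions without rescanning them.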

Tuesday, January 2, 2001

Sun's launched a mailing list to discuss the Java XML Messaging (JAXM) API and the M Project early access prototype. To subscribe, send email to listserv@java.sun.com with an empty subject line and the line "subscribe jaxm-interest" (without the quotes) in the body of your message.
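
For anyone who would rather script the subscription than type it, the message described above could be composed along these lines. This Python sketch only builds the message and does not send it; the function name and the sender address are hypothetical, and how you deliver it depends on your own mail setup.

```python
from email.message import EmailMessage

def build_subscribe_message(sender):
    """Compose (but do not send) the subscription request described above."""
    msg = EmailMessage()
    msg["From"] = sender                 # hypothetical sender address
    msg["To"] = "listserv@java.sun.com"
    # No Subject header is set, which leaves the subject line empty.
    msg.set_content("subscribe jaxm-interest")
    return msg

if __name__ == "__main__":
    message = build_subscribe_message("reader@example.org")
    print(message["To"])
    print(message.get_content())
```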


An XML Catalog specification is getting hashed out on xml-dev by Jonathan Borden and others. The goal is to come up with something reasonably extensible and indirect to place on the other end of namespace URIs. The most interesting part to me is how this proposal uses XHTML, XLink, and namespaces to be both human and machine readable.


Zvon's published several new hypertext references including:


The Unicode Consortium has put the complete text of the Unicode 3.0 Specification online. Previously this was only available as a printed hardcover book.

Monday, January 1, 2001

Happy New Year! Happy New Millennium! May this one find you well. Today I thought I'd update you on my progress on last year's New Year's resolutions for Cafe au Lait/Cafe con Leche.

1. I will move these sites onto their own dedicated Linux boxes and 24/7 net connections so I won't be limited by what the system administrators are willing to install.
It took most of last year to get the physical infrastructure in place. However, that has now been accomplished. My SDSL line is installed and running, and I've got a spanking-new Qube 3 ready to go. I've also purchased the domain names cafeaulait.org and cafeconleche.org. I'm going to be spending today trying to get the Web server working like I want. I'll be moving pages over one-at-a-time to the new system, starting with the least popular pages so I can work out the inevitable glitches. I expect this to be my major project for the next month or so.
2. I will establish a database back end for this site. No more editing the HTML pages manually!
Once I've figured out how to get the Qube to serve static pages to the world, I'll start using MySQL and PHP. I think I'll begin by porting the mailing lists pages and the books pages from FileMaker Pro to MySQL. Then I can begin moving the daily news into a database backed system.
3. I will get discussion forums working on this site. I'm tired of reading all the interesting responses I get in my email and not being able to share them with you.
I'm still looking at this. I may try using PHPNuke, or I may try to roll my own using MySQL and PHP. But it's still on my ToDo list.
4. I will get Prentice Hall to honor their contract reverting the rights to the Java Developer's Resource to me. Once that's done, I will update the book to Java 2, and post the entire text here on Cafe au Lait.
This was my biggest success. I do have the rights back to the Java Developer's Resource and I have begun to post updated chapters. Much work remains to be done to bring the entire book up to spec with the current incarnation of Java.
5. I will get the rights back to Java Secrets as well, update it, and post it here on Cafe au Lait.
I also got the rights back to Java Secrets. I haven't had time to update it yet. However, since it isn't nearly as out-of-date as the Java Developer's Resource, I think maybe I should just take a quick look at it and post the unupdated files. I would need a good conversion tool to move Word documents to HTML.

Copyright 2001 Elliotte Rusty Harold
elharo@ibiblio.org
Last Modified March 14, 2001