Cafe con Leche XML News and Resources

Quote of the Day

The problem is that the specs keep changing in non-backwards compatible ways. Sure, I can implement WS-Security. Which version? Each version has it's own namespace. WS-Addressing, again, there have been several revisions, each with it's own namespace. Am I supposed to wait until I think they will never change again before I implement a solution using one of these specs? I may implement a solution today and have perfect interoperability today, but in 3 months, the spec will get a minor revision, a new namespace, and if any of the collaborative peers decides to move to the new spec, they can't talk to me anymore. What's even worse is when my software can't talk to a slightly old version of itself because we decide to use an implementation of WS-Whatever that is slightly newer.

I don't expect these things to not change. And on some level, changing the namespace seems like the obvious way to version the the spec, but it sure does throw a wrench into the real world where deployed software has to *stay* interoperable for years and not just months.

--Erv Walter
Read the rest in WS-* Specifications: Are there too many and are they too complex?

Do me a favor. Please take a look at the experimental version of this page that uses CSS instead of table layouts, and let me know if anything looks too funky to live with. If so, please let me know what browser on what platform you're using. It seems to work well on all the browsers I have conveniently available, but I haven't tested IE5 for Windows yet. Netscape 4 isn't great, but I was able to hack that enough so it isn't unreadable. One thing I still haven't figured out how to do is center the h1 header ("Cafe con Leche XML News and Resources") within the left hand panel. I can center it relative to the page, but that's not quite the same thing, especially in a wide window.

Amazon has reduced the price of XML in a Nutshell, 3rd edition to $27.17, a 32% savings off the cover price. Be the first on your block to get one!

I've been spending a lot of time reviewing RSS readers lately, and overall they're a pretty poor lot. Latest example. Yesterday's Cafe con Leche feed contained this completely legal title element:

<title>I'm very pleased to announce the publication of XML in a Nutshell, 3rd edition by myself and W.
          Scott Means, soon to be arriving at a fine bookseller near you.
          </title>

Note the line break in the middle of the title content. This confused at least two RSS readers even though there's nothing wrong with it according to the RSS 0.92 spec. Other features from my RSS feeds that have caused problems in the past include long titles, a single URL that points to several stories, and not including more than one day's worth of news in a feed.

Cafe con Leche and Cafe au Lait use XSLT to generate their RSS feeds, so they're always completely well-formed. The home pages are edited by hand, and may not always be well-formed; but if so the XSLT processor reports an error and does not generate a new RSS document. I really wish RSS vendors would focus on implementing the actual specs reasonably before they wasted time on supporting brain damage like malformed feeds and double escaped HTML. It's well-known that supporting non-conformant documents poisons the well for everyone. What's less well-known is that adding support for non-conformant documents tends to break the support for sites that actually follow the specifications. Everyone gets sucked into a race to the bottom, and we end up back in a world where everyone's browser handles sites just a little bit differently from everyone else's, and vendors compete based on how many broken sites they can make sense out of instead of how well they can present genuinely good data. This is the HTML hell XML was supposed to save us from. Those who forget the past are condemned to repeat it.

Permalink to Today's News
Recent News
Today's Java News on Cafe au Lait
Older News

Recent News

Thursday, September 23, 2004

I'm very pleased to announce the publication of XML in a Nutshell, 3rd edition by myself and W. Scott Means, soon to be arriving at a fine bookseller near you. XML in a Nutshell is quite simply the most complete and succinct treatment of the major technologies in XML you'll find anywhere. Topics covered include elements, attributes, syntax, namespaces, well-formedness, DTDs, schemas, XPath, XSLT, XSL-FO, CSS, SAX, DOM, internationalization, XHTML, and more. The third edition is a major update that syncs the book with the latest developments in XML including:

XML 1.1
XInclude
DOM Level 3
SAX 2.0.1
Unicode 4.0.1
XPointer 1.0
Namespaces 1.1

If you don't have a copy, you need a copy. Do you need to upgrade your old copy? If you're sticking to XML 1.0 (a recommendation I've made in my two previous books and continue to stand by in this one), the second edition will probably continue to serve you well. However, if you're still thumbing through a very dog-eared copy of the first edition, it's definitely time to upgrade. XML hasn't stood still in the three years since the first edition was published, and there's a lot of new and improved material here.

My author's copy arrived a couple of days ago, and generally it ships to me from the warehouse at the same time it ships to bookstores, just by slightly faster courier, so bookstores should have it in stock any day now. Amazon is still listing it at the full cover price of $39.95, but they normally drop that as soon as it gets in stock, so you may want to wait a day or two to order it. Update: Sometime today they dropped the price to $27.17, a 32% savings, plus they'r eoffering free supersaver shipping. They normally don't do this until they have the book in stock, so I expect it to arrive any day now. Go ahead and order it. Powell's, Barnes & Noble, and other bookstores should have it momentarily as well. It will also be available on Safari in the not too distant future for those readers who prefer their books in electronic format. The book is XML in a Nutshell, 3rd edition. The ISBN is 0-596-00292-0. It's published by O'Reilly, and written by W. Scott Mean and me, Elliotte Rusty Harold. Check it out!

Wednesday, September 22, 2004

Benjamin Pasero has posted version 0.9b of RSSOwl, an open source RSS reader written in Java and based on the SWT toolkit. This release can search a site for news feeds, improves PDF export, and can use Mozilla 1.7 as internal browser. It also fixes a number of user interface inconsistencies I reported.

Ranchero Software has posted a beta of NetNewsWire 2.0, a closed source RSS client for Mac OS X available in both free-beer lite and payware full versions. Version 2.0 removes the weblog editor and adds Atom support, flagged items, searching, news persistence, and an embedded browser.

JAPISoft has released JXP 1.3.3, a €139 payware XPath 1.0 API that can be customized to fit different object models. This release fixes bugs.

JAPISoft has also released FastParser 1.6.8, a $199 payware, non-validating, XML parser for Java that supports SAX and some of DOM. I'm very skeptical of this parser, and JAPISoft products in general. I notice that every other release they announce "new features" that are absolutely essential to any minimally conformant implementation of the technologies they claim to implement. It makes me wonder what's missing from the current version. Plus they are completely misusing the phrase "open source." These products are not available under an open source license, JAPISoft claims to the contrary not withstanding.

Kevin Howe has written a Ruby wrapper for HTML Tidy. It's distributed under the Ruby license.

Tuesday, September 21, 2004

Dave Beckett has released the Raptor RDF Parser Toolkit 1.3.3, an open source C library for parsing the RDF/XML, N-Triples. Turtke, and Atom Resource Description Framework formats. It uses expat or libxml2 as the underlying XML parser. Version 1.33 restores Unicode Normalization Form C checking, and fixes various bugs. Raptor is now dual licensed under the LGPL and Apache 2.0 licenses.

Mikhail Grushinskiy has posted XMLStarlet 0.95, a command line utility for Linux that exposes a lot of the functionality in libxml and libxslt including validation, pretty printing, and canonicalization. This release fixes some security bugs and has been recompiled against libxml2 2.6.13 and libxslt 1.1.10.

Sunday, September 19, 2004

I've posted beta 5 of XOM, my dual streaming/tree API for processing XML with Java. This beta primarily focuses on fixing bugs in XInclude and improving performance of builders when reading from files. It also deprecates the setNodeFactory() method in XSLTransform which will be removed in the next drop. In its place, there's a new constructor:

public XSLTransform(Document stylesheet, NodeFactory factory)

Finally, the four XSLTransform constructors deprecated in the last release have been removed.

I don't have any other major issues in the TODO list for 1.0. If nobody finds any bugs in this beta, I may label the next drop release candidate 1.

Saturday, September 18, 2004

The W3C Voice Browser Working Group has released the Recommendation of the Speech Synthesis Markup Language Version 1.0. According to the abstract, the Speech Synthesis Markup Language "is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. The essential role of the markup language is to provide authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch, rate, etc. across different synthesis-capable platforms."

Older News