XML News from Saturday, February 14, 2004

The XML Protocol Working Group has posted the first public working draft of the XML-binary Optimized Packaging (XOP) specification.

This specification defines the XML-binary Optimized Packaging (XOP) convention, a means of more efficiently serializing XML Query 1.0 and XPath 2.0 Data Model [XML Query Data Model] that have certain types of content.

A XOP package is created by placing a serialization of the XML Data Model inside of an extensible packaging format (such as MIME Multipart/Related, see [RFC 2387]) and then re-encoding selected portions of its content alongside it, while marking their locations in the XML with a special element that links to the packaged data using URIs.

Optimization in XOP is limited to the content of those elements which contain characters that can be interpreted as the canonical lexical representation of the XML Schema base64Binary datatype (see [XML Schema Part 2] 3.2.16 base64Binary and Errata in XML Schema, E2-54). Attributes, non-base64-compatible character data, and data not in the canonical representation of the base64Binary datatype cannot be successfully optimized by XOP.

Fortunately, this does not seem to be a generic binary encoding of XML, just a more efficient means of bundling non-XML binary data with XML documents.
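
To make the quoted rule concrete, here is a minimal Python sketch of the optimization step as I read it, not the working group's algorithm. It looks for leaf elements whose character content round-trips exactly through base64, pulls the decoded bytes out into a separate part, and leaves a linking element behind. The function names, the `part1@example.org` content IDs, and the use of ElementTree are my own assumptions for illustration. The `xop:Include` namespace URI is the one from the later XOP Recommendation, and this first draft may use a different one.

```python
import base64
import binascii
import xml.etree.ElementTree as ET

# Assumption: namespace URI taken from the later XOP Recommendation.
XOP_NS = "http://www.w3.org/2004/08/xop/include"


def is_canonical_base64(text):
    """Return True if text is non-empty and survives decode/encode unchanged."""
    if not text:
        return False
    try:
        raw = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        # Whitespace, non-alphabet characters, or bad padding.
        return False
    return base64.b64encode(raw).decode("ascii") == text


def xop_optimize(root):
    """Replace canonical base64 content of leaf elements with a linking element.

    Returns a dict mapping hypothetical content IDs to the extracted bytes,
    which a real packager would carry as separate parts of the package.
    """
    parts = {}
    # Snapshot the tree first, because the loop adds child elements.
    for elem in list(root.iter()):
        if len(elem) == 0 and is_canonical_base64(elem.text):
            cid = f"part{len(parts) + 1}@example.org"  # hypothetical ID scheme
            parts[cid] = base64.b64decode(elem.text)
            elem.text = None
            ET.SubElement(elem, f"{{{XOP_NS}}}Include", href=f"cid:{cid}")
    return parts


doc = ET.fromstring("<doc><photo>aGVsbG8=</photo><note>not base64!</note></doc>")
parts = xop_optimize(doc)
print(parts)
print(ET.tostring(doc, encoding="unicode"))
```

Only the `photo` element is rewritten; the `note` element contains characters outside the base64 alphabet and is left alone. A purely lexical check like this one would also treat an ordinary four-letter word such as "Test" as base64, so the sketch should not be read as a complete selection rule.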


The XML Protocol Working Group has also posted a new working draft of the SOAP Message Transmission Optimization Mechanism, which relies on XOP. Quoting from the introduction:

Unlike SOAP itself, which is defined in terms of XML Infosets [XML InfoSet], this feature models message envelopes using the XQuery 1.0 and XPath 2.0 Data Model [XML Query Data Model], which is a typed superset of the Infoset. This feature uses type information only for optimization purposes; it does not provide for reconstruction of type information at receivers, except as necessary to support optimization. Nonetheless, use of the Data Model in this specification facilitates optimized transmission of query results through SOAP, and should provide a useful foundation if, for example, digital signature canonicalizations were to be developed for Data Model instances. Use of the Data Model here should also facilitate the work of those who may wish to develop features to provide for optimized transmission of the full typed Data Model: the changes needed to this specification should be straightforward, and the optimizations provided herein should be easy to generalize for such use.

The usage of the Abstract Transmission Optimization Feature is a hop-by-hop contract between a SOAP node and the next SOAP node in the SOAP message path, providing no mandatory convention for optimization of SOAP transmission through intermediaries. The feature does provide optional means by which binding implementations MAY choose to facilitate the efficient pass-through of optimized data contained within headers or bodies relayed by an intermediary (see 2.3.4 Binding Optimizations at Intermediaries). Additional specifications might also be written to provide for other optimized multi-hop capabilities, perhaps building on the mechanisms provided herein.

The second part (3. An Optimized MIME Multipart Serialization of SOAP Messages) describes an Optimized MIME Multipart Serialization of SOAP Messages implementing the Abstract Transmission Optimization Feature in a binding independent way. This implementation relies on the [XOP] format.

The third part (4. HTTP Transmission Optimization Feature) uses this Optimized MIME Multipart Serialization of SOAP Messages for describing an implementation of the Abstract Transmission Optimization Feature for the SOAP 1.2 HTTP binding (see [SOAP Part 2] 7. SOAP HTTP Binding).

I find the strong typing implicit in this model to be seriously broken. It loses information (i.e., it's a lossy compression format) and makes too many assumptions about what content is and is not relevant.