The W3C Web API Working Group has published the second working draft of Language Bindings for DOM Specifications. "“Language Bindings for DOM Specifications” is intended to specify in detail the IDL language used by W3C specifications to define DOM interfaces, and to provide precise conformance requirements for ECMAScript and Java bindings of such interfaces. It is expected that this document acts as a guide to implementors of already-published DOM specifications, and that newly published DOM specifications reference this document to ensure conforming implementations of DOM interfaces are interoperable."
The W3C Semantic Web Activity has posted a working draft of Experiences with the conversion of SenseLab databases to RDF/OWL. "One of the challenges facing Semantic Web for Health Care and Life Sciences is that of converting relational databases into Semantic Web format. The issues and the steps involved in such a conversion have not been well documented. To this end, we have created this document to describe the process of converting SenseLab databases into OWL. SenseLab is a collection of relational (Oracle) databases for neuroscientific research. The conversion of these databases into RDF/OWL format is an important step towards realizing the benefits of Semantic Web in integrative neuroscience research. This document describes how we represented some of the SenseLab databases in Resource Description Framework (RDF) and Web Ontology Language (OWL), and discusses the advantages and disadvantages of these representations. Our OWL representation is based on the reuse of existing standard OWL ontologies developed in the biomedical ontology communities. The purpose of this document is to share our implementation experience with the community."
Mildly interesting, but why this is a working draft instead of a note, or why it's even published by the W3C, I can't quite figure out. This is a case study at most, not a specification of anything in particular.
The W3C Math Working Group has posted the third public working draft of Mathematical Markup Language (MathML) Version 3.0. Changes since 2.0 include content dictionaries, "a mechanism for recording that a particular notational structure has a particular mathematical meaning". Version 3.0 is also supposed to enable easier markup of elementary school mathematics.
The Modis Team has released Sedna 3.0, an open source native XML database for Windows and Linux written in C++ and Scheme and published under the Apache License 2.0. Sedna supports XQuery and its own declarative update language. This release fixes bugs and improves transaction support.
Of the open source XML databases, this is the one I know the least about. Anyone want to comment on this one?
The W3C XHTML 2 Working Group has posted the third public working draft of
CURIE Syntax 1.0:
A syntax for expressing Compact URIs. This is modeled after namespace URIs and qualified names. In brief, it defines a prefix for a known base IRI (a URI that can contain non-ASCII characters like é),
then appends a colon and a local part.
For example, the CURIE cafe:tradeshows.xml could be shorthand for
http://www.cafeaulait.org/tradeshows.xml if the prefix
cafe were mapped to the URL
http://www.cafeaulait.org/.
Exactly how prefixes are mapped to base IRIs is left to the specification of the documents in which the CURIEs appear. However,
if the CURIEs are in an XML document, then the namespaces in scope define the
prefix mappings. The default namespace can be used for prefix-less CURIEs.
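The expansion rule is simple enough to sketch in a few lines of Python. This is only an illustration of the prefix-plus-local-part idea described above, not the algorithm from the draft: the function name `expand_curie`, the `default` parameter, and the dictionary of mappings are all my own assumptions.

```python
from typing import Dict, Optional


def expand_curie(curie: str, prefixes: Dict[str, str],
                 default: Optional[str] = None) -> str:
    """Expand prefix:local into the mapped base IRI plus the local part."""
    prefix, colon, local = curie.partition(":")
    if not colon:
        # No colon at all: treat it as a prefix-less CURIE and fall back
        # to the default mapping, if the host document supplied one.
        if default is None:
            raise ValueError("no default mapping for " + repr(curie))
        return default + curie
    if prefix not in prefixes:
        raise KeyError("undeclared prefix " + repr(prefix))
    return prefixes[prefix] + local


# The mapping would normally come from the host document;
# here it is hard-coded for illustration.
prefixes = {"cafe": "http://www.cafeaulait.org/"}
print(expand_curie("cafe:tradeshows.xml", prefixes))
# prints http://www.cafeaulait.org/tradeshows.xml
```

Note that the same string expands differently, or not at all, depending on which mapping is passed in.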
Frankly, I'm surprised to see this. Namespaces and the namespace syntax are one of the notable failures of the XML ecosystem. Why someone would choose to imitate this now that we know better is beyond me. Based on experience with namespaces, I predict that moving CURIEs from one context to another is going to be especially problematic. Well, we've learned to live with (if not exactly like) namespaces. I guess we can get used to this.
The Unicode Consortium has released Unicode 5.1:
This release contains over 100,000 characters, and provides significant additions and improvements that extend text processing for software worldwide. Some of the key features are: increased security in data exchange, significant character additions for Indic and South East Asian scripts, expanded identifier specifications for Indic and Arabic scripts, improvements in the processing of Tamil and other Indic scripts, linebreaking conformance relaxation for HTML and other protocols, strengthened normalization stability, new case pair stability, plus others given below.
The Version 5.1.0 data files and documentation are final and posted on the Unicode site. In addition to updated existing files, implementers will find new test data files (for example, for linebreaking) and new XML data files that encapsulate all of the Unicode character properties. For details, see the page for Unicode 5.1.0 at http://www.unicode.org/versions/Unicode5.1.0/.
A major feature of Unicode 5.1.0 is the enabling of ideographic variation sequences. These sequences allow standardized representation of glyphic variants needed for Japanese, Chinese, and Korean text. The first registered collection, from Adobe Systems, is now available at http://www.unicode.org/ivd/.
Unicode 5.1 contains significant changes to properties and behaviorial specifications. Several important property definitions were extended, improving linebreaking for Polish and Portuguese hyphenation. The Unicode Text Segmentation Algorithms, covering sentences, words, and characters, were greatly enhanced to improve the processing of Tamil and other Indic languages. The Unicode Normalization Algorithm now defines stabilized strings and provides guidelines for buffering. Standardized named sequences are added for Lithuanian, and provisional named sequences for Tamil.
Unicode 5.1.0 adds 1,624 newly encoded characters. These additions include characters required for Malayalam and Myanmar and important individual characters such as Latin capital sharp s for German. Version 5.1 extends support for languages in Africa, India, Indonesia, Myanmar, and Vietnam, with the addition of the Cham, Lepcha, Ol Chiki, Rejang, Saurashtra, Sundanese, and Vai scripts. Scholarly support includes important editorial punctuation marks, as well as the Carian, Lycian, and Lydian scripts, and the Phaistos disc symbols. Other new symbol sets include dominoes, Mahjong, dictionary punctuation marks, and math additions. This latest version of the Unicode Standard has exactly the same character assignments as ISO/IEC 10646:2003 plus Amendments 1 through 4.
The Unicode Collation Algorithm (UCA), the core standard for sorting all text, is also being updated at the same time (see http://www.unicode.org/reports/tr10/). The major changes in UCA include coverage of all Unicode 5.1 characters, tightened conformance for canonical equivalence, clearer definitions of internationalized search and matching, specifications of parameters for customizing collation, and definitions of collation folding. There are also important clarifications on the use of contractions (such as "ch" in Slovak) in collation.
The next version of the Unicode locale project (CLDR) is also being prepared on the basis of Unicode 5.1, and is now open for public data submission (see http://www.unicode.org/cldr/).
The W3C Web Security Context Working Group has posted an updated public working draft of Web Security Context: Experience, Indicators, and Trust.
This specification deals with the trust decisions that users must make online, and with ways to support them in making safe and informed decisions where possible.
In order to achieve that goal, this specification includes recommendations on the presentation of identity information by Web user agents; on handling errors in security protocols in a way that minimizes the trust decisions left to users, and (we hope) induces them toward safe behavior where they have to make these decisions; and on data entry interactions that (we hope, again) will make it easier for users to enter sensitive data into legitimate sites than to enter them into illegitimate sites.
Where this document specifies user interactions with a goal toward making security usable, no claim is made at this time that this goal is met: As noted in the Status of this Document section, this is an initial draft to trigger discussion and commentary; assume that what is proposed here is untested.
To complement the interaction and decision related parts of this specification, 7 Robustness addresses the question of how the communication of context information needed to make decisions can be made more robust against attacks.
Finally, 8 Authoring and deployment best practices is about practices for those who deploy Web Sites. It complements some of the interaction related techniques recommended in this specification. The aim of this section is to provide guidelines for creating Web sites with reduced attack surfaces against certain threats, and with usefully provided security context information.
This specification comes with two companion documents: [WSC-USECASES] documents the use cases and assumptions that underly this specification. [WSC-THREATS] documents the Working Group's threat analysis.
The W3C XML Core Working Group has a new last call working draft of the XML Linking Language (XLink) Version 1.1. There are three major changes in XLink 1.1 compared to 1.0:
The xlink:type="simple" attribute is no longer required. That is, a simple link can now be written like this:
<composer xlink:href="http://www.beand.com/">Beth Anderson</composer>
It's no longer necessary to write this:
<composer xlink:type="simple" xlink:href="http://www.beand.com/">Beth Anderson</composer>
This is a good thing. I'm not sure who first came up with this idea, but I've been advocating it for a while now. This makes XLink a lot more palatable in applications like SVG.
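For anyone writing XLink-aware code, the practical upshot is a slightly different test for "is this element a simple link?" Here's a rough sketch in Python, based on my reading of the change: an element with an xlink:href but no xlink:type is treated as a simple link. The helper name `is_simple_link` is made up for this example, and I've added the xmlns:xlink declaration that the snippets above omit.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"


def is_simple_link(element: ET.Element) -> bool:
    """Return True if the element should be treated as a simple XLink."""
    link_type = element.get("{%s}type" % XLINK_NS)
    href = element.get("{%s}href" % XLINK_NS)
    if link_type is not None:
        # An explicit xlink:type always wins (the XLink 1.0 style).
        return link_type == "simple"
    # XLink 1.1 style: xlink:href alone is enough.
    return href is not None


new_style = ET.fromstring(
    '<composer xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="http://www.beand.com/">Beth Anderson</composer>'
)
old_style = ET.fromstring(
    '<composer xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" '
    'xlink:href="http://www.beand.com/">Beth Anderson</composer>'
)
print(is_simple_link(new_style))  # prints True
print(is_simple_link(old_style))  # prints True
```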
It's not immediately clear what changes necessitated going back from the previous candidate recommendation to a last call status again.
The Mozilla Project has posted the fifth beta of Firefox 3.0 for Mac, Linux, and Windows. "Firefox 3 is based on the Gecko 1.9 Web rendering platform, which has been under development for the past 32 months. Building on the previous release, Gecko 1.9 has more than 12,000 updates including some major re-architecting to provide improved performance, stability, rendering correctness, and code simplification and sustainability. Firefox 3 has been built on top of this new platform resulting in a more secure, easier to use, more personal product with a lot more under the hood to offer website and Firefox add-on developers. [Improved in Beta 5!] Firefox 3 Beta 5 includes more than 750 changes from the previous beta, improving stability and web compatibility, providing platform and user interface enhancements, and resulting in the fastest Firefox ever. Many of these improvements were based on community feedback from the previous beta."
The W3C XML Security Specifications Maintenance Working Group
has posted the Proposed Edited Recommendation of
XML Signature Syntax and Processing (Second Edition).
"This Proposed Second Edition of XML Signature Syntax and
Processing adds Canonical XML 1.1 as a required
canonicalization algorithm and recommends its use for inclusive
canonicalization. This version of Canonical XML enables use of
xml:id and xml:base Recommendations
with XML Signature and also enables other possible future
attributes in the XML namespace. Additional minor changes,
including the incorporation of known errata, are documented in
Changes in XML Signature Syntax and Processing
(Second Edition)." I have to read through the detailed changes, but at first glance this looks like a reasonable adjustment that doesn't break any existing code.
The W3C XSL Working Group has published the requirements for the XSL Formatting Objects 2.0. "A number of XSL 1.0 implementations already support dynamic inclusion of vector graphics using W3C SVG. The XSL and SVG WGs want to define a tighter interface between XSL-FO and SVG to provide enhanced functionality. Experiments with the use of SVG paths to create non-rectangular text regions, or 'run-arounds', have helped to motivate further work on deeper integration of SVG graphics inside XSL-FO documents, and to work with the SVG WG on specifying the meaning of XSL-FO markup inside SVG graphics. A similar level of integration with MathML is contemplated."
Cambridge University's Toby O. H. White has released FoX, an open source, validating XML parser written in Fortran 95. It includes both SAX-like push and DOM interfaces. FoX is published under a BSD license.
The OpenOffice Project has released OpenOffice 2.4, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML. New features in 2.4 include:
- Connect to WebDAV servers via HTTPS
- Custom icons for toolbars are imported
- Control password-storing with a master password
- Warning if document is from a newer ODF
- PDF documents: relative links, document references, PDF/A-1 (ISO 19005-1) supported, and cross-document link behavior options
- Mac OS X: Quicktime support for movies and sound / use the built in spell checker
- Print dialog improvements in usability
- Edit boxes: warning at limit of characters
- DejaVu font is now default instead of BitStream Vera
Localisation
- Entries for 10 languages added
Base / DBA
- Improved rendering of numeric(n) data from JDBC and Oracle
- Easier choice of table name in "Copy table"
- Editing of views in HSQLDB
- Query designer for all properties which allow SQL command
- Query designer in SQL view
- Relation design accessible for MySQL databases
- Setting to check for required fields on forms
- Support for Access 2007 (.accdb files)
Calc
- Convert text to columns: with this feature CSV data inside cells can be transformed into columns directly
- Columns and rows in spreadsheet can be moved with drag and drop
- Enter key returns to the column where the input started, one row below
- Formula input: "+" and "-" can also be used to start
- Individual zoom level per sheet
- AutoFilter: choices clearer grouped and based on result of filtering in other columns
- DataPilot: Manual Sorting / Double-click in DataPilot cell provides calculation data of that cell
- Performance improvement with functions VLOOKUP and MATCH
- Print dialog for Calc easier to use
- PageUp and PageDown keys work in print preview
- Sheet names in cell-hyperlinks: renamed properly
Chart
- Regression curves: show equations and R² value
- Reverse axes possible
- Bars on different axes displayed next to each other
- Data labels: Number format
- Data point label: display both value and percentage
- Data label: display each part in a separate line
- Data labels: more flexible placement of labels
- Labels on pie segments: avoiding overlapping
- Data point label: can be removed with delete key
Draw
- Navigation (tab) order of page objects
- PDF export: page names as bookmark
- Reduce complexity: no longer necessary display options removed
Impress
- Navigation (tab) order of page objects
- Thrilling 3D effects in slide transitions
- Export slide names as PDF bookmarks
- Easier to insert background picture
Writer
- Selecting rectangular region of text
- Find and Replace: backward references in regular expressions
- Spell checking: easier selecting of the language
- Insert&Insert Object toolbar redesign - Writer
- Printing of hidden text can be turned on
- Printing text place holders can be turned off
- Shortcuts added for paragraph style Heading 4, Heading 5 and Textbody
- Ctrl-click behaviour for hyperlinks can be changed
- Custom document properties: Text fields and UI support
Extensions/ programmability / API
- Extensible Help System for extensions
- Extensions can have a separate display name
- Extensions: support of web based update
- Extensions: additional information about the publisher and release notes
- Extensions: check for updates
- Dialogs can have a wallpaper set
- Transparent background for controls
- Remote control presentations via API
- API: get selected table(s) or query(s) in the main Base window
The Mozilla Project has released Firefox 2.0.0.13. This release fixes a number of security bugs. All users should upgrade.
A new version of SeaMonkey has also been posted, though Camino doesn't seem to have been updated yet. Camino users may want to switch to Firefox or Safari for the time being.
The W3C Semantic Web Best Practices and Deployment Working Group and HTML Working Groups have published a new working draft of RDFa Primer 1.0.
Current Web pages, written in XHTML, contain inherent structured data: calendar events, contact information, photo captions, song titles, copyright licensing information, etc. When authors and publishers can express this data precisely, and when tools can read it robustly, a new world of user functionality becomes available, letting users transfer structured data between applications and Web sites. An event on a Web page can be directly imported into a desktop calendar. A license on a document can be detected to inform the user of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself.
RDFa lets XHTML authors express this structured data using existing XHTML attributes and a handful of new ones. Where data, such as a photo caption, is already present on the page for human readers, the author need not repeat it for automated processes to access it. A Web publisher can easily reuse data fields, e.g. an event's date, defined by other publishers, or create new ones altogether. RDFa gets its expressive power from RDF [RDFPRIMER], though the reader need not understand RDF before reading this document.
For simplicity, instead of using RDF terminology, we use the word "field" to indicate a unit of labeled information, e.g. the "first name" field indicates a person's first name.
RDFa uses Compact URIs, which express a URI using a prefix, e.g. dc:title where dc: stands for http://purl.org/dc/elements/1.1/. In this document, for simplicity's sake, the following prefixes are assumed to be already declared: dc for Dublin Core [DC], foaf for Friend-Of-A-Friend [FOAF], cc for Creative Commons [CC], and xsd for XML Schema Definitions [XSD]:
dc: http://purl.org/dc/elements/1.1/
foaf: http://xmlns.com/foaf/0.1/
cc: http://creativecommons.org/ns#
xsd: http://www.w3.org/2001/XMLSchema#
We use standard XHTML notation for elements and attributes: both are denoted using fixed-width lowercase font, e.g. div, and attributes are differentiated using a preceding '@' character, e.g. @href.
Here's a syntax example from the draft:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
<head>
<title>Jo's Friends and Family Blog</title>
</head>
<body>
...
<p instanceof="cal:Vevent">
I'm holding
<span property="cal:summary">
one last summer Barbecue,
</span>
on
<span property="cal:dtstart" content="20070916T1600-0500">
September 16th at 4pm.
</span>
</p>
...
<p class="contactinfo" about="http://example.org/staff/jo">
<span property="contact:fn">Jo Smith</span>.
<span property="contact:title">Web hacker</span>
at
<a rel="contact:org" href="http://example.org">
Example.org
</a>.
You can contact me
<a rel="contact:email" href="mailto:jo@example.org">
via email
</a>.
</p>
...
</body>
</html>
The thing that jumps out at me is the use of namespace prefixes in attribute values. Haven't we learned by now that this is a bad idea?
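To make the complaint concrete, here's a small Python sketch of what a namespace-aware parser does with a prefix in an element name versus the same prefix in an attribute value. The `cal:summary` element is invented for the comparison (the draft only uses the prefix inside attribute values), and ElementTree is simply the parser I had at hand.

```python
import xml.etree.ElementTree as ET

doc = """<p xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
  <cal:summary property="cal:summary">one last summer Barbecue</cal:summary>
</p>"""

root = ET.fromstring(doc)
child = root[0]

# The parser resolves the prefix in the element name for us...
print(child.tag)
# prints {http://www.w3.org/2002/12/cal/ical#}summary

# ...but the attribute value is just an opaque string with a colon in it.
print(child.get("property"))
# prints cal:summary
```

The elements returned here carry no prefix map, so the application has to track the in-scope bindings itself if it wants to expand that attribute value. Move the span to a document that binds cal to something else, and the meaning silently changes.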
The W3C has published the second working draft of Cool URIs for the Semantic Web:
The Semantic Web is envisioned as a decentralised world-wide information space for sharing machine-readable data with a minimum of integration costs. Its two core challenges are the distributed modelling of the world with a shared data model, and the infrastructure where data and schemas can be published, found and used. Users benefit from getting information "raw and now" [Give] and in portable data formats [DP]. Providers often publish data embedded in a fixed user interface, in HTML. A basic question is thus how to publish information about resources in a way that allows interested users and software applications to find and interpret them.
On the Semantic Web, all information has to be expressed as statements about resources, like the members of the company Example.com are Alice and Bob or Bob's telephone number is "+1 555 262 or this Web page was created by Alice. Resources are identified by Uniform Resource Identifiers (URIs) [RFC3986]. This modelling approach is at the heart of Resource Description Framework (RDF) [RDFPrimer]. A nice introduction is given in the N3 primer [N3Primer].
Using RDF, the statements can be published on the Web site of the company. Others can read the data and publish their own information, linking to existing resources. This forms a distributed model of the world. It allows the user to pick any application to view and work with the same data, for example to see Alice's published address in your address book.
At the same time, Web documents have always been addressed with URIs (in common parlance often referred as Uniform Resource Locators, URLs). This is useful because it means we can easily make RDF statements about Web pages, but also dangerous because we can easily mix up Web pages and the things, or resources, described on the page.
So the question is, what URIs should we use in RDF? As an example, to identify the frontpage of the Web site of Example Inc., we may use http://www.example.com/. But what URI identifies the company as an organisation, not a Web site? Do we have to serve any content—HTML pages, RDF files—at those URIs? In this document we will answer these questions according to relevant specifications. We explain how to use URIs for things that are not Web pages, such as people, products, places, ideas and concepts such as ontology classes. We give detailed examples how the Semantic Web can (and should) be realised as a part of the Web.
Oracle's John Snelson has posted a beta of Faxpp, an
open source
XML pull parser written in C with an API that can return
UTF-8 or UTF-16 strings. Faxpp is published under the Apache License v2.
The W3C has published a proposed edited recommendation of XML Base (Second Edition). Changes since the first edition include:
The published errata (see http://www.w3.org/2001/06/xmlbase-errata) have been incorporated;
The definition of URI reference has been switched from RFC2396 to 3986;
The xml:base attribute has been redescribed as a Legacy Extended IRI, but this does not change its syntax (the December 2006 PER used the term "XML Resource Identifier" which was to be defined in an XLink revision, but that plan has been superseded by the definition of LEIRI in RFC 3987 bis);
Implementations are now encouraged to return base “URIs” without escaping non-URI characters;
The meanings of xml:base="" and xml:base="#frag" have been clarified;
The expected reference to XML Base in the forthcoming XML Media Types RFC (“son of 3023”) has been noted;
It has been clarified that normal validity rules apply to the xml:base attribute;
The out-of-date appendix describing effects on other standards has been removed;
Apple has released Safari 3.1 for Mac and Windows. This release speeds up JavaScript and plugs some security holes. New features include:
- video and audio elements
- img element and CSS now support SVG images (Is inline SVG supported? I'll have to check. Yep, looks like it works but only if Safari recognizes the document as XHTML, not HTML. Firefox behaves similarly.)
The W3C Working Group has published a new working draft of Protocol for Web Description Resources (POWDER): Description Resources.
The Protocol for Web Description Resources (POWDER) facilitates the publication of descriptions of multiple resources such as all those available from a Web site. These descriptions are always attributed to a named individual, organization or entity that may or may not be the creator of the described resources. This contrasts with more usual metadata that typically applies to a single resource, such as a specific document's title, which is usually provided by its author.
This document sets out how Description Resources (DRs) can be created and published, whether individually or as bulk data, how to link to DRs from other online resources, and, crucially, how DRs may be authenticated and trusted. The aim is to provide a platform through which opinions, claims and assertions about online resources can be expressed by people and exchanged by machines. POWDER has evolved from the data model developed for the final report [XGR] of the Web Content Label Incubator Group [WCL-XG] from which we define a Description Resource as: "a resource that contains a description, a definition of the scope of the description and assertions about both the circumstances of its own creation and the entity that created it."
The method of defining the scope of a DR, that is, defining what is being described, is provided in a separate document: Grouping of Resources [GROUP]. Companion documents describe the RDF/OWL vocabulary [VOC] and XML data types [WDRD] that are derived from the Grouping of Resources document and this document, with each term's domain, range and constraints defined. As each term is introduced in this document, it is linked to its description in the vocabulary document.
The W3C XQuery working group has posted the candidate recommendations of XQuery Update Facility, XQuery Update Facility Use Cases, and XQuery Update Facility 1.0 Requirements. XQuery as it currently exists is basically just SELECT in SQL terms. XQuery Update adds INSERT, UPDATE, and DELETE. More specifically it is:
- upd:mergeUpdates
- upd:revalidate
- upd:applyUpdates
- upd:insertBefore
- upd:insertAfter
- upd:insertInto
- upd:insertIntoAsFirst
- upd:insertIntoAsLast
- upd:insertAttributes
- upd:delete
- upd:replaceNode
- upd:replaceValue
- upd:replaceElementContent
- upd:rename
- upd:removeType
- upd:setToUntyped
This is one of the last two pieces before XQuery 1.0 is really complete. (The other is full-text search.)
The Helsinki University of Technology has released X-Smiles 1.2, a proof-of-concept XForms engine written in Java. Version 1.2 improves support for XBL 2 bindings.
The W3C Authoring Tool Accessibility Guidelines Working Group has posted new working drafts of Authoring Tool Accessibility Guidelines 2.0 and Implementation Techniques for Authoring Tool Accessibility Guidelines 2.0. "An authoring tool that conforms to these guidelines will promote accessibility by providing an accessible user interface to authors with disabilities as well as enabling, supporting, and promoting the production of accessible Web content by all authors."
The W3C Web API Working Group has published the last call working draft of ElementTraversal Specification. "This specification defines the ElementTraversal interface, which allows script navigation of the elements of a DOM tree, excluding all other nodes in the DOM, such as text nodes. It also provides a property to expose the number of child elements of an element. It is intended to provide a more convenient alternative to existing DOM navigation interfaces, with a low implementation footprint." Hmm, just what the DOM needs: yet another way to do it.
ElementTraversal provides some extra properties/methods for navigating only through elements, while ignoring text and white space:
- firstElementChild
- lastElementChild
- previousElementSibling
- nextElementSibling
- childElementCount
This makes it easier to process record-like XML, but inappropriate for reading documents with mixed content.
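To show what these properties save you, here's a sketch that emulates three of them in Python on top of the standard xml.dom.minidom, which does not provide them itself. The helper functions are my own stand-ins, not part of the specification or of minidom.

```python
from xml.dom import minidom, Node


def element_children(node):
    """All child nodes that are elements, skipping text, comments, etc."""
    return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]


def first_element_child(node):
    kids = element_children(node)
    return kids[0] if kids else None


def next_element_sibling(node):
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != Node.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling


def child_element_count(node):
    return len(element_children(node))


doc = minidom.parseString(
    "<people>\n  <person>Alice</person>\n  <person>Bob</person>\n</people>"
)
root = doc.documentElement

print(len(root.childNodes))        # prints 5 (whitespace text nodes count)
print(child_element_count(root))   # prints 2

first = first_element_child(root)
print(first.firstChild.data)                        # prints Alice
print(next_element_sibling(first).firstChild.data)  # prints Bob
```

The whitespace-only text nodes are exactly what gets skipped, which is fine for records like these. It is also why the approach loses information on mixed content, where the text between elements matters.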
The Mozilla Project has posted the fourth beta of Firefox 3.0 for Mac, Linux, and Windows. This is code named "Gran Paradiso". "Firefox 3 is based on the new Gecko 1.9 Web rendering platform, which has been under development for the past 28 months and includes nearly 2 million lines of code changes, fixing more than 11,000 issues. Gecko 1.9 includes some major re-architecting for performance, stability, correctness, and code simplification and sustainability. Firefox 3 has been built on top of this new platform resulting in a more secure, easier to use, more personal product with a lot under the hood to offer website and Firefox add-on developers. [Improved in Beta 4!] Firefox 3 Beta 4 includes more than 900 enhancements from the previous beta, including drastic improvements to performance and memory usage, as well as fixes for stability, platform enhancements and user interface improvements. Many of these improvements were based on community feedback from the previous beta."
Sun has posted version 0.5.5 of xmlroff, an open source XSL Formatting Objects to PDF and PostScript converter. (Web site not yet updated though.) xmlroff is written in C for Linux, and relies on libxml2, libxslt, and the GLib and GObject libraries from GTK+ and GNOME (though neither GTK+ nor GNOME is required). It also needs PDFlib, FreeType2, and Fontconfig. xmlroff can be run from the command line. It also includes a libfo library. This version improves table rendering.
I've posted the updated notes from today's XForms talk at SD 2008 West. I suspect I'll be retiring this one after this week. There seems to be very limited interest, and the software is just not making fast enough progress. I last gave this talk three years ago, and the progress since then has been glacial. The action's all in AJAX and, maybe, HTML 5. Waiting for third parties to finish specs and software just doesn't work in Internet time.
Microsoft has posted the first public beta of Internet Explorer 8 for Windows:
Beta 1 is a developer preview for web designers and developers to help prepare their websites for the launch of Internet Explorer 8. Some of the new features designed for developers include a developer toolbar and improved interoperability and compatibility.
Internet Explorer 8 is designed to work in standard mode out of the box. However, Microsoft provides a way for users to browse the web in a way similar to Internet Explorer 7 by using the emulate Internet Explorer 7 button on the chrome.
Updates have been and likely will continue to be a little slow this week since I'm busy at SD 2008 West. However I have posted the notes from my first two sessions, RSS, Atom, APP, and All That and Native XML Databases.
The W3C has posted the first public working draft of SKOS Simple Knowledge Organization System Primer:
SKOS — Simple Knowledge Organisation System — provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other types of controlled vocabulary. As an application of the Resource Description Framework (RDF) SKOS allows concepts to be documented, linked and merged with other data, while still being composed, integrated and published on the World Wide Web.
This document is an implementors guide for those who would like to represent their concept scheme using SKOS.
In basic SKOS, conceptual resources (concepts) can be identified using URIs, labelled with strings in one or more natural languages, documented with various types of notes, semantically related to each other in informal hierarchies and association networks, and aggregated into distinct concept schemes.
In advanced SKOS, conceptual resources can be mapped to conceptual resources in other schemes and grouped into labelled or ordered collections. Concept labels can also be related to each other. Finally, the SKOS vocabulary itself can be extended to suit the needs of particular communities of practice.
This document is a companion to the SKOS Reference, which gives the normative reference on SKOS.
The W3C Cascading Style Sheets Working Group has posted the first public working draft of CSSOM View Module. "The APIs introduced by this specification provide authors with a way to inspect and manipulate the view information of a document. This includes getting the position of element layout boxes, obtaining the width of the viewport through script, and also scrolling an element."
The W3C Web API Working Group has posted the first public working draft of XMLHttpRequest Level 2. "XMLHttpRequest Level 2 enhances XMLHttpRequest with new features, such as cross-site requests, progress events, and the handling of byte streams for both sending and receiving." I'm afraid I'm not familiar enough with XMLHttpRequest Level 1 to tell immediately what's new here. Anyone want to summarize?
Addison-Wesley is looking for a few kind folks to contribute cover blurbs for Refactoring HTML, and possibly a foreword. If you're interested, drop me a line, and I'll pass your info along to my editors so you can get a preview copy of the book.
XimpleWare has released VTD-XML 2.3, a free (GPL) non-extractive Java/C/C# library for processing XML that supports XPath. This appears to be an example of what Sam Wilmot calls "in situ parsing". In other words, rather than creating objects representing the content of an XML document, VTD-XML just passes pointers into the actual, real XML. (These are the abstract pointers of your data structures textbook, not C-style addresses in memory. In this case the pointers are int indexes into the file.) You don't even need to hold the document in memory. It can remain on disk. This should improve speed and memory usage, but I haven't verified that, and I don't trust their own benchmarks. Version 2.3 fixes bugs, adds more encodings, and can dump an in-memory copy of the text. However it's still not a minimally conformant XML parser, and doesn't seem likely to become one. That severely reduces my interest.
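To make the in situ idea concrete, here is a toy Python sketch. It is not VTD-XML's API or its record format; the Token tuple and the scan_start_tags and text_of helpers are names invented purely for illustration, and the scanner is deliberately naive. The point is only that each token is an (offset, length) pair into the original bytes, and text is sliced out of the buffer only when somebody asks for it.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    """Hypothetical non-extractive record: a position in the buffer, no copied text."""
    offset: int
    length: int


# Deliberately naive: matches ASCII start-tag names only. A real parser must
# also handle comments, CDATA, entities, encodings, and well-formedness checks.
START_TAG = re.compile(rb"<([A-Za-z_][A-Za-z0-9_.-]*)")


def scan_start_tags(buf: bytes) -> list[Token]:
    """Record where each start-tag name lives in buf without extracting it."""
    return [Token(m.start(1), m.end(1) - m.start(1)) for m in START_TAG.finditer(buf)]


def text_of(buf: bytes, tok: Token) -> bytes:
    """Materialize a token's text only on demand."""
    return buf[tok.offset:tok.offset + tok.length]


doc = b"<order><item>tea</item><item>milk</item></order>"
tokens = scan_start_tags(doc)
names = [text_of(doc, t) for t in tokens]
```

Because the tokens are just integers, the same approach would work against a memory-mapped file instead of an in-memory bytes object, which is presumably where the claimed memory savings come from.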
A "Rough Cut" version of Refactoring HTML is now available on Safari. For some reason, my Safari account doesn't allow me to log in and read this, so I'm not sure exactly which version is there. (I just finished reviewing the copy edits this past week.) Online access is $27.99. If you also want the printed book shipped to you when it's released (hopefully in time for JavaOne in May), the combined price is $53.98. Or you can pre-order the printed book from Amazon for $39.99.
The W3C Semantic Web Deployment Working Group and XHTML 2 Working Group have posted the last call working draft of RDFa in XHTML: Syntax and Processing.
The modern Web is made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported into a user's desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo's creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing.
RDFa is a specification for attributes to be used with languages such as HTML and XHTML to express structured data. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't need to repeat significant data in the document content. This document only specifies the use of the RDFa attributes with XHTML. The underlying abstract representation is RDF [RDF-PRIMER], which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.
The rules for interpreting the data are generic, so that there is no need for different rules for different formats; this allows authors and publishers of data to define their own formats without having to update software, register formats via a central authority, or worry that two formats may interfere with each other.
RDFa shares some use cases with microformats. Whereas microformats specify both a syntax for embedding structured data into HTML documents and a vocabulary of specific terms for each microformat, RDFa specifies only a syntax and relies on independent specification of terms (often called vocabularies or taxonomies) by others. RDFa allows terms from multiple independently-developed vocabularies to be freely intermixed and is designed such that the language can be parsed without knowledge of the specific term vocabulary being used.
This document is a detailed syntax specification for RDFa, aimed at:
- those looking to create an RDFa parser, and who therefore need a detailed description of the parsing rules;
- those looking to recommend the use of RDFa within their organisation, and who would like to create some guidelines for their users;
- anyone familiar with RDF, and who wants to understand more about what is happening 'under the hood', when an RDFa parser runs.
I think I'm just about ready to declare this a dead technology. The problem is that explicit, semantic markup is simply of little to no interest to the vast majority of content creators. Not enough people are willing to put in the extra effort to identify the relevant parts. The only way to find licenses on a document, events on a web page, and so forth is to make the computers clever enough to recognize such things from the plain text content and associated formatting.
If RDF, XML, Schemas, and everything else we've been working on for the last ten years haven't succeeded in breaking the WYSIWYG barrier yet, why do we think they're suddenly going to do so now? Sooner or later, it's time to admit that the enterprise is fundamentally misguided. For RDF and the semantic web that time has come.
The W3C Voice Browser, Web APIs, and Web Application Formats (WAF) Working Groups have posted a new draft of Access Control for Cross-site Requests (formerly "Enabling Read Access for Web Resources" and "Authorizing Read Access to XML Content Using the <?access-control?> Processing Instruction 1.0"). According to the draft,
Cross-site requests are possible using the HTML img and script elements for instance. However, it is not possible to exchange the contents of resources or manipulate resources "cross-domain". This is to prevent information leakage and to ensure that malicious site can not delete your calendar data with cross-site requests using the HTTP DELETE method.
The policy this document introduces allows a resource to opt-in to allowing cross-site data retrieval of it and also enables a mechanism based on the same policy to allow a resource to opt-in to requests using an HTTP method other than GET. This policy builds on top of the existing restrictions already in place. This policy described in this document can only be used by a technology, such as XMLHttpRequest or XBL, when the respective specification of that technology describes how it applies.
The access control policy is defined in the resource that might be obtained and is expected to be enforced by the client that retrieves and processes the resource. Thus the client is trusted and acts as a policy enforcement point.
If you have a simple text resource residing at http://example.com/hello which contains the string "Hello World!" and you would like the hello-world.invalid domain to be able to access it the resource would look as follows (including one HTTP header that is significant):
    Access-Control: allow <hello-world.invalid>

    Hello World!
The hello-world.invalid can now access this document using XMLHttpRequest for instance with the following code:
    var client = new XMLHttpRequest();
    client.open("GET", "http://example.com/hello");
    client.onreadystatechange = function() { /* do something */ };
    client.send();
I've had this one explained to me repeatedly, and I still don't understand exactly what's going on here or why it isn't a security hole, but I guess there's a use case for it.
Eve Maler and Jeanne El Andaloussi have published Developing SGML DTDs: From Text to Model to Markup on the Web. A tad dated, but there's still a lot of good stuff here.
The W3C Internationalization Tag Set Working Group has posted the finished note on Best Practices for XML Internationalization. "This document provides a set of guidelines for developing XML documents and schemas that are internationalized properly. Following the best practices describes here allow both the developer of XML applications, as well as the author of XML content to create material in different languages."
The W3C CSS working group has posted the last call working draft of CSS Module: Namespaces. This module "defines the syntax for using namespaces in CSS. It defines the @namespace rule for declaring the default namespace and binding namespaces to namespace prefixes, and it also defines a syntax that other specifications can adopt for using those prefixes in namespace-qualified names."
Given the namespace declarations:
    @namespace toto "http://toto.example.org";
    @namespace "http://example.com/foo";
In a context where the default namespace applies
- toto|A represents the name A in the http://toto.example.org namespace.
- |B represents the name B that belongs to no namespace.
- *|C represents the name C in any namespace, including no namespace.
- D represents the name D in the http://example.com/foo namespace.
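The four cases in that example amount to a small lookup rule, which can be sketched in a few lines of Python. This is my own illustration, not anything from the draft: the resolve function and the ANY sentinel are hypothetical names, and the sketch assumes a context where the default namespace applies. What an unprefixed name means when no default is declared depends on the host specification, so here it simply comes back as None.

```python
from typing import Optional

ANY = object()  # sentinel for "*|name": any namespace, including no namespace


def resolve(qname: str, prefixes: dict[str, str], default: Optional[str]):
    """Return (namespace, local_name) for a CSS namespace-qualified name.

    namespace is a URI string, None for "no namespace", or the ANY sentinel.
    """
    if "|" not in qname:
        # Unprefixed name: the default namespace applies (assumed context)
        return (default, qname)
    prefix, local = qname.split("|", 1)
    if prefix == "":
        return (None, local)  # "|B": explicitly no namespace
    if prefix == "*":
        return (ANY, local)   # "*|C": any namespace
    if prefix not in prefixes:
        raise ValueError(f"undeclared namespace prefix: {prefix!r}")
    return (prefixes[prefix], local)


# The declarations from the quoted example
prefixes = {"toto": "http://toto.example.org"}
default = "http://example.com/foo"

results = {q: resolve(q, prefixes, default) for q in ["toto|A", "|B", "*|C", "D"]}
```

Raising an error for an undeclared prefix is a design choice of the sketch; a real CSS processor would treat the construct as invalid according to its own error-handling rules.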
Sun has posted version 0.5.4 of xmlroff, an open source XSL Formatting Objects to PDF and PostScript converter. xmlroff is written in C for Linux, and relies on libxml2, libxslt, and the GLib and GObject libraries from GTK+ and GNOME (though neither GTK+ nor GNOME itself is required). It also needs PDFlib, FreeType2, and Fontconfig. xmlroff can be run from the command line. It also includes a libfo library. This version fixes bugs.
IBM developerWorks has published my look ahead at The future of XML: How will you use XML in years to come?.
The Mozilla Project has posted the third beta of Firefox 3.0 for Mac, Linux, and Windows. This is code named "Gran Paradiso". "Firefox 3 is based on the new Gecko 1.9 Web rendering platform, which has been under development for the past 28 months and includes nearly 2 million lines of code changes, fixing more than 11,000 issues. Gecko 1.9 includes some major re-architecting for performance, stability, correctness, and code simplification and sustainability. Firefox 3 has been built on top of this new platform resulting in a more secure, easier to use, more personal product with a lot under the hood to offer website and Firefox add-on developers. [Improved in Beta 3!] Firefox 3 Beta 3 includes approximately 1300 individual changes from the previous beta, including fixes for stability, performance, memory usage, platform enhancements and user interface improvements. Many of these improvements were based on community feedback from the previous beta." I recommend skipping this release unless you need to test your own site. It's been breaking some Web-2.0ish sites.
The Mozilla Project has released Firefox 2.0.0.12. "This release fixes a number of security and stability issues discovered in Firefox 2.0.0.11." All users should upgrade.
New versions of SeaMonkey and Camino with these fixes have also been posted.
The W3C XML Core Working Group has published a proposed edited recommendation of XML 1.0, fifth edition. "This fifth edition is not a new version of XML. As a convenience to readers, it incorporates the changes dictated by the accumulated errata (available at http://www.w3.org/XML/xml-V10-4e-errata) to the Fourth Edition of XML 1.0, dated 16 August 2006. In particular, erratum [E09] relaxes the restrictions on element and attribute names, thereby providing in XML 1.0 the major end user benefit currently achievable only by using XML 1.1."
Hmm, this certainly looks like a new version of XML to me. The BNF has changed and previously malformed documents are suddenly well-formed. Existing parsers cannot handle the syntax defined by this draft. XML 1.1 has failed so now the W3C is trying to rewrite history and pretend that this is what they meant all along. (If that were true, why did we waste so much time on XML 1.1?) Apparently stability of standards is no longer a virtue at the W3C. This proposed edit is unnecessary and actively harmful to the community. It should be rejected.
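To show what "the BNF has changed" means in practice, here is a rough Python sketch of the relaxed name rule. The code point ranges are the NameStartChar and NameChar productions as I know them from XML 1.1, which the fifth edition draft adopts; check them against the spec before relying on them. The is_name function is my own hypothetical helper, not part of any parser.

```python
# NameStartChar ranges from XML 1.1, as adopted by the fifth edition draft
NAME_START = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Extra characters allowed after the first: "-", ".", digits, middle dot,
# combining marks, and two tie characters
NAME_EXTRA = [(0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040)]


def _in_ranges(ch: str, ranges) -> bool:
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name(s: str) -> bool:
    """True if s matches the relaxed Name production sketched above."""
    if not s or not _in_ranges(s[0], NAME_START):
        return False
    return all(_in_ranges(c, NAME_START) or _in_ranges(c, NAME_EXTRA) for c in s[1:])


# Ethiopic letters (U+1200, U+1201) postdate Unicode 2.0, so as far as I can
# tell they were not legal in names under the earlier editions' explicit lists.
ethiopic_ok = is_name("\u1200\u1201")
```

A document using an element name like that is accepted under the new production, while a parser written against the fourth edition should report it as malformed. That is exactly the incompatibility I'm complaining about.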
Code Synthesis has released XSD 3.1.0, a free-as-in-speech (GPL) C++ W3C XML Schema to C++ data binding library. New features in this release include support for xsi:type and substitution groups.
The Mozilla Project has posted version 0.8.4 of its XForms extension for Firefox. Mozilla XForms support has been developed by IBM, Novell, and independent contributors. It's not a complete XForms 1.0 or 1.1 implementation yet, but it's getting there.
Another day, another WordPress security bug. Matt Mullenweg has released WordPress 2.3.3, an open source (GPL) blog engine based on PHP and MySQL. "If you have registration enabled a flaw was found in the XML-RPC implementation such that a specially crafted request would allow a user to edit posts of other users on that blog. In addition to fixing this security flaw, 2.3.3 fixes a few minor bugs." All users should upgrade.
The W3C Semantic Web Deployment Working Group has published a new draft of Best Practice Recipes for Publishing RDF Vocabularies. "This document describes best practice recipes for publishing vocabularies or ontologies on the Web (in RDF Schema or OWL). The features of each recipe are described in details, so that vocabulary designers may choose the recipe best suited to their needs. Each recipe introduces general principles and an example configuration for use with an Apache HTTP server (which may be adapted to other environments). The recipes are all designed to be consistent with the architecture of the Web as currently specified." This contains six recipes:
The W3C XML Core Working Group has published the proposed recommendation Canonical XML 1.1. This attempts to address some of the weirdnesses of Canonical XML, such as the movement of xml:id attributes from one element to another and breaking of base URLs when canonicalizing.
According to XiTi Monitor, Internet Explorer's share of the browser market has dropped to 66.1% and Firefox has risen to 28%. Opera and Safari trail behind with 3.3% and 2% respectively. Firefox seems to be catching on faster in Europe than the U.S. and Asia.
However one has to be a little skeptical of these numbers since XiTi doesn't seem to provide any indication of the uncertainty in their figures. It's hard to believe they can really be accurate to ±0.1% in all these different countries.
The W3C Synchronized Multimedia Working Group has published what is both the first public and last call working draft of SMIL Timesheets 1.0. "This document defines an XML timing language that makes SMIL 3.0 element and attribute timing control available to a wide range of other XML languages. This language allows SMIL timing to be integrated into a wide variety of a-temporal languages, even when several such languages are combined in a compound document. Because of its similarity with external style and positioning descriptions in the Cascading Style Sheet (CSS) language, this functionality has been termed SMIL Timesheets."
This was formerly part of the SMIL 3.0 spec, so making the same document both first and last call draft is not as strange as it seems. Comments are due by February 15.
RDF, OWL, SPARQL, and now SKOS, the Simple Knowledge Organization System Reference. "Using SKOS, conceptual resources can be identified using URIs, labeled with lexical strings in one or more natural languages, documented with various types of note, linked to each other and organized into informal hierarchies and association networks, aggregated into concept schemes, and mapped to conceptual resources in other schemes. In addition, labels can be related to each other, and conceptual resources can be grouped into labeled and/or ordered collections." How many of these things do we need before the Semantic Web is here? I think Clay Shirky was right: it really is turtles all the way up. The Semantic Web is like an undergraduate paper: never really completed, just abandoned at the point of exhaustion.
The W3C has posted the first three working drafts covering OWL 1.1:
The W3C RDF Data Access Working Group has published the finished recommendations of SPARQL Query Results XML Format, SPARQL Protocol for RDF, and SPARQL Query Language for RDF. According to the latter, "RDF is a directed, labeled graph data format for representing information in the Web. This specification defines the syntax and semantics of the SPARQL query language for RDF. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results of SPARQL queries can be results sets or RDF graphs."
The W3C HTML Working Group has posted the first public working draft of their version of HTML 5. I haven't had time to do a side-by-side compare, but at first glance this seems to be essentially the same as the current work in the WhatWG. Perhaps there's a little less focus on parsing models. There's also a nice summary of the differences from HTML 4. I noticed for the first time that the acronym element (which I actually use on this site) has been removed.
The W3C Synchronized Multimedia Working Group has posted the candidate recommendation of the Synchronized Multimedia Integration Language 3.0 (SMIL 3.0). SMIL 3.0 has four goals:
Michael Kay has released version 9.0.0.3 of Saxon, his XSLT 2.0 and XQuery processor for Java and .NET. This is a bug fix release, a "Maintenance Release clearing all known bugs up to 18 Jan 2008."
Saxon is published in two versions, both of which require Java 1.4 or later (or .NET). Saxon 9.0B is an open source product published under the Mozilla Public License 1.0 that "implements the 'basic' conformance level for XSLT 2.0 and XQuery." Saxon 9.0 SA is a £250.00 payware version that "allows stylesheets and queries to import an XML Schema, to validate input and output trees against a schema, and to select elements and attributes based on their schema-defined type. Saxon-SA also incorporates a free-standing XML Schema validator. In addition Saxon-SA incorporates some advanced extensions not available in the Saxon-B product. These include a try/catch capability for catching dynamic errors, improved error diagnostics, support for higher-order functions, and additional facilities in XQuery including support for grouping, advanced regular expression analysis, and formatting of dates and numbers."
Henry S. Thompson and Richard Tobin have released XSV 3.1, a partial W3C XML Schema Validator for Linux and Windows. There's also a web form based interface. This is a bug fix release. XSV is published under the GNU General Public License.
Microsoft has released Mac Office 2008 with support for the Office Open XML document format. Other new features include a simplified toolbar; a toolbox for managing formatting, clip art, research, and bibliography palettes; and a publishing layout. However Visual Basic for Applications has been removed in this release. Anyone who needs that will have to stick with the previous release or use Windows. :-( Mac Office 2008 is about $131 depending on where you buy it.
Daniel Veillard has released version 2.6.31 of libxml2, the open source XML C library for Gnome. This release fixes assorted bugs including a serious security issue in UTF-8 handling. All users should upgrade.
XMLMind has released Qizx/db 2.0, a $3200 closed source, embeddable native XML database engine written in Java that supports XQuery 1.0.
The Free Software Foundation has released GNU IceCat (a.k.a. Gnuzilla), the GNU version of the Mozilla Firefox web browser.
While the basic Mozilla Firefox source code is free software, and we thank them for their significant contributions to the community, some non-free files are distributed in the Firefox source tree, and Firefox can recommend non-free plugins. IceCat is entirely free.
In addition, GNU IceCat includes some privacy protection features:
Some sites refer to zero-size images on other hosts to keep track of cookies. When IceCat detects this mechanism it blocks cookies from the site hosting the zero-length image file. (It is possible to re-enable such a site by removing it from the blocked hosts list.)
Other sites rewrite the host name in links redirecting the user to another site, mainly to "spy" on clicks. When this behavior is detected, IceCat shows a message alerting the user.
The W3C XHTML 2 working group has posted the first public working draft of XHTML Access Module: Module to enable generic document accessibility. This module defines access, an empty element that can carry activate, key, targetid, and targetrole attributes.
- The activate attribute indicates whether a target element should be activated or not once it obtains focus.
- The key attribute assigns a key mapping to an access shortcut. Triggering an access key defined in an access element changes focus to the next element in navigation order from the current focus that has one of the referenced role or id values.
- The targetid attribute specifies one or more IDREFs related to target elements for the associated event.
- The targetrole attribute specifies a space separated list of CURIEs that maps to an element with a role attribute with the same value.
The W3C Ubiquitous Web Applications Working Group has posted the first working draft of Delivery Context Ontology:
The Delivery Context Ontology provides a formal model of the characteristics of the environment in which devices interact with the Web. The delivery context includes the characteristics of the device, the software used to access the Web and the network providing the connection among others.
The delivery context is an important source of information that can be used to adapt materials from the Web to make them useable on a wide range of different devices with different capabilities.
The ontology is formally specified in the Web Ontology Language [OWL]. This document describes the ontology and gives details of each property that it contains.
The XML Apache Project has released Batik 1.7, an open source SVG display engine based on Java 2D. New features in 1.7 include xml:id, data URIs, DOM Level 3 ElementTraversal, an improved WMF transcoder, and a few SVG 1.2 features including handler elements.
DataDirect Technologies has released XML Converters 3.1, Java and .NET components that provide XML access (SAX, StAX, and DOM) to non-XML files including EDI, flat files and other legacy formats. Version 3.1 adds support for Standard Exchange Format (SEF) and Health Level Seven (HL7), "the health industry's standard for the exchange, management and integration of healthcare information to support patient care. DataDirect XML Converters version 3.1 for both Java and .NET include an implementation of the HL7 standard from the draft 2.1 to the current 2.5 release, across all messages and events". Pricing is roughly $1000 per format converted.
NewsGator has released version 3.1 of NetNewsWire, a closed source feed reader for the Mac. The big change in this release is that it's now free-as-in-beer. They've apparently discovered that selling customers' private reading details is a more profitable enterprise than selling software so they want to get it into as many hands as possible. Personally I prefer network clients that don't send my subscriptions and reading lists back to the mother ship.
Altsoft N.V. has released Xml2PDF 2007 1.2, a payware Windows program for converting XSL-FO, SVG, WordML, and XHTML documents into PDF files. New features in this release include Custom XML in Word 2007 source and a COM interface.
The DBIS Group at University of Konstanz has released BaseX 4.0, an open source native XML database with a GUI frontend that supports most of XQuery 1.0 and some of XQuery Full-Text. It seems to be written in Java so one presumes it's platform independent.
The Mac Mini that hosts xom.nu, The Cafes, and Mokka mit Schlag seems to have died. I'll bring them back as soon as I can, but it may take a couple of days.
John Cowan has released TagSoup 1.2, an open source, Java-language, SAX parser for nasty, ugly HTML. Version 1.2 changes the license to Apache 2.0. In addition,
Andy Clark has posted version 0.9.6 of his CyberNeko Tools HTML Parser for the Xerces Native Interface (NekoXNI). CyberNeko is written in Java. Besides the HTML parser, CyberNeko includes a generic XML pull parser, a DTD parser, a RELAX NG validator, and a DTD to XML converter. According to Clark
the implementation was updated to be compatible with the newest version of Xerces and the latest XNI API changes. And a number of outstanding bugs were fixed.
The only change that could affect users is that the minimum Java version required to run NekoHTML was increased to Java 1.3.
The W3C has published the first working draft of Cool URIs for the Semantic Web:
The Semantic Web is envisioned as a decentralised world-wide information space for sharing machine-readable data with a minimum of integration costs. Its two core challenges are the distributed modelling of the world with a shared data model, and the infrastructure where data and schemas can be published, found and used. A basic question is thus how to publish information about resources in a way that allows interested users and software applications to find them.
On the Semantic Web, all information has to be expressed as statements about resources, like the members of the company Example.com are Alice and Bob or Bob's telephone number is "+1 555 262 or this Web page was created by Alice. Resources are identified by Uniform Resource Identifiers (URIs) [RFC3986]. This modelling approach is at the heart of Resource Description Framework (RDF) [RDFPrimer].
Using RDF, the statements can be published on the website of the company. Others can read the data and publish their own information, linking to existing resources. This forms a distributed model of the world.
At the same time, Web documents have always been addressed with URIs (in common parlance often referred as Uniform Resource Locators, URLs). This is useful because it means we can easily make RDF statements about Web pages, but also dangerous because we can easily mix up Web pages and the things, or resources, described on the page.
So the question is, what URIs should we use in RDF? As an example, to identify the frontpage of the Web site of Example Inc., we may use http://www.example.com/. But what URI identifies the company as an organisation, not a Web site? Do we have to serve any content—HTML pages, RDF files—at those URIs? In this document we will answer these questions according to relevant specifications. We explain how to use URIs for things that are not Web pages, such as people, products, places, ideas and concepts such as ontology classes. We give detailed examples how the Semantic Web can (and should) be realised as a part of the Web.
The OpenOffice Project has released OpenOffice 2.3.1, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML. "This is a minor bug fix release with no new features for users. However, as this release also fixes a security vulnerability with database files, we recommend all affected users should upgrade to this release."
The Efficient XML Interchange effort (which is in fact none of those things) continues to roll along. The working group has now published three new working drafts:
The best practice I can suggest for this is to ignore it. However the primer is probably the right place to start if you can't. Looking in the primer I almost immediately found yet another way in which EXI is not an alternative encoding of the XML infoset: it does not guarantee preservation of namespace prefixes. While one would have wished that namespace prefixes were insignificant, that ship sailed long ago. The fact is, any XML processing has to preserve namespace prefixes faithfully or everything from DTDs to XSLT breaks. I don't think XML's namespace syntax is ideal by any means, but if you don't follow it you can't reliably claim to be representing XML.
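Here is a small demonstration of why prefixes matter, using Python's standard ElementTree rather than EXI itself, so take it as an analogy and not a claim about what any EXI implementation does. ElementTree is namespace-aware for element and attribute names, but it treats attribute values as opaque strings. The http://example.com/types namespace and the my:Celsius type name are made up for the example.

```python
import xml.etree.ElementTree as ET

# An xsi:type attribute whose *value* is a QName that depends on the "my" prefix.
# The namespace URI and type name are hypothetical.
SRC = (
    '<value xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:my="http://example.com/types" xsi:type="my:Celsius">21</value>'
)

root = ET.fromstring(SRC)
out = ET.tostring(root, encoding="unicode")

# The prefix survives inside the attribute value, because it is just text...
prefix_kept_in_value = "my:Celsius" in out
# ...but the declaration that gave the prefix meaning is not written back out,
# since no element or attribute *name* uses it.
declaration_kept = "xmlns:my=" in out
```

After the round trip the value still says my:Celsius but nothing binds my to a namespace any more, so a schema-aware consumer can no longer resolve the type. XPath expressions in XSLT stylesheets and QNames in schema documents fail the same way whenever a tool treats prefixes as disposable.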