XML 2.0
On a daily basis, I happily use a subset of XML that is quite simple, notwithstanding a few minor issues with namespaces and qualified names that I can live with. This includes using XML for XHTML, XForms, XSLT, XPL, RSS, DocBook, custom document formats, and whatnot.
But while writing a short introduction to XML for the upcoming book about Web 2.0 that I am co-authoring with Alex Vernet, Eric van der Vlist, Danny Ayers and Joe Fawcett, it struck me again how complex XML actually is. Go back to the XML specification, and you realize that XML covers such monstrosities as (gasp) DTDs, entities and (an arguable gasp here) processing instructions. These features alone seem to take up more than half of the XML specification (I haven’t counted the lines). Together they make up a tricky mess of scary syntax, difficult concepts and processing hell. The one question that pops up when you are done skimming through the spec again is: why?
We actually know why: XML has history. For example, it made sense to migrate SGML DTDs to XML back in 1996-1998, when there was nothing else around. Nowadays people tend to use XML Schema, for all its shortcomings, or the more user-friendly Relax NG and Schematron. When it comes to validation, there is nothing that DTDs offer that these newer languages don’t.
The same goes for entities. Today they are used for various purposes, including referring to characters by name and including content from other files. The former use certainly needs a new solution (some have already been proposed), and the latter is solved with the cleaner approach of XInclude.
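To make both points concrete, here is a minimal Python sketch using only the standard library. The file name `chapter1.xml`, the `DOCUMENTS` dictionary, and the `memory_loader`, `expand` and `parses_without_dtd` helpers are made up for this example; a real setup would load files from disk. It illustrates that a named entity such as `&eacute;` is rejected when no DTD declares it, that the numeric character reference `&#233;` needs no declaration at all, and that `xi:include` pulls in another document without any entity declaration.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" so the example is self-contained.
DOCUMENTS = {
    # &#233; is a numeric character reference: no DTD needed.
    "chapter1.xml": "<chapter><title>Caf&#233;</title></chapter>",
}

# The including document uses XInclude instead of an external entity.
BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
</book>"""


def memory_loader(href, parse, encoding=None):
    """Loader for ElementInclude that reads from DOCUMENTS instead of disk."""
    text = DOCUMENTS[href]
    return ET.fromstring(text) if parse == "xml" else text


def expand(source):
    """Parse source and replace each xi:include with the referenced document."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=memory_loader)
    return root


def parses_without_dtd(source):
    """Return True if source parses as well-formed XML with no DTD supplied."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(ET.tostring(expand(BOOK), encoding="unicode"))
    print(parses_without_dtd("<p>Caf&eacute;</p>"))  # named entity, no DTD
    print(parses_without_dtd("<p>Caf&#233;</p>"))    # character reference
```

Numeric character references are of course less readable than names, which is why the first use of entities still calls for a better solution.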
Finally, processing instructions are a contentious subject. I for one can live with them, but I wouldn’t complain if they disappeared in an upcoming revision of XML.
What I have realized is that these issues are no revelation in 2006. Below are links to several related articles and proposals, the earliest dating back to 2002:
- Tim Bray’s 2002 XML-SW proposal.
- Kendall Grant Clark’s 2002 article, XML 2.0 -- Can We Get There From Here?
- Norm Walsh’s 2004 XML 2.0 article, which also points to more resources.
The list of issues raised in those articles and proposals goes well beyond the two or three I tackle above, but DTDs and entities keep coming up.
So what do you do until a hypothetical XML 2.0 sees the light of day? It’s fairly simple: just make sure you don’t unnecessarily carry the burden of legacy and compatibility features such as DTDs and entities, and stick with the features of XML that are simpler to understand.
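One way to keep yourself honest is a small check in your build or test suite. The following is only a sketch, assuming Python and the expat bindings that ship with it; the function name `legacy_features` and the labels it returns are invented for this example. Processing instructions are reported too, although whether to flag them is a matter of taste.

```python
from xml.parsers import expat


def legacy_features(xml_text):
    """Return the set of legacy XML features found in xml_text.

    Raises expat.ExpatError if the document is not well-formed.
    """
    found = set()
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once when a <!DOCTYPE ...> declaration is encountered.
        found.add("doctype")

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Called for every <!ENTITY ...> declaration.
        found.add("entity declaration")

    def on_pi(target, data):
        # Called for <?target data?>; the XML declaration is not a PI.
        found.add("processing instruction")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.EntityDeclHandler = on_entity_decl
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(xml_text, True)
    return found


if __name__ == "__main__":
    legacy = ('<!DOCTYPE note [<!ENTITY who "World">]>'
              '<note><?render fast?>Hello &who;</note>')
    print(sorted(legacy_features(legacy)))
    print(sorted(legacy_features("<note>Hello World</note>")))
```

A document that comes back with an empty set stays within the simpler subset of XML described above.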
NOTE: There is currently no such thing as “XML 2.0”. The term is just a placeholder for a hypothetical future version of XML that would improve and simplify XML 1.0 and XML 1.1.

