Jonathan Robie writes:
> Now that we are celebrating the fifth anniversary of XML, it may be
> time to re-read one of the seminal papers on XML, " .
> XML was always supposed to work well both for structured documents and
> serializing structured data from various sources. Any view of XML that
> doesn't allow for both, and despairs that other people may be using XML
> differently than you are, misses the original vision.
Jonathan is exactly right: this is the source of the recurring
"simplified XML" announcements, which have become as predictable as
"imminent death of the Internet predicted". By "simplified XML",
people really mean "specialized XML" (for data, documents, or whatever
they happen to want to do).
It is true that there is some useless cruft in XML that was included
only for political reasons: public identifiers, notations and external
entities serve no function that MIME types (or URIs -- sorry, Simon)
and URLs couldn't serve. Still, we had to keep them in XML as part of
an unwritten ceasefire agreement with the SGML old guard (*), which
was still powerful at the time and could have seriously hindered
acceptance of XML both inside and outside the W3C. The other part of
that ceasefire was to pretend that XML and SGML would coexist, with
XML for the lightweight Web and SGML for so-called "serious enterprise
applications" (the vendors put paid to that idea by abandoning SGML so
fast that we couldn't keep up with the press releases).
Aside from those minor problems, though, the main difficulty with XML
is precisely what Jonathan says -- it is meant for both documents and
data, and as a result, is optimized for neither. What we need is not
a new, simpler XML, but separate, more specialized layers on top of
XML for data and documents (**). RDF, Topic Maps, and the SOAP
serialization format all represent candidates for a standard data
layer; unofficially, XHTML and DocBook are often used as the bases for
building more specialized documentation formats. Maybe we need
something simpler than any of that to start with.
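To make the document/data split concrete, here is a minimal Python
sketch using the standard library's xml.etree.ElementTree. The two XML
fragments and the helper names as_record and as_text are hypothetical,
chosen only for illustration; this is not one of the layers named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-oriented fragment: every element holds either child
# elements or a single text value, so it maps directly onto a dictionary.
DATA_XML = "<person><name>Ada</name><age>36</age></person>"

# Hypothetical document-oriented fragment: text and inline markup are
# interleaved (mixed content), so the order of text and elements matters.
DOC_XML = "<para>XML serves <em>both</em> documents and data.</para>"


def as_record(xml_text):
    """Read a flat, data-style element into a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def as_text(xml_text):
    """Read a document-style element as running text, ignoring the tags."""
    root = ET.fromstring(xml_text)
    return "".join(root.itertext())


record = as_record(DATA_XML)  # fits the data view cleanly
text = as_text(DOC_XML)       # fits the document view cleanly
# Applying the data view to the document fragment keeps only the <em>
# child and silently drops the words around it.
lossy = as_record(DOC_XML)
```

The same parser handles both fragments, but each needs its own reading
convention on top; neither convention works well for the other kind of
content.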
All the best,
(*) Yes, I was initially part of that SGML old guard.
(**) The "document/data" dichotomy is inexact -- "human-targeted
information" and "machine-targeted information" would be more
accurate -- but everyone seems to know what it means, so I'm sticking
with it.
David Megginson, email@example.com, http://www.megginson.com/