XML 2 so far
- From: Liam R E Quin <liam@w3.org>
- To: xml-dev@lists.xml.org
- Date: Sun, 12 Dec 2010 20:42:50 -0500
Here are my notes on feature requests so far for an XML 2.0.
After the items, a short discussion.
Liam
(1) allow leading whitespace before the XML declaration
<?xml ...?>
Why: It happens often by mistake, or as a result of copy and
paste, and results in the declaration becoming interpreted as a
processing instruction, or in a possibly-cryptic error message.
This seems not to be controversial.
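To make the failure concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree (an expat-based XML 1.0 parser). The helper name parse_lenient is hypothetical; it only approximates the requested behaviour by stripping leading whitespace before handing the text to an unmodified parser.

```python
import xml.etree.ElementTree as ET


def parse_lenient(text: str) -> ET.Element:
    """Parse XML after dropping whitespace that precedes the XML declaration.

    Hypothetical helper: it emulates the proposed rule on top of a strict
    XML 1.0 parser rather than changing the parser itself.
    """
    return ET.fromstring(text.lstrip(" \t\r\n"))


# A document that picked up a newline and two spaces before the declaration.
doc = "\n  <?xml version='1.0'?><greeting>hello</greeting>"

try:
    ET.fromstring(doc)  # strict XML 1.0 parsing
    strict_ok = True
except ET.ParseError:
    strict_ok = False   # the declaration is not at the start of the entity

root = parse_lenient(doc)
```

Stripping in a wrapper like this is only safe for text that has already been decoded; a byte stream whose encoding is named in the declaration would need more care.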
(2) character set
Require the use of UTF-8, or of UTF-8 and UTF-16, and forbid others.
Not complete consensus here.
(3) document type declaration - external DTD
Remove external DTDs.
Not complete consensus on what to do with entities.
(4) internal subset (e.g. element and entities declared in DTD-style
notation at the start of a document)
I don't see consensus here. People do want a way to define
"macros" or something similar that can appear in attribute values as
well as in elements, and XInclude can't do that.
(5) multiple root elements
Allow multiple root elements in a document.
Why? Because people want it. There's no technical need.
On the other hand, it may break existing APIs and tools.
Seems to be weak consensus on doing this one.
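One common workaround today, sketched below with Python's standard xml.etree.ElementTree, is to wrap the fragment in a synthetic root before parsing. The function name parse_multi_root and the default wrapper name are assumptions for illustration, and the sketch assumes the input has no XML declaration or DOCTYPE (neither may appear inside an element).

```python
import xml.etree.ElementTree as ET


def parse_multi_root(text: str, wrapper: str = "wrapper") -> list:
    """Return the top-level elements of a multi-root fragment.

    Hypothetical helper: wraps the text in a synthetic root so that an
    XML 1.0 parser accepts it, then returns that root's children.
    """
    wrapped = ET.fromstring(f"<{wrapper}>{text}</{wrapper}>")
    return list(wrapped)


roots = parse_multi_root("<a>1</a><b>2</b>")
tags = [el.tag for el in roots]
```

The need for a wrapper at the API level hints at the compatibility cost: most existing tree APIs expose exactly one document element.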
(6) Lax syntax and error recovery
There's strong demand to allow processors to do error recovery,
from some user communities. This mostly seems to me to be
Web browser programmers who deal with faulty RSS a lot; on the
other hand, e.g. SOAP people would fight hard to keep this out
(and it's certainly not a feature of JavaScript or JSON either).
Not clear consensus here.
(7) Minimization
This overlaps with No. 6, lax syntax. Many people want to use
a terser syntax, or have it as an option. There is not (yet)
strong consensus on what that should be. Some people want
<e>....</> or <e/..../ as per SGML. But I think there is not strong
support for the exact SGML OMITTAG rules (which are
complex and require a DTD).
Neither is there support for DATATAG or the other SGML features
exactly, but there do seem to be people who want some sort of
terser markup.
There has even been a LISP-like syntax suggested.
The counter-arguments are usually simplicity and robustness.
Not yet consensus.
(8) Attributes
Allowing multiple attributes with the same name, or allowing
markup inside attribute values, or allowing an equivalence
between
<e a="v"... and <e><a>v</a>...
in some way, have all been requested, but without any agreement on the
details. Most changes to attributes would require changes to
XPath, XSLT, XPointer, XQuery, XSD, RELAX NG, Schematron, etc.,
although it seems likely any XML 2.0 would necessitate changes.
Possibly a reserved <xml:attributes>...</xml:attributes> wrapper
would alleviate some of the difficulties (but cause others).
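As a rough illustration of the attribute/element equivalence, the sketch below rewrites every attribute as a child element using Python's standard xml.etree.ElementTree. The function name attributes_to_children and the choice to place the new children first are assumptions; none of the proposals fixed these details.

```python
import xml.etree.ElementTree as ET


def attributes_to_children(elem: ET.Element) -> ET.Element:
    """Rewrite <e a="v"/> as <e><a>v</a></e>, recursively and in place.

    Hypothetical mapping: attribute-derived children are inserted before
    the existing children, in attribute order.
    """
    # Convert descendants first, so the new children are not revisited.
    for child in list(elem):
        attributes_to_children(child)
    for index, (name, value) in enumerate(list(elem.attrib.items())):
        new = ET.Element(name)
        new.text = value
        elem.insert(index, new)
    elem.attrib.clear()
    return elem


e = attributes_to_children(ET.fromstring('<e a="v" b="w"><c/></e>'))
child_tags = [child.tag for child in e]
```

Even this toy mapping loses information: after conversion nothing distinguishes a former attribute from an ordinary child element, which is one reason a reserved wrapper element was suggested.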
(9) comments
There was discussion of <xml:comment>...</xml:comment>, with no real
agreement emerging. Issues include whether the content of
comments would need to be well-formed, and whether you could
nest this sort of comment (not if the contents aren't well-formed).
The <!-- [not --]* --> syntax is unpopular.
(whether my proposal of <!--*.... *-->, which would have allowed --
within a comment, could have been accepted is history; it was
taken out of the draft before there was the possibility of changing
SGML itself... at which point we could have had <(...)> or
something)
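The restriction that makes the current comment syntax unpopular is easy to demonstrate. This is a small sketch with Python's standard xml.etree.ElementTree; the helper name is_well_formed is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Report whether a strict XML 1.0 parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A single hyphen inside a comment is fine; a double hyphen is not.
single = is_well_formed("<r><!-- a - b --></r>")
double = is_well_formed("<r><!-- a -- b --></r>")
```

Here single is accepted and double is rejected, so commenting out a block that itself contains "--" (for example a command-line option) requires editing the commented text.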
(10) graphs vs documents
A long thread on this, with nothing clear emerging (I think).
What did I miss?
The biggest missing question I see is, who are the stakeholders? What
is the business case here?
The XML 1 world will continue, with (literally) billions of dollars
invested in the current infrastructure. Obviously if you're generating
a UPNP message for a television, you've got to send XML 1.0 messages.
Similarly in air traffic control, or between an car engine and the
diagnostic tool.
The XML geeks (myself included) would love to smooth some corners off,
but that doesn't translate into an interchange story.
When we started XML (as "Web SGML") we had the idea that every Web SGML
document would be an SGML document - that was very important to us.
So, we couldn't change the syntax beyond the limits allowed by SGML's
SGML Declaration. Making the DTD optional was very controversial,
not least because it meant that SGML tools needed to be modified in
order to read XML, even though there was a sense in which the
documents were still legal SGML.
People have talked about every well-formed XML 2 document also being
an XML 1 document, or (more controversially) about having a well-defined
conversion.
Probably there would have to be an XML 1.0 representation of XML 2
documents, rather like IPv6 network packets tunnelling through IPv4.
But it's still not clear who would benefit from XML 2.0 (I'm not
saying no-one would benefit, only that it's not yet clear who).
In addition, there are a number of significant XML user communities
that we haven't heard say much in this discussion -- e.g. representing
Web Services or embedded XML processing.
Liam
--
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org irc.freenode.org