OASIS Mailing List Archives


XML 2 so far

Here are my notes on feature requests so far for an XML 2.0.

After the items, a short discussion.


(1) allow leading whitespace before the XML declaration
    <?xml ...?>

    Why: It happens often by mistake, or as a result of copy and
    paste, and results in the declaration becoming interpreted as a
    processing instruction, or in a possibly-cryptic error message.
    This seems not to be controversial.

(2) character set
    require the use of UTF-8, or of UTF-8 and UTF-16, and forbid others.
    Not complete consensus here.

(3) document type declaration - external DTD
    Remove external DTDs.
    Not complete consensus on what to do with entities.

(4) internal subset (e.g. elements and entities declared in DTD-style
    notation at the start of a document)
    I don't see consensus here.  People do want a way to define
    "macros" or something similar that can appear in attribute values as
    well as in elements, and XInclude can't do that.

(5) multiple root elements
    Allow multiple root elements in a document.
    Why? Because people want it. There's no technical need.
    On the other hand, it may break existing APIs and tools.
    Seems to be weak consensus on doing this one.

(6) Lax syntax and error recovery
    There's strong demand to allow processors to do error recovery,
    from some user communities.  This mostly seems to me to be
    Web browser programmers who deal with faulty RSS a lot; on the
    other hand, e.g. SOAP people would fight hard to keep this out
    (and it's certainly not a feature of JavaScript or JSON either).
    Not clear consensus here.

(7) Minimization
    This overlaps with No. 6, lax syntax.  Many people want to use
    a terser syntax, or have it as an option.  There is not (yet)
    strong consensus on what that should be.  Some people want
    <e>....</> or <e/..../ as per SGML. But there is not strong
    support for the exact SGML OMITTAG rules I think (which are
    complex and require a DTD).

    Neither is there support for DATATAG or the other SGML features
    exactly, but there do seem to be people who want some sort of
    terser markup.

    There has even been a LISP-like syntax suggested.
    The counter-arguments are usually simplicity and robustness.
    No consensus yet.

(8) Attributes
    Allowing multiple attributes with the same name, or allowing
    markup inside attribute values, or allowing an equivalence
      <e a="v"... and <e><a>v</a>...
    in some way, have all been requested, but without any agreement on
    the details. Most changes to attributes would require changes to
    XPath, XSLT, XPointer, XQuery, XSD, RELAX NG, Schematron, etc.,
    although it seems likely any XML 2.0 would necessitate changes.
    Possibly a reserved <xml:attributes>...</xml:attributes> wrapper
    would alleviate some of the difficulties (but cause others).

(9) comments
    discussion on <xml:comment>...</xml:comment> with no real
    agreement emerging.  Issues include whether the content of
    comments would need to be well-formed, and whether you could
    nest this sort of comment (not if the contents aren't well-formed).

    The <!-- [not --]* --> syntax is unpopular.
    (whether my proposal of <!--*.... *-->, which would have allowed --
    within a comment, could have been accepted is history; it was
    taken out of the draft before there was the possibility of changing
    SGML itself... at which point we could have had <(...)> or

(10) graphs vs documents
    A long thread on this, with nothing clear emerging (I think).
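
For anyone who wants to see how a current XML 1.0 parser treats the
cases in items 1, 5, 8 and 9, here is a minimal sketch in Python. It
assumes the standard library's expat-based xml.etree.ElementTree; the
sample strings and the check() helper are my own illustrations, not
part of any proposal, and the exact error messages vary by parser.

```python
import xml.etree.ElementTree as ET

# Illustrative inputs only; each one is rejected by XML 1.0 today.
SAMPLES = {
    "leading whitespace before the declaration (item 1)":
        ' <?xml version="1.0"?><doc/>',
    "two root elements (item 5)":
        "<a/><b/>",
    "duplicate attribute name (item 8)":
        '<e a="1" a="2"/>',
    "double hyphen inside a comment (item 9)":
        "<doc><!-- a -- b --></doc>",
}


def check(text):
    """Return None if text parses as XML 1.0, else the parser's message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None


if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(f"{label}: {check(text) or 'accepted'}")
```

Each sample is reported as an error rather than accepted, which is the
behaviour the requests above would change.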

What did I miss?

The biggest missing question I see is, who are the stakeholders?  What
is the business case here?

The XML 1 world will continue, with (literally) billions of dollars
invested in the current infrastructure.  Obviously if you're generating
a UPnP message for a television, you've got to send XML 1.0 messages.
Similarly in air traffic control, or between a car engine and the
diagnostic tool.

The XML geeks (myself included) would love to smooth some corners off,
but that doesn't translate into an interchange story.

When we started XML (as "Web SGML") we had the idea that every Web SGML
document would be an SGML document - that was very important to us.

So, we couldn't change the syntax beyond the limits allowed by SGML's
SGML Declaration.  Making the DTD optional was very controversial,
not least because it meant that SGML tools needed to be modified in
order to read XML, even though there was a sense in which the
documents were still legal SGML.

People have talked about every well-formed XML 2 document also being
an XML 1 document, or (more controversially) about having a well-defined

Probably there would have to be an XML 1.0 representation of XML 2
documents, rather like IPv6 network packets tunnelling through IPv4.
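
As a rough sketch of what such tunnelling might look like for just one
feature (multiple root elements, item 5), the following Python wraps
the roots in a synthetic element so that an XML 1.0 parser accepts the
result. The wrapper name "xml2-fragments" and both function names are
hypothetical, and the sketch assumes the input carries no XML
declaration or document type declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element name, chosen only for this illustration.
WRAPPER = "xml2-fragments"


def tunnel(multi_root_text):
    """Wrap multi-root text in one synthetic root element.

    Assumes the input has no XML declaration or DOCTYPE, since neither
    may appear inside an element in XML 1.0.
    """
    return f"<{WRAPPER}>{multi_root_text}</{WRAPPER}>"


def unwrap(xml1_text):
    """Parse the tunnelled form and return the original root elements."""
    root = ET.fromstring(xml1_text)
    if root.tag != WRAPPER:
        raise ValueError("not a tunnelled multi-root document")
    return list(root)


if __name__ == "__main__":
    carried = tunnel("<a/><b>text</b>")
    print(carried)
    print([child.tag for child in unwrap(carried)])
```

A real mapping would also have to cover whatever else XML 2 changed,
which is where the difficulty lies.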

But it's still not clear who would benefit from XML 2.0 (I'm not
saying no-one would benefit, only that it's not yet clear who).

In addition, there are a number of significant XML user communities
that we haven't heard say much in this discussion -- e.g. representing
Web Services or embedded XML processing.


Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org irc.freenode.org


Copyright 1993-2007 XML.org. This site is hosted by OASIS