OASIS Mailing List Archives


Re: [xml-dev] Lessons learned from the XML experiment

On 13 Nov 2013, at 10:50, Stephen Cameron <steve.cameron.62@gmail.com> wrote:

> Taking these issues in order, maybe I can ask a few questions?
> Whitespace: surely if default whitespace normalisation is an issue, then that can be solved by CDATA sections (which are used extensively by GMail I just noticed)?

The first issue with whitespace is that sometimes whitespace text nodes are significant and sometimes not, and there's no simple way of deciding which. So applications sometimes get presented with whitespace they don't want, leading to messiness like incorrect numbering of nodes, and sometimes they don't get presented with whitespace that they do want, leading to data corruption.
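
To illustrate the node-numbering point, here is a small sketch using Python's standard xml.dom.minidom. The two sample documents and the helper name child_summary are my own, chosen for illustration; the only difference between the documents is indentation.

```python
from xml.dom import minidom

# Two documents that most people would call "the same".
compact = "<list><item>a</item><item>b</item></list>"
indented = "<list>\n  <item>a</item>\n  <item>b</item>\n</list>"


def child_summary(xml_text):
    """Return the node names of the root element's children, in order."""
    root = minidom.parseString(xml_text).documentElement
    return [node.nodeName for node in root.childNodes]


print(child_summary(compact))   # ['item', 'item']
print(child_summary(indented))  # ['#text', 'item', '#text', 'item', '#text']
```

Without a DTD or schema the parser cannot know that the indentation is insignificant, so it reports it, and "the second child" means something different in each document.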

The second issue is whitespace normalization in attributes. This is mainly a problem because the type system for attributes is so weak (it would be better to have none at all). If strings, numbers, and booleans were clearly distinguished as in JSON (or FtanML) then more work could be done by the parser, and less would need to be done by applications.
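
A rough sketch of the contrast, using Python's standard xml.etree.ElementTree and json modules. The element and property names (point, x, visible, label) are invented for the example.

```python
import json
import xml.etree.ElementTree as ET

# The label attribute contains a literal newline in the source text.
elem = ET.fromstring('<point x="42" visible="true" label="two\nlines"/>')

# Every attribute arrives as a string, and the newline has been
# normalized to a space by the parser.
print(repr(elem.get("x")))        # '42'
print(repr(elem.get("visible")))  # 'true'
print(repr(elem.get("label")))    # 'two lines'

# The JSON parser distinguishes number, boolean and string,
# and the escaped newline survives.
data = json.loads('{"x": 42, "visible": true, "label": "two\\nlines"}')
print(repr(data["x"]))        # 42
print(repr(data["visible"]))  # True
print(repr(data["label"]))    # 'two\nlines'
```

In the XML case the application has to convert "42" and "true" itself, and it has no way to recover the newline.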

In both cases, applications have to deal with a problem that shouldn't exist, and information gets corrupted along the way.

> Namespaces: Without them XML wouldn't have a reason for being surely, I cannot understand this being considered a problem, other than perhaps browsers support them in different ways?

Namespaces account for a very significant chunk of user difficulties with XML, a great deal of the complexity of specifications like XSD and XSLT, a similar proportion of the complexity of APIs, and a vast amount of the code in implementations of these specs. And they aren't necessary! The world could have managed perfectly well with a convention where the element name <org.w3c.svg> means "in this subtree, I'm using SVG element names".

The problems with namespaces are:

(a) URIs are unwieldy, too unwieldy to use all the time, therefore prefixes were introduced

(b) Dealing with names that can't be represented as simple strings makes EVERYTHING more complicated (e.g. APIs)

(c) Prefixes make the meaning of XML fragments context-dependent, so there's lots of machinery (e.g. in XSLT) to carry context around

(d) There's no single universally-agreed definition of the data model (e.g. are redundant namespace declarations significant).
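
Points (b) and (c) can be seen in a few lines with Python's standard xml.etree.ElementTree. This is only a sketch: the prefixes and the helper name parses_alone are made up for the example, and ElementTree's "{uri}local" notation is just one API's answer to the problem of names that aren't simple strings.

```python
import xml.etree.ElementTree as ET

SVG = "http://www.w3.org/2000/svg"

# Three different spellings of what is meant to be the same element name.
a = ET.fromstring(f'<svg:rect xmlns:svg="{SVG}"/>')
b = ET.fromstring(f'<s:rect xmlns:s="{SVG}"/>')
c = ET.fromstring(f'<rect xmlns="{SVG}"/>')

# The API has to invent its own notation for a (URI, local name) pair.
print(a.tag)                    # {http://www.w3.org/2000/svg}rect
print(a.tag == b.tag == c.tag)  # True


def parses_alone(fragment):
    """Report whether a fragment can be parsed without outside context."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Cut out of its document, the prefixed fragment no longer means anything.
print(parses_alone("<svg:rect/>"))  # False
```

The prefix itself is discarded in this data model; other APIs keep it, which is part of the disagreement described in (d).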

> XSD: This is a big one, maybe time to have another W3C recommendation alongside XSD?

The alternatives exist, but they came too late to make a difference. XSD was too successful for us to either ignore it or replace it.

> DOM API: For the norms of the times probably a good effort. jQuery etc. has made that API somewhat redundant in the browser, so maybe just recognise jQuery as a DOM DSL (but adding XPath support would be cool). You have to admit that D3.js is quite marvelous (which uses the jQuery approach), now that we finally have SVG support cross browser.

Again, the alternatives exist, but the damage is done. Most newcomers to Java/XML use DOM because it's there and they think it's the standard, and we can't stop this, because we can't take stuff away and we can't rewrite the books that we wrote 15 years ago.

> But my main issue is this: If things can be improved then do so,

There's no doubt we could design something much better; the problem is transition. Most attempts to improve the situation in the last 10 years haven't caught on, for several reasons: you can't get rid of complexity by adding things, and you can't take users with you if you do something completely different. FtanML (presented at Balisage 2013) shows one attempt to create a better XML by throwing everything away and starting again, but people have too much investment in the current technology stack for that kind of approach to have a realistic chance of success.

What tends to happen in this situation is that a new technology comes along that's focussed on a different set of requirements, and it starts to eat away at some of the application space in areas where it offers benefits. Then of course people discover that the new stuff has limitations, so it starts becoming more complex in its own right. Hopefully though, as with the evolution of programming languages, there's a gradual and long-term learning of lessons.

Michael Kay
