XML.org
OASIS Mailing List Archives

 


Re: [xml-dev] It's too late to improve XML ... lessons learned?

<snip>
XML has been very resistant to this kind of evolution. We've seen XML parsers that don't support some features in the standard (notably DTDs), but we've seen little tendency to support extensions (such as relaxing the rules on element and attribute names, or allowing the outermost element to be omitted). Perhaps this is because there are so many parsers in use and because they aren't easy to change. It certainly means that XML is a much stronger interoperability standard than some others; but it does lead to a certain amount of frustration because everyone can see that some of the rules (like disallowing nested comments) are just plain silly.
</snip>
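
To make the strictness described above concrete, here is a minimal sketch that feeds a few of the "relaxed" inputs mentioned in the quote to Python's standard-library parser. The sample documents and the helper name `is_well_formed` are illustrative assumptions, not anything from the original thread; the point is only that a conforming parser refuses all of these extensions.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the standard-library parser accepts the text as XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule mentioned in the quoted message.
samples = {
    "single root": "<doc><a/><b/></doc>",
    "no outermost element": "<a/><b/>",
    "nested comment": "<doc><!-- outer <!-- inner --> --></doc>",
    "name starting with a digit": "<1st/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first sample should be accepted; the other three trip the well-formedness rules on the document element, on `--` inside comments, and on name start characters.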

Part of this lack of evolution came about because parsers are very much tied into the language frameworks in which they're written, and once a language framework establishes a default implementation, most developers in that language cease looking elsewhere for potential upgrades, preferring instead to work with obsolete packages. Xalan was the default Java implementation, and it never moved beyond XSLT 1.0. Switching to a modern XSLT implementation takes one line of code, but the number of people who know to change that one line is disturbingly small.

I'd also contend that the outright hostility of the HTML5 group towards XML in any form in the browser - the most obvious point where evolution could occur - did not help matters much, to the extent that they bent over backward trying to come up with a non-namespace namespace solution for web components rather than admit that XML did get something right. It should be a fairly trivial thing to implement a streamable XML parser in node.js (a quick Google search lists ten: https://openbase.com/categories/js/best-nodejs-xml-parser-libraries).
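
As a rough illustration of what "streamable" means here, the sketch below uses Python's standard-library `XMLPullParser` rather than a node.js library (a deliberate substitution, so that it runs without third-party packages). The function name `stream_items`, the `item` tag, and the chunk boundaries are all assumptions made for the example. The input arrives in arbitrary pieces, and elements are handed to the caller as soon as the parser has seen their end tags, without building the whole document first.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Iterator


def stream_items(chunks: Iterable[str], tag: str) -> Iterator[dict]:
    """Yield one dict per completed <tag> element while input is still arriving."""
    parser = ET.XMLPullParser(events=("end",))

    def drain() -> Iterator[dict]:
        # Hand over every element whose end tag the parser has reported so far.
        for _event, elem in parser.read_events():
            if elem.tag == tag:
                yield {"id": elem.get("id"), "text": elem.text}
                elem.clear()  # drop the element's content once it has been consumed

    for chunk in chunks:
        parser.feed(chunk)
        yield from drain()

    # Closing may release events the parser was still holding back.
    parser.close()
    yield from drain()


# Hypothetical input, deliberately split in the middle of tags and text.
chunks = ['<feed><item id="1">fir', 'st</item><item id', '="2">second</item>', '</feed>']
items = list(stream_items(chunks, "item"))
print(items)
```

The same shape (feed a chunk, drain the completed events, repeat) is what the event-based node.js parsers expose through callbacks or streams.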

Personally, I think it may be time to take another stab at an XML 2.0 stack, with amendments for XPath, XQuery, and XSLT that are already written to acknowledge streaming, starting with acknowledging a "canon" JavaScript parser, then moving out to Python and then Java. JSON is beginning to creak under its own limitations, and a new generation of developers and data scientists may well be more amenable to something that captures the best of both. JSON is piss-poor at capturing narrative structures, XML's namespaces are unwieldy, streaming is a must, and identifiers ... (don't get me started on identifiers).

Kurt Cagle
Community/Managing Editor
Data Science Central, A TechTarget Property
443-837-8725


On Wed, Jan 5, 2022 at 3:36 AM Michael Kay <mike@saxonica.com> wrote:
>It seems that many Big Data systems read in a dialect of JSON, with one JSON "file" per line. 

Indeed, that's a popular format, and it's an example of how standards can evolve through community evolution. Perhaps it will become popular enough that someone writes an RFC for it and allocates it a media type.
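
For readers who have not met the format, this community convention is often called JSON Lines or NDJSON. A minimal reader might look like the sketch below; the function name `read_json_lines`, the sample records, and the choice to skip blank lines are assumptions for illustration, not part of any formal specification cited in this thread.

```python
import io
import json
from typing import Any, Iterator, TextIO


def read_json_lines(stream: TextIO) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of the stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are tolerated rather than rejected
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report which line failed so one bad record can be located.
            raise ValueError(f"line {line_number}: {exc.msg}") from exc


# Hypothetical sample: each line is a complete JSON value on its own.
sample = io.StringIO(
    '{"id": 1, "name": "alpha"}\n'
    '{"id": 2, "name": "beta"}\n'
    '[1, 2, 3]\n'
)
records = list(read_json_lines(sample))
print(records)
```

Because every line stands alone, a consumer can process the file record by record without holding the whole thing in memory, which is much of the appeal for large data sets.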

This kind of evolution has benefits and drawbacks. You often end up with a nice new capability that isn't universally supported (for example, use of non-Latin characters in email addresses), and it's a fine line whether the new capability is useful if not everyone supports it. At worst, you end up with a standard that's so woolly it's a nightmare (take CSV as an example).

XML has been very resistant to this kind of evolution. We've seen XML parsers that don't support some features in the standard (notably DTDs), but we've seen little tendency to support extensions (such as relaxing the rules on element and attribute names, or allowing the outermost element to be omitted). Perhaps this is because there are so many parsers in use and because they aren't easy to change. It certainly means that XML is a much stronger interoperability standard than some others; but it does lead to a certain amount of frustration because everyone can see that some of the rules (like disallowing nested comments) are just plain silly.

Michael Kay
Saxonica




Copyright 1993-2007 XML.org. This site is hosted by OASIS