From: "Jonathan Robie" <email@example.com>
> At 08:35 AM 4/1/2003 -0800, Dare Obasanjo wrote:
> >In my experience faithful lexical round tripping is mainly important to
> >applications that act as editors. In such cases, the people requesting
> >such features in an API want even more requirements than XML 1.0 deems
> >necessary such as preserving attribute order and all whitespace.
> Yes, it's clearly helpful for that. I also know from XML databases I've
> worked with that people really do get upset if you change namespace
> prefixes when they import a document and export it again - but most editors
> and databases do seem to sacrifice some faithfulness in their lexical
One important application of XML is as source code.
Imagine if a programming editor opened your Perl/Python/C#/Java/C++/SQL
program, renamed your modules or classes or private methods or packages,
and threw away comments. You would undoubtedly spew, despite your
admirably easy-going nature.
> >If the Information Set says that there is no distinction between <foo
> >x="a"/> and <foo x='a'/>, why should I work hard to preserve the
Because syntactic sugar is vital for humans, and can help processing.
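As a rough illustration of how a fixed lexical form can help processing, here is a small Python sketch. The two patterns and the function names are my own assumptions for the example, not anything taken from a real tool. It compares a regex that relies on an agreed form (double quotes only, one space before the closing "/>") with one that has to tolerate either quote style and optional whitespace.

```python
import re

# Assumed canonical form: double quotes only, exactly one space before "/>".
CANONICAL = re.compile(r'<foo x="([^"]*)" />')

# Without that convention the pattern must allow either quote style,
# plus optional whitespace around "=" and before "/>".
GENERAL = re.compile(r'''<foo\s+x\s*=\s*(?:"([^"]*)"|'([^']*)')\s*/>''')


def values_canonical(text):
    """Return the x values from text already in the assumed canonical form."""
    return CANONICAL.findall(text)


def values_general(text):
    """Return the x values from text that may use either quote style."""
    # Each match is a (double_quoted, single_quoted) pair; one side is empty.
    return [dq or sq for dq, sq in GENERAL.findall(text)]
```

Even the longer pattern is only a sketch: it ignores other attributes, comments, CDATA sections and entity references, all of which a real parser has to handle.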
This thread grew out of a complaint that XML is too complex to process
with regexes. Yet if we canonicalize our data (say, by requiring that only <foo
x="a" /> is used) then the regular expressions simplify to something
much more usable. If other software messes up this canonical form,
then we have to re-canonicalize it. (Which suggests not that we should
work hard to preserve the distinction, but that if it is convenient we should