RE: Syntax Sugar and XML information models
- From: John Aldridge <email@example.com>
- To: xml-dev <firstname.lastname@example.org>
- Date: Thu, 29 Mar 2001 11:34:38 +0100
At 21:13 28/03/2001 -0500, Michael Champion wrote:
>So, does ANYBODY care about round-tripping a) the specific quote characters
>around attribute values, b) the order of attributes; c) character entity
>references for characters that are in the specified character set d) the two
>different syntaxes for empty elements, .... ?
Yes, but only sometimes. I _do_ mind if editors unpredictably change these
things, because I'm going to store XML data in RCS, and expect rcsdiff,
rcsmerge & the like to do sensible things.
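To make the problem concrete, here is a minimal sketch in Python (my choice of language, nothing in the thread prescribes it). The two strings are made-up sample documents that differ only in the four ways Michael listed; `tree_shape` is a hypothetical helper, and Python's `difflib` stands in for rcsdiff.

```python
import difflib
import xml.etree.ElementTree as ET

# Made-up sample: the same data saved by two hypothetical editors.
original = '<doc>\n  <item id="1" lang="en"/>\n  <note>caf&#233;</note>\n</doc>\n'
resaved = "<doc>\n  <item lang='en' id='1'></item>\n  <note>café</note>\n</doc>\n"

def tree_shape(elem):
    """Reduce a parsed element to plain data, ignoring attribute order."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        elem.text,
        elem.tail,
        [tree_shape(child) for child in elem],
    )

# The parser sees no difference between the two files...
same_information = tree_shape(ET.fromstring(original)) == tree_shape(ET.fromstring(resaved))

# ...but a line-based diff reports every line that carries syntax sugar.
diff = difflib.unified_diff(original.splitlines(), resaved.splitlines(), lineterm="")
changed_lines = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print(same_information)
print("\n".join(changed_lines))
```

Nothing a parser cares about has changed, yet the diff flags both content lines as removed and re-added.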
Actually, from this point of view, there are two workable solutions:
(a) Editors don't change syntax sugar, except when the user edits something
at or near the place in question, or
(b) Editors all write a _standard_ normal form (i.e. not just a normal form
of their own choosing)
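As a rough illustration of (b), the sketch below assumes the shared normal form is W3C Canonical XML as implemented by `xml.etree.ElementTree.canonicalize` (Python 3.8 or later). That is only one candidate for a standard normal form, picked here because it is readily available; the sample documents are made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample: the same data written with different syntax sugar.
from_editor_a = '<doc>\n  <item id="1" lang="en"/>\n  <note>caf&#233;</note>\n</doc>\n'
from_editor_b = "<doc>\n  <item lang='en' id='1'></item>\n  <note>café</note>\n</doc>\n"

# If both editors wrote the same agreed normal form on save, the quote style,
# attribute order, character references and empty-element syntax would no
# longer vary from one tool to the next.
canonical_a = ET.canonicalize(from_editor_a)
canonical_b = ET.canonicalize(from_editor_b)

print(canonical_a == canonical_b)
print(canonical_a)
```

With both tools writing this form, a diff between their outputs would show only real changes. The cost is that the layout is no longer under the author's control.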
I've been here with HTML before, for example: the HTML editors AOLpress
and FrontPage Express both normalise the HTML they write in some respects,
but to different rules. So if some developers edit technical documentation
using AOLpress and others prefer FrontPage Express, the RCS differencing
tools become essentially useless.
Either way, I prefer (a), for much the same reason that I wouldn't want to
use a "C" source code editor which always pretty-printed its
output. Indentation and the like should be under my control.
You said it yourself...
>So, there seem to be two classes of things that the InfoSet doesn't cover:
>the "mere syntax" that no reasonable application (except maybe a "diff")
>would care about
except that I think this is an important application, not one which should
be swept under the carpet.
I'm in full support of the goals of the infoset: I want agreement on the
significant information content of an XML file. The XML spec itself
started on this road in various areas, such as attribute value
normalization -- the infoset is finishing the job.
It's just that some applications need to operate directly on the
representation, and not just on the information.
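A tool that works on the representation, in the spirit of option (a), might look something like the sketch below. It is deliberately naive: `set_attribute_in_place` is a hypothetical helper built on a regular expression, it assumes the attribute name occurs only once in the text, and the sample document is made up. The point is only that the edit is spliced into the original text rather than produced by parsing and re-serialising.

```python
import difflib
import re

# Made-up sample, written in one author's preferred style.
source = "<doc>\n  <item lang='en' id='1'></item>\n  <note>caf&#233;</note>\n</doc>\n"

def set_attribute_in_place(text, name, new_value):
    """Hypothetical helper: replace one attribute value, keeping the
    surrounding quotes, spacing and everything else byte-for-byte."""
    pattern = re.compile(r"(\b" + re.escape(name) + r"\s*=\s*)(['\"])(.*?)\2")

    def splice(match):
        prefix, quote = match.group(1), match.group(2)
        return prefix + quote + new_value + quote

    return pattern.sub(splice, text, count=1)

edited = set_attribute_in_place(source, "id", "2")

diff = difflib.unified_diff(source.splitlines(), edited.splitlines(), lineterm="")
changed_lines = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed_lines))
```

Only the line that was actually edited shows up in the diff; the single quotes, the attribute order and the character reference elsewhere are left exactly as the author wrote them.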