At 08:10 26/04/2002 -0400, W. E. Perry wrote:
>John Cowan wrote:
> > Sure it does. The Infoset is ridiculously close to the XML surface; it
> > just abstracts away crap like "How many spaces between attributes?" and
> > "What kind of quotation mark?" and the like. About the only thing that
> > disappears without a trace is the physical entity structure.
:
>Nevertheless, in philology
>the pivotal discovery of the twentieth century was of the primacy of
>syntax.
I've seen this debate go round a few times now, and I confess to not having
understood it yet, probably because I don't have the background in
philosophy to understand the terms being used.
In so far as I /do/ understand, it seems that some believe that what XML is
really about is a logical data model consisting of elements, attributes and
suchlike (the infoset's Information Items). The concrete XML syntax is a
language which can be used to represent that logical data model.
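
A minimal sketch of that view, assuming Python's standard xml.etree.ElementTree as the parser; the helper name logical_view and the sample documents are invented for illustration. Two serializations that differ in spacing, quote style and attribute order should report the same element name, attributes and text:

```python
import xml.etree.ElementTree as ET


def logical_view(doc):
    """Return the root element's name, its attributes (sorted), and its text.

    Hypothetical helper: it reports only what the parser exposes,
    not the characters used to write the document.
    """
    root = ET.fromstring(doc)
    return root.tag, sorted(root.attrib.items()), root.text


# Same logical content, different surface syntax: quote style,
# whitespace inside the tags, and attribute order all vary.
tidy = logical_view('<point x="1" y="2">p</point>')
messy = logical_view("<point   y='2'\n   x='1'  >p</point >")

print(tidy == messy)
```

Nothing in the returned values records which quotation mark was used or how many spaces separated the attributes.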
The others believe that the concrete character-by-character syntax of the
XML is all that matters, and that trying to pretend that it is a
representation of some more abstract data model is bound to result in
discarding semantic information which was important to the author.
I'm assuming that this latter camp, for example, believes that section
3.3.3 of the XML Rec, which starts
    Before the value of an attribute is passed to the application
    or checked for validity, the XML processor must normalize
    the attribute value by applying the algorithm below...
is exactly such a piece of poor design, and that the validation algorithm
ought to have been specified to work on the attribute value as written. If
this assumption is wrong, I'd like an explanation -- it seems to me that
this mandatory normalization is exactly equivalent to an infoset-style
abstraction.
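
The effect of that normalization can be observed from the application side. A minimal sketch, assuming Python's standard xml.etree.ElementTree (which sits on top of expat) and an attribute with no DTD declaration, so it is treated as CDATA; the helper name attr_value is invented for illustration:

```python
import xml.etree.ElementTree as ET


def attr_value(doc, name="x"):
    """Return the value of attribute `name` on the root element (hypothetical helper)."""
    return ET.fromstring(doc).get(name)


# The author typed a literal line break and a literal tab inside the value.
literal = attr_value('<a x="one\ntwo\tthree"/>')

# The author wrote the same characters as character references instead.
referenced = attr_value('<a x="one&#10;two&#9;three"/>')

print(repr(literal))
print(repr(referenced))
```

Under the algorithm in 3.3.3, the literal line break and tab should reach the application as spaces, while the characters written as references should survive. Either way, the application never sees the value exactly as written.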
I guess we all agree that some degree of abstraction from the underlying
representation is desirable? No one cares that my XML data has actually
been split into 512-byte chunks for storage in some physical filing system.
/Are/ we just disagreeing about the amount of abstraction which is
desirable, or is there some difference of kind between the good
abstractions and the bad ones?
For what it's worth, I find myself in the pro-infoset camp. One of the
strengths of XML is that it allows me to compose specific applications out
of general purpose tools (e.g. SAX, XSLT). In the absence of some notion of
the logical data model represented by XML, these general purpose tools are
not going to be composable in this manner.
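
As a rough illustration of that kind of composition, here is a sketch that chains two SAX filters using Python's standard xml.sax module. The filter classes DropElements and RenameElements, the function run_pipeline and the sample document are all invented for this example. Each stage sees only element events, never the original characters, which is what lets the stages be stacked in any order:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElements(XMLFilterBase):
    """Hypothetical filter: discard every element called `name`, with its content."""

    def __init__(self, parent, name):
        super().__init__(parent)
        self._name = name
        self._depth = 0  # greater than zero while inside a dropped element

    def startElement(self, name, attrs):
        if self._depth or name == self._name:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


class RenameElements(XMLFilterBase):
    """Hypothetical filter: rename elements called `old` to `new`."""

    def __init__(self, parent, old, new):
        super().__init__(parent)
        self._old, self._new = old, new

    def _map(self, name):
        return self._new if name == self._old else name

    def startElement(self, name, attrs):
        super().startElement(self._map(name), attrs)

    def endElement(self, name):
        super().endElement(self._map(name))


def run_pipeline(doc):
    """Parse `doc`, pass the events through both filters, and re-serialize."""
    out = io.StringIO()
    reader = xml.sax.make_parser()
    reader = DropElements(reader, "note")
    reader = RenameElements(reader, "item", "entry")
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(doc.encode("utf-8")))
    return out.getvalue()


result = run_pipeline(
    "<doc><item id='1'>a</item><note>skip</note><item id=\"2\">b</item></doc>"
)
print(result)
```

Neither filter knows about the other, or about the serializer at the end of the chain; they agree only on the stream of element, attribute and character events passing between them.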
--
Cheers,
John