On Tue, 01 Apr 2003 09:48:59 +0100, Sean McGrath
<sean.mcgrath@propylon.com> wrote:
[Checking very carefully to see if this is one of Sean's famous April Fool
jokes ... hmm, no that's another thread]
> Correctness or input fidelity - pick one - you cannot have both.
>
> This is at the core of why I've always argued that we *do* need a data
> model for XML and we *do* need something like
> common XML because I want my processing to be both correct *and* non-
> lossy (high input fidelity).
>
> Is that too much to ask?
Let me make sure I understand ... we need a definitive data model so that
one can work with the normalized information in an XML document
irrespective of whatever "syntax sugar" was used to represent the
information, and we need something like Common XML to define a canonical
serialization of the data model that will not lose fidelity through
successive parse / serialization stages?
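To make the round-trip requirement concrete, here is a minimal sketch in Python (3.8 or later). It uses W3C Canonical XML 2.0, as implemented by the standard library's canonicalize(), purely as a stand-in for a canonical serialization; it is not Common XML as sml-dev defined it, and the two sample documents are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical documents carrying the same information with
# different "syntax sugar": quote style, attribute order, CDATA.
doc_a = "<doc b='2' a=\"1\"><![CDATA[x < y]]></doc>"
doc_b = '<doc a="1" b="2">x &lt; y</doc>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# Same data model, so the canonical serializations should coincide.
same_information = canon_a == canon_b

# A further parse/serialize stage should change nothing.
stable = canonicalize(canon_a) == canon_a

print(same_information, stable)
```

If both checks hold, successive parse / serialization stages over the canonical form lose nothing that the data model itself retains.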
I agree. It sounds like existing data models don't quite do the job
because they don't (except for the DOM data model, which has its own
problems) let one keep unexpanded entity references around. Likewise
Common XML as sml-dev defined it doesn't include entity definitions and
references.
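The entity problem is easy to reproduce. The sketch below assumes Python's standard xml.etree.ElementTree, chosen only as one example of a data model that does not keep entity references; the entity name and replacement text are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal entity declaration.
src = (
    '<!DOCTYPE doc [<!ENTITY co "Example Corp">]>'
    "<doc>&co; rocks</doc>"
)

root = ET.fromstring(src)
out = ET.tostring(root, encoding="unicode")

# The parser expands the reference, and the serializer has no way
# to put it (or the DOCTYPE declaration) back.
print(root.text)
print("&co;" in out)
```

After one parse / serialize cycle the reference and its definition are both gone, which is exactly the loss of input fidelity under discussion.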
I strongly agree if we're saying that XML (or some successor) needs: a) to
treat the syntax and data model as two halves of the same whole; b) to
*conceptually* handle "syntax sugar" in a preprocessing phase where CDATA
sections are handled, whitespace normalized, quotes standardized, [entities
expanded ???], comments stripped out, [PIs stripped out ???], etc.; c) to
base the actual core grammar on "Common XML" so that text operations on the
common/canonical syntax can be correct and non-lossy; d) to acknowledge
alternate serializations of the data model as "legal" insofar as they
reliably and losslessly round trip with the common/canonical syntax; and
e) to treat additional information, such as that introduced by schemas and
other datatyping schemes, as another layer on top of all this.
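Points b) and c) can be sketched as a text operation that fails on raw syntax but works once the sugar has been preprocessed away. Again this assumes Python 3.8+ and uses canonicalize() (Canonical XML 2.0) as a stand-in for the preprocessing phase; the sample element and the regular expression are hypothetical.

```python
import re
from xml.etree.ElementTree import canonicalize

# Raw input with extra whitespace, single quotes, a comment and CDATA.
raw = "<item  id = 'a1' ><!-- note --><![CDATA[5 < 6]]></item>"

# A naive pattern of the kind a text-processing script might use.
pattern = re.compile(r'<item id="([^"]*)">([^<]*)</item>')

# On the raw text the pattern finds nothing.
raw_match = pattern.search(raw)

# After preprocessing, quotes and whitespace are standardized, the
# comment is dropped and the CDATA section becomes escaped text.
canonical = canonicalize(raw)
match = pattern.search(canonical)

print(raw_match)
print(match.group(1) if match else None)
```

The design point is that the regular expression only has to know one spelling of each construct, so plain text tools can be correct without a full parser.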
That lets XML be text for text processing people and Desperate Perl/Python
Hackers, and XML be data for data processing people, sharing common
technologies where appropriate but adding different layers for specialized
needs where appropriate.