Pretty much the case. XML is syntax. An XML parser is
expected to find syntax errors and As Much Structure As
You Provide In A DTD/Schema. Everything interesting
past the syntax parse is in the application language.
XML Doesn't Care. An application can structure down to
individual characters if it cares enough but if one
cares that much, XML is likely not the best syntax
to use anyway. There may be some undefined boundary,
but it would likely be discovered not by the structure but
by entropic measures of the system itself.
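To make the split between syntax and application concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The document, the element and attribute names, and the "qty must be a positive integer" rule are all made up for illustration. The parser accepts the document because it is well-formed; only the application-level check notices that the value is nonsense.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: well-formed XML, but the quantity is meaningless
# to the application that consumes it.
doc = "<order><item qty='minus three'>widget</item></order>"

# The parser only checks syntax, so this succeeds.
root = ET.fromstring(doc)


def check_quantities(order):
    """Application-level rule (an assumption for this sketch):
    every item's qty attribute must be a positive integer."""
    problems = []
    for item in order.iter("item"):
        qty = item.get("qty", "")
        if not (qty.isdigit() and int(qty) > 0):
            problems.append(qty)
    return problems


# The application, not the parser, is what flags the bad value.
problems = check_quantities(root)
```

A DTD or schema could push some of this checking back into the parse, but only as much structure as one chooses to declare there.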
And that might be the best response for the XML Core
WG to make to Berners-Lee's request: XML is not the
right place to solve the particular problem of having
a transmission format that is invariant to transformation
into and out of alternative formats.
It is interesting to note that after the CAD WG of
the Web3DC debated the merits of a standard for
CAD on the Web, they talked in terms of a JPEG for CAD,
not a markup language.
len
From: Hunsberger, Peter [mailto:Peter.Hunsberger@stjude.org]
Hmm, I don't think I buy this argument. This gets back to Len's
question about "noise" and my response about error correction. While it
is true that XML carries with it some extra information that makes it
somewhat more robust in the face of corruption, it isn't true that this
extra information allows you to understand what the errors were. (Nor
is it true that all binary formats are useless when a byte is
"blown"; many of them carry enough information that errors are
recoverable.)
The fact is, the extra information in XML isn't of a form that allows
you to do robust error recovery. It might allow you to decide that you
understand some portion of the document, but if a document is corrupt, do
you really want to assume that a missing end tag is the only problem and
that the real problem isn't that an entire paragraph including the
missing end tag got deleted?
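A small sketch of that ambiguity, in Python with the standard library's xml.etree.ElementTree. The sample documents are invented for illustration. One has lost only an end tag; the other has lost the end tag together with a whole paragraph. The parser rejects both, but with the same kind of error, so the error by itself does not say how much is missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical original document with three paragraphs.
original = "<doc><p>first</p><p>second</p><p>third</p></doc>"

# Corruption A: only the first end tag was lost.
only_end_tag_lost = "<doc><p>first<p>second</p><p>third</p></doc>"

# Corruption B: the first end tag AND the whole second paragraph were lost.
paragraph_lost = "<doc><p>first<p>third</p></doc>"


def parse_error(text):
    """Return the ParseError raised for text, or None if it is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err
    return None


err_a = parse_error(only_end_tag_lost)
err_b = parse_error(paragraph_lost)

# Both corruptions yield the same expat error code; nothing in the error
# distinguishes a lost tag from a lost paragraph.
same_kind = err_a.code == err_b.code
```

Recovering the difference would take information that is not in the markup itself, such as a checksum or a length carried alongside the document.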