RE: [xml-dev] Nested Documents (was: XML 2.0)
- From: "Michael Kay" <mike@saxonica.com>
- To: "'Richard Salz'" <rsalz@us.ibm.com>,"'COUTHURES Alain'" <alain.couthures@agencexml.com>
- Date: Wed, 27 Feb 2008 18:21:39 -0000
Thanks, but I still don't understand what you mean by "more
tightly linked". You're just repeating the assertion that a final end tag is
good for me because it enables me to detect more errors - which I have
already explained is the wrong tradeoff in my case. In fact, if my log file
really is broken for some reason, then I would rather be able to read most of it
than none of it.
Michael Kay
> I don't understand what you mean by "more tightly linked"...
In a single-rooted
document, <a>...</a>, when I see that final closing bracket I can
be pretty sure that I've got everything. (Yes, trailing comments and
PIs I handwave aside.) If the input stream ends before that closing
bracket, then I know something went wrong. My XML-consuming application
can pretty much treat the input as a stream of bytes and not care where, how, or
why something went wrong, at least at the first level. See the bracket,
things are OK; don't see the bracket, something broke.
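
As a minimal sketch of that check (my own illustration, not code from this thread), the Python standard library parser can stand in for the consumer. The helper name `is_complete` and the sample bytes are hypothetical; the point is only that any proper prefix of a single-rooted document fails to parse, so truncation is detectable without knowing anything about the transport.

```python
import xml.etree.ElementTree as ET


def is_complete(data: bytes) -> bool:
    """Return True if data parses as one well-formed, single-rooted document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Missing end tag, cut-off token, or empty input all land here.
        return False
    return True


# Hypothetical sample document with a single root element.
full = b"<a><entry>1</entry><entry>2</entry></a>"

# Simulate the stream being cut off somewhere before the final </a>.
truncated = full[: full.index(b"<entry>2")]

print(is_complete(full))
print(is_complete(truncated))

# Every proper prefix is rejected, wherever the cut happens.
print(all(not is_complete(full[:i]) for i in range(len(full))))
```

The check is deliberately all-or-nothing: it reports that something broke, not where or why.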
Now, let's imagine a multi-rooted document like
<a>...</a><b>...</b>. How do I know when I hit
the end? Suppose the network drops after the </a>? If you're using
something like the TCP socket API, you have no idea how many bytes were in flight
when the connection dropped. (TCP is reliable only when it's working;
its failure modes are pretty pitiful.) Suppose I'm reading a log file
and there's no free space on the disk but I do see the closing </b>.
Is that coincidence -- did the producer just get lucky and take the last
bits of the disk, or am I missing <c>...</c> and
friends?
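
A small sketch of the ambiguity, under stated assumptions: standard XML parsers reject multiple roots, so the hypothetical helper `parse_roots` below wraps the input in a synthetic `<wrapper>` element purely to emulate a multi-root parser. Neither the helper nor the wrapper trick comes from this thread, and the trick would not work on input that carries an XML declaration.

```python
import xml.etree.ElementTree as ET


def parse_roots(data: bytes) -> list[str]:
    """Emulate a multi-root parser: return the tag of each top-level element.

    Assumption: the data has no XML declaration, so it can be wrapped
    in a synthetic root element before parsing.
    """
    wrapped = b"<wrapper>" + data + b"</wrapper>"
    return [child.tag for child in ET.fromstring(wrapped)]


# Hypothetical multi-root stream as the producer intended to write it.
stream = b"<a>1</a><b>2</b><c>3</c>"

# The connection drops (or the disk fills) exactly after </b>.
cut_at_boundary = stream[: stream.index(b"<c>")]

# The connection drops in the middle of <b>.
cut_mid_element = stream[: stream.index(b"</b>")]

print(parse_roots(stream))
print(parse_roots(cut_at_boundary))  # parses cleanly; the loss of <c> is invisible

try:
    parse_roots(cut_mid_element)
except ET.ParseError as exc:
    print("detected truncation:", exc)
```

Only the mid-element cut raises an error. The cut at an element boundary yields a shorter but perfectly valid result, which is the case the consumer cannot distinguish from a complete stream.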
There's no way to know, in a
generic way, without doing very specific things depending on when, where, why,
and how the XML is transported. Right now, you don't have to. Yes,
that's a simplification, 80/20, whatever you want to call it. I'd call
it a trade-off and say the current design is the right one. :)
It has issues in the real world, too. Suppose
you're writing a book and someone inserts a page break between <a> and
<b> in a multi-root document. Are readers gonna get confused or
misled if they're not explicitly told to turn the page?
/r$
--
STSM, DataPower Chief Programmer
WebSphere DataPower SOA Appliances
http://www.ibm.com/software/integration/datapower/