Re: CDATA sections in W3C XML Infoset
- From: Charles Reitzel <firstname.lastname@example.org>
- To: Bob Kline <email@example.com>
- Date: Sat, 31 Mar 2001 15:36:55 -0500
I'm left agreeing w/ John Cowan's earlier remark: no one is taking CDATA sections away. I ended up using CDATA sections in XML documents for the same reasons you did, and will continue to do so.
OTOH, I think that if XSLT (and other Infoset-based software) ignores the CDATA section markers, so much the better. My application just has to ensure that the contents of <included-doc> are wrapped in a CDATA section when writing it out, that's all.
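For illustration only, here is a minimal sketch of that "wrap it on the way out" step using Python's standard xml.dom.minidom. The function name wrap_included_doc is hypothetical, and the element name is simply the <included-doc> from Tim's example; this is not the code my application actually uses.

```python
from xml.dom.minidom import Document


def wrap_included_doc(embedded_xml: str) -> str:
    """Return an <included-doc> element whose only child is a CDATA section.

    Hypothetical helper: the element name comes from the example in this
    thread, everything else is an assumption made for the sketch.
    """
    if "]]>" in embedded_xml:
        # A CDATA section cannot contain its own terminator, so an embedded
        # document that already uses CDATA needs different handling.
        raise ValueError("embedded document contains ']]>'")
    doc = Document()
    holder = doc.createElement("included-doc")
    holder.appendChild(doc.createCDATASection(embedded_xml))
    return holder.toxml()


if __name__ == "__main__":
    print(wrap_included_doc("<something-else />"))
```

The explicit check for "]]>" is the one case where plain wrapping is not enough; the sketch simply refuses such input rather than guessing at a splitting strategy.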
This might become parser dependent in the future, which isn't great. But I'll worry about that when and if the time comes. SAX parser adapters are a bit like ODBC drivers. They do a good job of hiding differences between implementations but, no matter what you do, it's never 100%.
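One way to hedge against that parser dependence on the reading side is not to care whether the parser reports a CDATA node at all. The sketch below, again with Python's xml.dom.minidom, locates the one known wrapper element, concatenates whatever character data it holds (plain text nodes or CDATA nodes), and hands the result to the parser. The name extract_embedded and the default tag are assumptions for the example, not part of anyone's real system.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def extract_embedded(command_xml: str, tag: str = "included-doc"):
    """Parse the document embedded as character data inside <tag>.

    Hypothetical helper: works whether the parser exposes the content as a
    CDATA section node or as ordinary (unescaped) text.
    """
    outer = parseString(command_xml)
    holders = outer.getElementsByTagName(tag)
    if len(holders) != 1:
        raise ValueError("expected exactly one <%s> element" % tag)
    # Collect character data regardless of how the parser labelled it.
    text = "".join(
        child.data
        for child in holders[0].childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )
    # Hand the embedded document to the parser as a document in its own right.
    return parseString(text.strip())


if __name__ == "__main__":
    cmd = "<cmd><included-doc><![CDATA[ <something-else /> ]]></included-doc></cmd>"
    print(extract_embedded(cmd).documentElement.tagName)
```

The same call works if the producer escaped the markup with &lt; and &gt; instead of using CDATA, which is the point: the wrapper element says where the embedded document is, and the CDATA section only simplifies escaping.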
take it easy,
At 07:43 PM 3/30/01 -0500, Bob Kline wrote:
>On Fri, 30 Mar 2001, Tim Bray wrote:
>> At 10:54 AM 30/03/01 -0500, Bob Kline wrote:
>> > Find the element containing the CDATA section.
>> > Find the CDATA child of the element.
>> > Hand the value of the CDATA section to the parser.
>> This seems really questionable. Using CDATA sections to embed
>> other XML that's known not to contain any seems just fine.
>> On the other hand, relying on that CDATA section to tell you
>> where it is smells bad. Wouldn't it have been immensely
>> better to do
>> <included-doc><![CDATA[ <something-else /> ..
>> ]]></included-doc> ...
>> Then the CDATA does what it's supposed to, simplify
>> escaping, and the tags do what they're supposed to,
>> provide semantic markup saying what pieces of text
>> really are. -Tim
>You're absolutely right, and that's exactly what we're doing. Our
>command sets look like this:
> <CdrDoc Type='Protocol'>
> <CdrDocCtl>... [some control elements] ...</CdrDocCtl>
> <Protocol> ... [rest of the embedded document] ...
> ... [more commands] ...
>So the step described as "Find the element containing the CDATA section"
>was referring to the location of a specific element (the CdrDocXml
>element at a known position in the hierarchy of a CdrCommand element),
>not a blind stumbling around looking for any CDATA section we happened
>to run into. I can see why you might have come to a different
>conclusion, though, given the less than precise wording. If I hadn't
>been trying to keep the description of each step down to a single line I
>might have said "Find the specific element which we know will contain a
>single child node for the CDATA section for the embedded document."
>Sorry for any confusion my sloppy wording may have caused.
>The approach we're using gives us a number of benefits. Some of these are:
> * We can validate the command set against a DTD.
> * A human can easily examine a document embedded in a command set
>   when troubleshooting is required (much harder to do with lots
>   of escaping going on).
> * We avoid the additional overhead and complexity of extra
>   re-encoding and decoding steps.
>It's clear from this thread that not everyone gives these advantages the
>same weight that we do. I guess we'll just have to warn writers of any
>client software for this system to steer clear of any packages which in
>the name of "Infoset compliance" (to use the phrase of one of the
>contributors to the thread) violate the assumptions we began with about
>the ability to get back CDATA sections where we expect them. And cross
>our fingers that the DOM doesn't get changed to match what's happened to