Re: CDATA sections in W3C XML Infoset
- From: Bob Kline <bkline@rksystems.com>
- To: John Cowan <jcowan@reutershealth.com>
- Date: Fri, 30 Mar 2001 11:58:36 -0500 (EST)
On Fri, 30 Mar 2001, John Cowan wrote:
> Bob Kline wrote:
>
> > No? We have quite a bit of code in our XML repository which uses XML
> > commands over sockets for its client-server interface to the rest of the
> > world. Most of the commands embed an XML document being stored in or
> > retrieved from the repository. The embedded documents are wrapped in
> > CDATA sections.
>
> And when the embedded document already contains a CDATA section? Bzzzzt,
> not well-formed.
>
Yes, the inability to nest CDATA sections is a flaw in the XML rec to
which we've resigned ourselves. We don't accept documents into the
repository with CDATA sections. We can do that because we're not a
general-purpose XML repository product.
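
A minimal sketch of the wrapping described above and of the nesting problem, in Python. The `wrap_in_cdata` helper and the `store` command name are hypothetical, not our actual code; the check for "]]>" is one assumed way of refusing documents that cannot be wrapped safely.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(command: str, document: str) -> str:
    """Embed `document` in a <command> element as a single CDATA section.

    Hypothetical helper: refuses any document containing "]]>", since
    that sequence would terminate the wrapping CDATA section early.
    """
    if "]]>" in document:
        raise ValueError("document contains ']]>' and cannot be wrapped in CDATA")
    return f"<{command}><![CDATA[{document}]]></{command}>"


# A document without CDATA sections survives the trip byte for byte.
plain = "<doc>a &amp; b</doc>"
message = wrap_in_cdata("store", plain)
assert ET.fromstring(message).text == plain

# Wrapping a document that already has a CDATA section by hand yields
# a message that is not well-formed: the inner "]]>" closes the outer
# section, and the parser then sees a stray </doc> end tag.
nested = "<doc><![CDATA[x < y]]></doc>"
try:
    ET.fromstring(f"<store><![CDATA[{nested}]]></store>")
    nested_parsed = True
except ET.ParseError:
    nested_parsed = False
```
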
[...]
> > Therefore information has been lost.
>
> Not so if you encode properly. By changing every "&" in the embedded
> document to "&amp;" and every "<" to "&lt;" (conceptually in that order),
> you get this result:
>
> Original     Embedding
> <            &lt;
> &            &amp;
> &lt;         &amp;lt;
> &amp;        &amp;amp;
> &amp;lt;     &amp;amp;lt;
>
> Etc. etc. No information is lost: change every "&lt;" to "<" and
> every "&amp;" to "&" (conceptually in that order) and the exact
> original is restored. In this encoding, ">" characters need not
> be changed.
>
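
For reference, the escaping scheme quoted above can be sketched in Python as follows. The names `embed` and `extract` and the `store` element are hypothetical, chosen only for illustration; the order of the replacements is the one given in the quoted text.

```python
import xml.etree.ElementTree as ET


def embed(text: str) -> str:
    # Escape "&" first, so the "&" introduced by "&lt;" is not escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;")


def extract(text: str) -> str:
    # Reverse order: restore "<" first, then "&".
    return text.replace("&lt;", "<").replace("&amp;", "&")


# The rows of the table above, plus one small document.
samples = ["<", "&", "&lt;", "&amp;", "&amp;lt;", "<doc>a &amp; b</doc>"]
for original in samples:
    assert extract(embed(original)) == original

# An XML parser performs the same decoding when it reads the embedded
# text, so the receiving side gets the original document back.
document = "<doc>a &amp; b</doc>"
message = "<store>" + embed(document) + "</store>"
assert ET.fromstring(message).text == document
```

In this sketch the client only needs `embed` when building a command; the decoding step comes for free from whatever parser reads the message.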
Yuck. We should re-write our software (client and server) because the
W3C changed its mind about what an XML document's tree consists of? If
the W3C was going to stomp on the distinctions enabled by CDATA
sections, it shouldn't have included them in XML in the first place.
--
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com