Re: [xml-dev] Tradeoffs of XML encoding by enclosing all content in CDATA blocks
- From: Manos Batsis <manos_lists@geekologue.com>
- To: "Karr, David" <david.karr@wamu.net>, XML Developers List <xml-dev@lists.xml.org>
- Date: Mon, 29 Sep 2008 18:25:17 +0300
Karr, David wrote:
> I pointed out to a client that they're seeing failures parsing XML
> because some of the element content that they're producing contains
> characters illegal in XML content, like "&" (unencoded). They
> acknowledged that should be fixed, but they also said they could instead
> enclose all content with CDATA blocks. That seems bizarre to me, but
> I'm not sure I can immediately come up with all the cogent arguments
> against that. Can someone summarize specifically why you should NOT do
> that?
One thing that comes to mind is that switching to CDATA may break some
existing code.
For example, although XPath exposes only one text node as the content of
the following element:
<foo> text 1 <![CDATA[ text 2]]> text 3</foo>
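AFAIK DOM will expose the children of <foo> as at least three nodes
(a text node " text 1 ", a CDATA section " text 2", and a text node
" text 3"). Code that relies on text normalization may also stop working,
since normalization merges adjacent text nodes but, as far as I know,
leaves CDATA sections alone (not sure about DOM2+).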
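To make the difference concrete, here is a minimal sketch in Python (my
choice of language, the thread does not name one). It uses the standard
library's minidom as the DOM view and ElementTree as a stand-in for the
"one merged text" view; ElementTree is not an XPath engine, it just
happens to merge character data the same way. The helper names are
made up for illustration.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

SAMPLE = "<foo> text 1 <![CDATA[ text 2]]> text 3</foo>"


def dom_child_names(xml_text):
    """Return the DOM node names of the root element's children."""
    doc = minidom.parseString(xml_text)
    return [child.nodeName for child in doc.documentElement.childNodes]


def dom_child_names_normalized(xml_text):
    """Same as above, but after calling normalize() on the root element."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    root.normalize()  # merges adjacent Text nodes only
    return [child.nodeName for child in root.childNodes]


def merged_text(xml_text):
    """Return the element content as ElementTree sees it (one string)."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(dom_child_names(SAMPLE))             # three separate nodes
    print(dom_child_names_normalized(SAMPLE))  # CDATA section is not merged
    print(repr(merged_text(SAMPLE)))           # a single merged string
```

Code that reads only the first child's text, or that expects a single
text node after normalize(), would see just " text 1 " here.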
It should be easier and safer to just add some code that escapes three
characters (&, <, and >) to their entity equivalents (&amp;, &lt;, and
&gt;).
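A rough sketch of that escaping step, again in Python and assuming the
producer builds its XML by string concatenation (the thread does not say
how the client generates it). escape() from xml.sax.saxutils handles
exactly &, <, and >; the wrap_content helper and the sample string are
hypothetical.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def wrap_content(tag, raw_text):
    """Build <tag>...</tag> with &, < and > in raw_text escaped.

    Hypothetical helper: tag is assumed to be a valid XML name.
    """
    return "<{0}>{1}</{0}>".format(tag, escape(raw_text))


if __name__ == "__main__":
    raw = "AT&T says 1 < 2 > 0"
    xml_text = wrap_content("foo", raw)
    print(xml_text)
    # Parsing the escaped form gives back the original text.
    print(ET.fromstring(xml_text).text == raw)
```

With this in place consumers still see a single text node, so existing
DOM or XPath code keeps working unchanged.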
hth,
Manos