Re: [xml-dev] Tradeoffs of XML encoding by enclosing all content in CDATA blocks

Karr, David wrote:
> I pointed out to a client that they're seeing failures parsing XML 
> because some of the element content that they're producing contains 
> characters illegal in XML content, like "&" (unencoded).  They 
> acknowledged that should be fixed, but they also said they could instead 
> enclose all content with CDATA blocks.  That seems bizarre to me, but 
> I'm not sure I can immediately come up with all the cogent arguments 
> against that.  Can someone summarize specifically why you should NOT do 
> that?

One thing that comes to mind is that switching to CDATA may break some 
existing code.

For example, although XPath exposes only one text node as the content of 
the following element:

<foo> text 1 <![CDATA[ text 2]]> text 3</foo>

AFAIK the DOM will expose the children of <foo> as at least three nodes 
(a text node " text 1 ", a CDATA section " text 2", and a text node 
" text 3"). Code that relies on text normalization will not work either 
(not sure about DOM2+).
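
To make the difference concrete, here is a minimal sketch in Python, 
assuming the standard library's xml.dom.minidom as the DOM and 
xml.etree.ElementTree as a stand-in for an XPath-like view in which CDATA 
is not distinguished from ordinary text. The variable names are mine and 
other DOM implementations may behave differently.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

SAMPLE = "<foo> text 1 <![CDATA[ text 2]]> text 3</foo>"

# DOM view: the CDATA section is a separate child node of <foo>.
foo = minidom.parseString(SAMPLE).documentElement
dom_children = [(child.nodeType, child.data) for child in foo.childNodes]

# normalize() merges adjacent text nodes, but in minidom it leaves
# CDATA section nodes alone, so the node count does not change here.
foo.normalize()
after_normalize = len(foo.childNodes)

# ElementTree view: the element content is one string, with no trace of CDATA.
et_text = ET.fromstring(SAMPLE).text

for node_type, data in dom_children:
    print(node_type, repr(data))
print(after_normalize, repr(et_text))
```

Node type 3 is TEXT_NODE and 4 is CDATA_SECTION_NODE, so code that only 
looks at the first text child of <foo> would see just " text 1 ".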

It should be easier and safer to just add some code that escapes three 
characters to their entity equivalents.
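
A minimal sketch of that escaping, assuming the three characters are 
&, < and > and that the producer is written in Python. The escape() 
function from xml.sax.saxutils replaces exactly those three by default; 
make_element is a hypothetical helper for illustration, not anything 
from the client's code.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape


def make_element(tag, text):
    """Serialize one element, escaping &, < and > in its text content."""
    return "<{0}>{1}</{0}>".format(tag, escape(text))


raw = "AT&T says 1 < 2 > 0"
xml = make_element("foo", raw)

# A parser turns the entities back into the original characters,
# and the content arrives as a single ordinary text node.
parsed = minidom.parseString(xml).documentElement.firstChild.data

print(xml)
print(parsed)
```

Consumers then see the same text they would have seen inside a CDATA 
section, without any change to the node structure.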

hth,

Manos



