RE: [xml-dev] Tradeoffs of XML encoding by enclosing all content in CDATA blocks
- From: "Michael Kay" <mike@saxonica.com>
- To: "'Karr, David'" <david.karr@wamu.net>,<xml-dev@lists.xml.org>
- Date: Mon, 29 Sep 2008 16:22:34 +0100
The best argument is that people who adopt this approach
usually fail to check for the presence of "]]>", which isn't allowed in CDATA
sections. Once you start checking for that and dealing with it properly, it
turns out to be easier to check for & and < and escape them as &_amp;
and &_lt; respectively. (Underscores inserted to prevent
misformatting).
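To make the comparison concrete, here is a minimal Python sketch of both approaches. The function names `wrap_cdata` and `escape_text` are made up for illustration and are not from any library. The CDATA version has to split any "]]>" across two sections; the escaping version is three plain replacements. It also escapes ">", which is not strictly needed everywhere but keeps a literal "]]>" out of ordinary content, where it is not allowed either.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Hypothetical helper: wrap text in CDATA, handling "]]>" correctly."""
    # "]]>" would end the section early, so close the section after "]]"
    # and reopen a new one that starts with ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def escape_text(text: str) -> str:
    """Hypothetical helper: escape text for use as element content."""
    # "&" must go first, otherwise the "&" in "&lt;" gets escaped again.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


def round_trip(serialized: str) -> str:
    """Parse <e>...</e> and return the text the parser sees."""
    return ET.fromstring("<e>" + serialized + "</e>").text


sample = "if (a[b[0]]>1 && x<y)"
# Both serializations should parse back to the original string.
print(round_trip(wrap_cdata(sample)) == sample)
print(round_trip(escape_text(sample)) == sample)
```

The naive version of the first helper, which just puts `<![CDATA[` and `]]>` around the text, produces ill-formed XML for the sample above, and that is the failure people tend not to test for.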
Also, the code for escaping & and < works for both
elements and attributes (though attributes also need some attention to look for
quotes), whereas the CDATA approach only works for elements.
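As a rough illustration of the element/attribute point, the Python standard library provides `xml.sax.saxutils.escape` for content and `quoteattr` for attribute values (the latter also picks the quoting and handles embedded quotes). The helper names `make_element` and `make_empty_element` below are hypothetical, and the sketch assumes the element and attribute names are already valid XML names.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def make_element(name: str, text: str) -> str:
    """Hypothetical helper: element with escaped character content."""
    return f"<{name}>{escape(text)}</{name}>"


def make_empty_element(name: str, attr: str, value: str) -> str:
    """Hypothetical helper: empty element with one escaped attribute."""
    # quoteattr returns the value already wrapped in suitable quotes.
    return f"<{name} {attr}={quoteattr(value)}/>"


value = 'Tom & "Jerry" <tj@example.org>'

# The same escaping idea covers both positions.
print(ET.fromstring(make_element("e", value)).text == value)
print(ET.fromstring(make_empty_element("e", "a", value)).get("a") == value)

# CDATA is not recognized inside an attribute value, and the "<" that
# starts it is not allowed there, so this document is rejected.
try:
    ET.fromstring('<e a="<![CDATA[Tom & Jerry]]>"/>')
except ET.ParseError as err:
    print("rejected:", err)
```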
Michael Kay
I pointed out to a client that they're
seeing failures parsing XML because some of the element content that they're
producing contains characters illegal in XML content, like "&"
(unencoded). They acknowledged that should be fixed, but they also said
they could instead enclose all content with CDATA blocks. That seems
bizarre to me, but I'm not sure I can immediately come up with all the cogent
arguments against that. Can someone summarize specifically why you
should NOT do that?