Re: CDATA sections in W3C XML Infoset
- From: Richard Tobin <richard@cogsci.ed.ac.uk>
- To: xml-dev@lists.xml.org
- Date: Wed, 28 Mar 2001 12:37:55 +0100 (BST)
> Could someone explain to me why CDATA section start/end markers were
> taken out of the W3C Infoset?
Two main reasons:
(a) They are not robust in the face of character-set translation. The
characters that can appear in them are just the ones in your encoding,
since character and entity references are not available. So if you,
say, include a Cyrillic letter in a CDATA section in your UTF-8
document, and then translate it to Latin-1, you will have to break
the CDATA section to use a character reference.
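To make (a) concrete, here is a rough sketch in Python of what a transcoder ends up doing. The function name cdata_for_encoding and the wrapper element <p> are made up for illustration, and the sketch assumes the text does not itself contain "]]>". It is not a description of any particular tool.

```python
import xml.etree.ElementTree as ET


def cdata_for_encoding(text: str, encoding: str) -> str:
    """Serialize text as CDATA where possible, breaking the section
    around any character the target encoding cannot represent.

    Hypothetical helper; assumes text does not contain "]]>".
    """
    parts = []
    run = []  # characters that fit in the target encoding

    def flush():
        # Emit the pending encodable run as one CDATA section.
        if run:
            parts.append("<![CDATA[" + "".join(run) + "]]>")
            run.clear()

    for ch in text:
        try:
            ch.encode(encoding)
            run.append(ch)
        except UnicodeEncodeError:
            # No character references inside CDATA, so close the
            # section and write the reference outside it.
            flush()
            parts.append(f"&#x{ord(ch):X};")
    flush()
    return "".join(parts)


original = "a \u0416 b"  # U+0416 is a Cyrillic letter
serialized = cdata_for_encoding(original, "latin-1")

# The result now fits in Latin-1 and still parses to the same text.
latin1_bytes = serialized.encode("latin-1")
round_tripped = ET.fromstring("<p>" + serialized + "</p>").text
```

One CDATA section in the UTF-8 document becomes two sections plus a character reference in the Latin-1 one, so the section boundaries cannot be preserved across the translation.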
(b) They are purely syntactic sugar. Nothing (except an editor or similar)
should treat the following differently:
AT&amp;T is a large corporation
<![CDATA[AT&T]]> is a large corporation
<![CDATA[AT&T is a large corporation]]>
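A quick way to check this is to parse the three forms and compare what the application sees. The sketch below uses Python's xml.etree.ElementTree; the wrapper element <p> is only there to make each line a complete document, and the first form is written with &amp; so that it is well-formed.

```python
import xml.etree.ElementTree as ET

# The three variants, each wrapped in a hypothetical <p> element.
variants = [
    "<p>AT&amp;T is a large corporation</p>",
    "<p><![CDATA[AT&T]]> is a large corporation</p>",
    "<p><![CDATA[AT&T is a large corporation]]></p>",
]

# The parser reports plain character data; CDATA boundaries are gone.
texts = [ET.fromstring(v).text for v in variants]

# All three yield the same string.
all_same = len(set(texts)) == 1
```

An application built on a parser like this has no way to tell the three apart, which is the sense in which the markers carry no information.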
> I was hoping to use the Infoset in writing the spec for XML Script,
Did you intend to give variants like those in (b) different meanings
in XML Script?
-- Richard