
Re: CDATA sections in W3C XML Infoset

From: Richard Tobin <richard@cogsci.ed.ac.uk>
To: xml-dev@lists.xml.org
Date: Wed, 28 Mar 2001 12:37:55 +0100 (BST)

> Could someone explain to me why CDATA section start/end markers were
> taken out of the W3C Infoset?

Two main reasons:

(a) They are not robust in the face of character-set translation.  The
    only characters that can appear in them are those your encoding can
    represent directly, since character and entity references are not
    recognized inside a CDATA section.  So if you, say, include a
    Cyrillic letter in a CDATA section in your UTF-8 document, and then
    translate it to Latin-1, you will have to break the CDATA section
    and use a character reference for that letter.

(b) They are purely syntactic sugar.  Nothing (except an editor or similar)
    should treat the following differently:

    AT&#38;T is a large corporation
    <![CDATA[AT&T]]> is a large corporation
    <![CDATA[AT&T is a large corporation]]>
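Both points can be checked with a few lines of Python.  This is only a
rough sketch using the standard library's xml.etree.ElementTree; the
<p> wrapper element, the helper names parsed_text and
cdata_for_encoding, and the sample string are my own assumptions for
illustration, and the serializer assumes the text never contains "]]>".

```python
import xml.etree.ElementTree as ET

# The three variants from (b), wrapped in a hypothetical <p> element
# so that each one is a well-formed document.
VARIANTS = [
    "<p>AT&#38;T is a large corporation</p>",
    "<p><![CDATA[AT&T]]> is a large corporation</p>",
    "<p><![CDATA[AT&T is a large corporation]]></p>",
]


def parsed_text(xml):
    """Return the character data a parser reports for the root element."""
    return ET.fromstring(xml).text


def cdata_for_encoding(text, encoding):
    """Write text as CDATA, falling back to character references.

    Characters that the target encoding cannot represent force the
    current CDATA section to be closed, as described in (a).
    Assumption: text does not contain the sequence "]]>".
    """
    parts = []
    run = []  # characters collected for the current CDATA section

    def flush():
        if run:
            parts.append("<![CDATA[" + "".join(run) + "]]>")
            run.clear()

    for ch in text:
        try:
            ch.encode(encoding)
            run.append(ch)
        except UnicodeEncodeError:
            flush()
            parts.append("&#%d;" % ord(ch))
    flush()
    return "".join(parts)


# (b): all three spellings yield the same character data.
print({parsed_text(v) for v in VARIANTS})

# (a): the same text needs one CDATA section in UTF-8 but must be
# split around the Cyrillic letter when the target is Latin-1.
sample = "a < \u0416 & b"
print(cdata_for_encoding(sample, "utf-8"))
print(cdata_for_encoding(sample, "latin-1"))
```

The first print shows a set with a single member, which is the sense in
which the markers carry no information.  The last two show the same
content forced into a different arrangement of CDATA sections purely by
the choice of encoding.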

> I was hoping to use the Infoset in writing the spec for XML Script,

Did you intend to give variants like those in (b) different meanings
in XML Script?

-- Richard
