   Re: CDATA by any other name... (was The raw and the cooked)

  • From: John Cowan <cowan@locke.ccil.org>
  • To: XML Dev <xml-dev@ic.ac.uk>
  • Date: Wed, 04 Nov 1998 12:00:56 -0500

Paul Prescod wrote:

> XML DTDs are in the business of constraining people to the data models and
> data that the software is expecting/can deal with. I don't see any big
> difference between saying: "This content must be restricted to this set of
> characters" and "this content must be a NMTOKEN or base-64 encoded."

Put that way, I suppose you are right.  As I said before, this could and
should be handled as a special case of "The character data of this
element must conform to the following regular expression."
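
A minimal sketch of what such a regular-expression constraint might look like when checked outside the DTD. The function name `invalid_elements`, the sample document, and the ASCII-only pattern are all hypothetical illustrations, not part of any schema language discussed in this thread.

```python
import re
import xml.etree.ElementTree as ET

def invalid_elements(xml_text, tag, pattern):
    """Return the character data of every <tag> element that fails the pattern.

    Only the text before an element's first child is checked, which is
    enough for elements that contain character data alone.
    """
    regex = re.compile(pattern)
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter(tag):
        text = elem.text or ""
        # fullmatch: the whole content must conform, not just a prefix.
        if regex.fullmatch(text) is None:
            bad.append(text)
    return bad

# Hypothetical document: the second <code> element contains a non-ASCII letter.
DOC = "<items><code>ABC-123</code><code>caf\u00e9</code></items>"
ASCII_ONLY = r"[\x00-\x7F]*"

print(invalid_elements(DOC, "code", ASCII_ONLY))
```

Restricting the repertoire and requiring a token format are then the same kind of check with different patterns.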

> Nevertheless, this is clearly a schema problem and CDATA sections seem to
> me to be a really bad tool for enforcing this distinction.

Particularly because it would mean that the charset of an XML document
would become part of its schema: a document in US-ASCII can have
only ASCII in its CDATA sections, but if it were transcoded to
Shift_JIS, then it could have any JIS X 0208 character in the
CDATA section.
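
The dependence on the charset follows from the fact that character references are not recognized inside a CDATA section, so every character there must be directly encodable. A rough sketch using Python's built-in codecs; the helper name `fits_in_cdata` and the sample string are assumptions made for illustration.

```python
def fits_in_cdata(text, encoding):
    """True if every character of text can be written directly in the encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

sample = "\u65e5\u672c\u8a9e"  # three kanji from the JIS X 0208 repertoire

print(fits_in_cdata(sample, "ascii"))      # the same content is not allowed here...
print(fits_in_cdata(sample, "shift_jis"))  # ...but is allowed here
```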

So this means that transcoding arbitrary XML documents *requires*
parsing them, because if you are reducing the repertoire, you may need
to break up CDATA sections, and you cannot (?) recognize a
CDATA section reliably without parsing.  (In particular, what
looks like a CDATA section start/end could appear inside an attribute
value, PI data, or a comment.)  An interesting side effect!
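
One way the breaking-up step might look, as a sketch only: it assumes a parser has already isolated the content of a single CDATA section, and it swaps each unencodable character for a numeric character reference placed between two shorter CDATA sections. The function name `recode_cdata` is hypothetical.

```python
def recode_cdata(content, encoding):
    """Rewrite the content of one CDATA section as markup for a target encoding.

    Runs of encodable characters stay inside CDATA sections; each character
    the encoding cannot represent is emitted as a numeric character
    reference outside them. Empty content yields an empty string.
    """
    pieces = []
    run = []

    def flush():
        # Close the current run of encodable characters as a CDATA section.
        if run:
            pieces.append("<![CDATA[" + "".join(run) + "]]>")
            run.clear()

    for ch in content:
        try:
            ch.encode(encoding)
            run.append(ch)
        except UnicodeEncodeError:
            flush()
            pieces.append("&#x%X;" % ord(ch))
    flush()
    return "".join(pieces)

print(recode_cdata("a<b \u65e5 c", "ascii"))
# → <![CDATA[a<b ]]>&#x65E5;<![CDATA[ c]]>
```

A parser reading the rewritten markup sees the same character data as before; only the section boundaries have moved.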

John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


