xml-dev - Re: PCDATA vs CDATA

From: Richard Tobin <richard@cogsci.ed.ac.uk>
To: "Tom Otvos" <tomo@everyware.com>, "XML Dev" <xml-dev@ic.ac.uk>
Date: Tue, 30 Jun 1998 23:35:26 +0100

> Hmm, is that the only case where an XML parser might do the "wrong thing" if
> it came across a document without a supporting DTD?  

Yes.  There are some things an XML parser can't do without a DTD:

- validating (obviously)
- determining which whitespace is ignorable
- normalising attribute values by declared type and inserting default values
- expanding entity references (other than character references and
  the five predefined entities)

but despite those constraints it can parse the document and determine
whether it is well-formed.
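
As a rough illustration of that split, here is a minimal sketch using
Python's xml.etree.ElementTree, a non-validating parser chosen only for
convenience. The helper name is_well_formed and the sample documents are
made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses; only an internal DTD subset, if any, is used."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# No DTD at all: well-formedness can still be decided.
no_dtd_ok = is_well_formed("<a><b>text</b></a>")
mismatched = is_well_formed("<a><b>text</a></b>")

# An entity reference cannot be expanded without its declaration...
undeclared_entity = is_well_formed("<a>&foo;</a>")

# ...but it can once a declaration is available (here, an internal subset).
declared = ET.fromstring('<!DOCTYPE a [<!ENTITY foo "bar">]><a>&foo;</a>').text

print(no_dtd_ok, mismatched, undeclared_entity, declared)
```

The first document is accepted and the second rejected with no DTD in
sight; the entity case only succeeds when the declaration is supplied.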

> It seems to me that if
> a document comes through without a DTD, and an element contained data not
> explicitly escaped, then it would not be unreasonable to assume PCDATA and
> try to parse it.  However, if a DTD is there to provide more info, then use
> it.  I am not sure I see how it is significantly different than validating
> that an element may, or may not, be a child of another element.

If the parser doesn't know that the content of an element is CDATA it
will very likely parse a correct document wrongly.  This is not the
case if it just doesn't know what children are allowed.

For example, if c were declared CDATA and the parser didn't have the
DTD, it would take the bare < as the start of markup and report a
syntax error for

  <c><</c>
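
A small sketch of what a DTD-less parser does with an unescaped <
inside element content, again using Python's xml.etree.ElementTree as a
stand-in for any non-validating XML parser. The helper try_parse is
hypothetical, and the two escaped forms are the ones XML itself offers.

```python
import xml.etree.ElementTree as ET

def try_parse(text):
    """Return the root element's text, or None if the input is not well-formed."""
    try:
        return ET.fromstring(text).text
    except ET.ParseError:
        return None

# A bare "<" in content is read as the start of a tag and rejected.
bare = try_parse("<c><</c>")

# Escaping in the instance itself needs no DTD knowledge.
escaped = try_parse("<c>&lt;</c>")
cdata_section = try_parse("<c><![CDATA[<]]></c>")

print(bare, escaped, cdata_section)
```

Both escaped forms carry the marker in the document rather than in the
DTD, which is why a parser without the DTD can still read them.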

Various other features of SGML have been omitted for the same reason,
in particular start- and end-tag omission.  Similarly a new syntax has
been created for empty elements, because without the DTD a parser
can't tell that an element must be empty.
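
The same point can be sketched with Python's xml.etree.ElementTree,
used here only as an example of a parser working without a DTD. The
element names and the helper parses are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# SGML-style empty element: without the DTD, <br> looks like an unclosed element.
sgml_style_empty = parses("<p>line one<br>line two</p>")

# XML's empty-element syntax marks emptiness in the instance itself.
xml_style_empty = parses("<p>line one<br/>line two</p>")

# End-tag omission likewise depends on content models the parser does not have.
omitted_end_tags = parses("<ul><li>one<li>two</ul>")

print(sgml_style_empty, xml_style_empty, omitted_end_tags)
```

Only the form that says "empty" in its own syntax is accepted.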

-- Richard
