xml-dev - Re: Mixed content considered harmful...

From: John Cowan <cowan@locke.ccil.org>
To: XML Dev <xml-dev@ic.ac.uk>
Date: Tue, 11 May 1999 14:36:56 -0400

Paul Prescod wrote:

> #PCDATA is just a data type that is unconstrained. You should be able to
> mix data type refs, #PCDATA and element type refs in content models with
> impunity (barring real parsing ambiguity). Using old syntax:
> 
> <!ELEMENT SECTION (#PCDATA, P+)>
> <!ELEMENT FIG (#PCDATA|IMG)>
> <!ELEMENT HTML (TITLE,(#PCDATA|P)+)>

I need to be convinced of this.

Can you sketch an algorithm that will convert SGML-style (or &-less
SGML-style) content models involving #PCDATA into content models
involving #PCDATA and #WS, where #WS is a data type that matches
only white space, such that random white space around tags will be properly
accounted for?
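
To make the difficulty concrete, here is a minimal sketch in Python. It is
only an illustration of the problem, not an answer to the question above.
Every name in it is hypothetical: child nodes are reduced to one-letter
tokens (T for text containing non-whitespace, W for whitespace-only text,
P for a P element), and a content model is written as a regular expression
over those tokens. It assumes #PCDATA may match nothing, and the "padded"
model is just one naive hand rewrite of (#PCDATA, P+) with an optional #WS
around each P.

```python
import re

XML_WHITESPACE = " \t\r\n"  # the four whitespace characters XML recognises


def tokens(children):
    """Reduce (kind, value) child pairs to a string of one-letter tokens."""
    out = []
    for kind, value in children:
        if kind == "element":
            out.append(value)  # element names are single letters here
        elif value.strip(XML_WHITESPACE) == "":
            out.append("W")    # whitespace-only text
        else:
            out.append("T")    # text with real character data
    return "".join(out)


# Children of: <SECTION>intro\n<P>a</P>\n<P>b</P>\n</SECTION>
section_children = [
    ("text", "intro\n"),
    ("element", "P"),
    ("text", "\n"),
    ("element", "P"),
    ("text", "\n"),
]

# (#PCDATA, P+) read literally: optional text, then one or more P.
LITERAL = re.compile(r"[TW]?P+")

# Naive hand rewrite: allow an optional #WS on either side of each P.
PADDED = re.compile(r"[TW]?(W?PW?)+")

sequence = tokens(section_children)
literal_ok = LITERAL.fullmatch(sequence) is not None
padded_ok = PADDED.fullmatch(sequence) is not None
print(sequence, literal_ok, padded_ok)
```

The literal model rejects the pretty-printed document because of the line
breaks between the P elements, while the padded one accepts it. Doing that
rewrite mechanically for arbitrary content models is the open part.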

Remember that XML parsers are required to pass along all whitespace
(i.e. it all appears in the infoset, barring whitespace outside the
root element), so it needs to be accounted for somehow, both when it is
#PCDATA and when it isn't.
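
A small sketch of what "pass along all whitespace" looks like in practice,
using Python's standard xml.dom.minidom as a stand-in for any conforming
parser. The sample document and the helper name child_summary are made up
for illustration.

```python
from xml.dom import minidom

# A pretty-printed document: the only text directly inside SECTION
# is the indentation and line breaks around the P elements.
doc = minidom.parseString("<SECTION>\n  <P>one</P>\n  <P>two</P>\n</SECTION>")


def child_summary(element):
    """Return (kind, value) pairs for the direct children of element."""
    summary = []
    for node in element.childNodes:
        if node.nodeType == node.TEXT_NODE:
            summary.append(("text", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            summary.append(("element", node.tagName))
    return summary


children = child_summary(doc.documentElement)
for kind, value in children:
    print(kind, repr(value))
```

The whitespace between the tags shows up as three text children of SECTION,
alongside the two P elements, so a content model has to say something about
it.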

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
