- From: Lars Marius Garshol <firstname.lastname@example.org>
- To: "XML Developers' List" <email@example.com>
- Date: 25 Mar 1999 11:01:43 +0100
* David Megginson
| public abstract void startCDATA ()
| throws SAXException;
| public abstract void endCDATA ()
| throws SAXException;
This implies that the parser reports the contents of CDATA sections as
separate DocumentHandler.characters events, which is of course the
most natural way to implement things anyway.
However, the 1999-03-12 list of core features contains this:
Ensure that all consecutive text is returned in a single callback to
DocumentHandler.characters or DocumentHandler.ignorableWhitespace
(true) or explicitly do not require it (false).
This is potentially problematic, since it's unspecified what the
parser should do about CDATA sections in this case. (I suspect we will
see more problems of this kind when we start really using and
stacking filters.) Should they be normalized, or should they be
reported separately? (I.e., what is consecutive text, exactly?) The same
problem appears with entity boundaries and character references.
I assume most users of normalize-text will want consecutive text to be
interpreted in the logical view of the document, rather than the
lexical view. Otherwise the DocumentHandler will receive different
events in these two cases:
A problematic case.
A <![CDATA[problematic]]> case.
which is rather fragile, and this behaviour should be avoided, IMHO.
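To make the difference concrete, here is a minimal sketch of a text coalescer that can work in either view. It does not use the SAX interfaces: the Event class, the Kind enum and the normalize method are hypothetical names invented for this illustration, and the event stream is built by hand rather than produced by a parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a parser event stream; none of these names are SAX.
class NormalizeTextSketch {
    enum Kind { CHARS, START_CDATA, END_CDATA }

    // One event; text is only meaningful for CHARS.
    static final class Event {
        final Kind kind;
        final String text;
        Event(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    static Event chars(String s) { return new Event(Kind.CHARS, s); }
    static Event startCDATA() { return new Event(Kind.START_CDATA, null); }
    static Event endCDATA() { return new Event(Kind.END_CDATA, null); }

    // Merge adjacent CHARS events into single callbacks.
    // lexical == true: CDATA boundaries are reported and end the current run.
    // lexical == false: boundaries are dropped, so text on both sides merges.
    static List<String> normalize(boolean lexical, Event... in) {
        List<String> out = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (Event e : in) {
            if (e.kind == Kind.CHARS) {
                run.append(e.text);
            } else if (lexical) {
                flush(run, out);
                out.add(e.kind.toString());
            }
        }
        flush(run, out);
        return out;
    }

    // Emit the buffered run, if any, as one characters callback.
    private static void flush(StringBuilder run, List<String> out) {
        if (run.length() > 0) {
            out.add("chars(" + run + ")");
            run.setLength(0);
        }
    }

    public static void main(String[] args) {
        Event[] plain = { chars("A problematic case.") };
        Event[] cdata = { chars("A "), startCDATA(), chars("problematic"),
                          endCDATA(), chars(" case.") };
        System.out.println(normalize(false, plain));
        System.out.println(normalize(false, cdata));
        System.out.println(normalize(true, cdata));
    }
}
```

In the logical view both inputs collapse to the same single callback; in the lexical view the CDATA variant is split into three text runs with the boundary events between them, which is the fragility described above.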
So basically the problem is that normalize-text and LexicalHandler
don't go well together. You can have one, but not both at the same
time, unless the driver changes its behaviour. In other words, this
seems to require the driver to have explicit knowledge about both, and
to do one of the following:
- reject normalize-text true if a LexicalHandler has been registered,
and reject LexicalHandler registration if normalize-text has been set
- make normalize-text have a logical interpretation by default, and
switch to lexical if a LexicalHandler has been registered
- make normalize-text always have a lexical interpretation
- have separate normalize-text-logical and normalize-text-lexical
features, with reject-behaviour for the first
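
As an illustration of the first option only, the mutual rejection could be sketched roughly as below. DriverConfigSketch and its two setters are hypothetical names, a plain Object stands in for the LexicalHandler, and IllegalStateException is an assumption; a real driver would presumably report the conflict through whatever exception the feature mechanism defines.

```java
// Hypothetical driver configuration; not part of SAX.
class DriverConfigSketch {
    private boolean normalizeText;
    private Object lexicalHandler; // stand-in for a LexicalHandler

    // Reject normalize-text=true once a lexical handler is registered.
    void setNormalizeText(boolean on) {
        if (on && lexicalHandler != null) {
            throw new IllegalStateException(
                "normalize-text conflicts with a registered LexicalHandler");
        }
        normalizeText = on;
    }

    // Reject handler registration once normalize-text has been set.
    void setLexicalHandler(Object handler) {
        if (handler != null && normalizeText) {
            throw new IllegalStateException(
                "LexicalHandler conflicts with normalize-text");
        }
        lexicalHandler = handler;
    }

    public static void main(String[] args) {
        DriverConfigSketch driver = new DriverConfigSketch();
        driver.setNormalizeText(true);
        try {
            driver.setLexicalHandler(new Object());
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the check has to live in the driver itself, since neither setting can detect the conflict without knowing about the other.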