The more I look at the text in section 2.13 of XML 1.1, the more
confused I get. There are two things that bother me: why require that
XML documents be normalized, and does the specification require
processors to pass normalized character data to the application?
Let's start with the last one:
- clearly, documents that are not normalized are still well-formed,
so if the application is to have any guarantees here the processor
must do normalization before passing on the information,
- the text says that "XML processors must not transform the input to
be in fully normalized form." This seems to say that processors are
not allowed to do the transformation.
Apparently, the application gets the character data as it was in the
document, and is then informed that "document was normalized" or
"document was not normalized".
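
To make that concrete, here is a minimal sketch of what the application
ends up doing for itself. It assumes Python's standard
xml.etree.ElementTree as a stand-in parser (it is an XML 1.0 parser, not
an XML 1.1 processor) and uses plain NFC as an approximation of the
"fully normalized" check in the spec. The read_text helper and the
sample document are made up for illustration.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical sample: "é" written as "e" + U+0301 COMBINING ACUTE ACCENT,
# which is well-formed XML but not in NFC form.
DOC = "<name>Cafe\u0301</name>"


def read_text(xml_source):
    """Return (raw_text, was_nfc, nfc_text) for the root element's text.

    The parser hands over the character data unchanged; checking and
    normalizing are left to the application.
    """
    raw_text = ET.fromstring(xml_source).text
    nfc_text = unicodedata.normalize("NFC", raw_text)
    was_nfc = (nfc_text == raw_text)
    return raw_text, was_nfc, nfc_text


raw, was_nfc, normalized = read_text(DOC)
print(len(raw), was_nfc, len(normalized))
```

The parser passes the decomposed sequence through untouched, so any
comparison against the precomposed form only works after the
application has normalized the data on its own.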
This brings me to the first question: why should the application have
to care? Wouldn't it be far better if the application could be certain
that an XML 1.1 processor would provide normalized character data, and
could ignore the whole issue of how the document was encoded? After all,
isn't the whole purpose of *having* XML parsers to insulate
applications from worries about the lexical details of documents?
In other words, why not rewrite this so that processors are required
to normalize character data? Then the whole issue of whether or not
documents were normalized just disappears, which means that XML 1.0
and 1.1 documents will appear the same to applications with regard to
normalization.
Or did I just completely misunderstand everything?
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >