OASIS Mailing List Archives


   Re: [xml-dev] whitespace in 1.1

From: "Richard Tobin" <richard@cogsci.ed.ac.uk>

> The original 1.0 text said:
>   To simplify the tasks of applications, wherever an external parsed
>   entity or the literal entity value of an internal parsed entity
>   contains either the literal two-character sequence "#xD#xA" or a
>   standalone literal #xD, an XML processor must pass to the
>   application the single character #xA. (This behavior can
>   conveniently be produced by normalizing all line breaks to #xA on
>   input, before parsing.)
> Unfortunately, it turns out that normalizing on input is *not*
> equivalent to the conversion described, because of the possibility of
> using character references in entities.  So the spec was
> contradictory.  Furthermore, the normalization-on-input version is the
> one that most processors use.  In attempting to be declarative rather
> than procedural, the text ended up inconsistent.

I think the trouble here is the phrase "on input, before parsing".
It should be clarified whether this means when the DTD is parsed
or when an entity is parsed. If it means the latter, what happens to the
ordinary text of the document entity? Or does that count as an "external"
parsed entity?
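
To make the quoted point about character references concrete, here is a
rough sketch in Python. It is a toy model only, not a real XML processor:
the function names are made up for illustration, only hexadecimal
character references are handled, and "normalizing the text handed to the
application" is just one possible reading of the conversion the spec
describes.

```python
import re

# Toy model: only hexadecimal character references such as &#xD; are handled.
_CHAR_REF = re.compile(r"&#x([0-9A-Fa-f]+);")


def normalize_line_breaks(text: str) -> str:
    """Turn the pair #xD#xA, and any standalone #xD, into a single #xA."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def expand_char_refs(text: str) -> str:
    """Replace each hexadecimal character reference with its character."""
    return _CHAR_REF.sub(lambda m: chr(int(m.group(1), 16)), text)


def normalize_on_input(raw: str) -> str:
    """Hypothetical strategy A: normalize the raw text, then expand references."""
    return expand_char_refs(normalize_line_breaks(raw))


def normalize_on_output(raw: str) -> str:
    """Hypothetical strategy B: expand references, then normalize the result."""
    return normalize_line_breaks(expand_char_refs(raw))


# A carriage return written as a character reference, followed by a literal
# line feed. No literal #xD is present in the raw text.
sample = "a&#xD;\nb"

if __name__ == "__main__":
    print(repr(normalize_on_input(sample)))
    print(repr(normalize_on_output(sample)))
```

With no literal #xD in the raw text, strategy A leaves the referenced
carriage return in place, while strategy B folds it into the following
line feed. The two agree whenever the line breaks are all literal.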

The original decision on LF seems to be at
and Liam Quinn immediately and explicitly raised the subject of the significance
of character references at

Rick Jelliffe



Copyright 2001 XML.org. This site is hosted by OASIS