From: "Richard Tobin" <firstname.lastname@example.org>
> The original 1.0 text said:
> To simplify the tasks of applications, wherever an external parsed
> entity or the literal entity value of an internal parsed entity
> contains either the literal two-character sequence "#xD#xA" or a
> standalone literal #xD, an XML processor must pass to the
> application the single character #xA. (This behavior can
> conveniently be produced by normalizing all line breaks to #xA on
> input, before parsing.)
> Unfortunately, it turns out that normalizing on input is *not*
> equivalent to the conversion described, because of the possibility of
> using character references in entities. So the spec was
> contradictory. Furthermore the normalization-on-input version is the
> one that most processors use. In attempting to be declarative rather
> than procedural, the spec ended up inconsistent.
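To make the quoted point concrete, here is a toy sketch of one way the two orders can diverge. It is not an XML processor: the helper names and the sample string are made up for illustration, and which order corresponds to which reading of the spec is an assumption on my part. It only shows that line-break normalization and character-reference expansion do not commute.

```python
import re

def normalize_line_breaks(text):
    # Turn CR LF pairs and lone CRs into a single LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")

def expand_char_refs(text):
    # Expand decimal (&#13;) and hexadecimal (&#xD;) character references.
    def repl(match):
        body = match.group(1)
        code = int(body[1:], 16) if body[0] == "x" else int(body)
        return chr(code)
    return re.sub(r"&#(x[0-9A-Fa-f]+|[0-9]+);", repl, text)

# Hypothetical input: a character reference to CR followed by a literal LF.
raw = "a&#13;\nb"

# Normalize on input, then expand references: there is no literal CR to
# normalize, so the CR produced by the reference survives.
normalize_first = expand_char_refs(normalize_line_breaks(raw))

# Expand references first, then apply the CR LF -> LF conversion.
expand_first = normalize_line_breaks(expand_char_refs(raw))

print(repr(normalize_first))  # 'a\r\nb'
print(repr(expand_first))     # 'a\nb'
```

The two results differ only when a reference produces a #xD, which is why the parenthetical about normalizing "on input, before parsing" matters.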
I think the trouble here is the phrase "on input, before parsing".
It should be clarified whether this means on DTD parsing
or on entity parsing. If it means the latter, what happens to the
normal text of the document entity, or is that counted as an "external"
The original decision on LF seems to be at
and Liam Quinn immediately and explicitly raised the subject of the significance
of character references at