> Interesting idea and a neat hack. If I'm reading this right, though,
> it would require writing < in XML as &< and so forth for other
> genuine entity and character references.
Actually it says:
In UTF-8+names, the sequence consisting of an "&", a character string,
and a ";" is called a "replacement". The characters contained between
the "&" and the ";" are called the "replacement name" and the Unicode
character sequence which is represented is called the "replacement
value."
and then says:
For replacements whose names are not given a replacement value by this
specification, the replacement value is identical to the replacement
name. For example, the replacement "&U2;" represents the Unicode
character sequence of length 4 containing the characters U+0026
AMPERSAND, U+0055 LATIN CAPITAL LETTER U, U+0032 DIGIT TWO, and U+003B
SEMICOLON.
The two sentences here are in conflict. The rule tells you that the
replacement value for < is "LT", while the example suggests it is
"<".
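To make the conflict concrete, here is a minimal Python sketch of the two readings, applied to the "&U2;" replacement from the quoted example. The KNOWN table, the decode function, and the pattern used to spot a replacement are all assumptions made for illustration; none of them comes from the specification.

```python
import re

# Hypothetical stand-in for the names the specification gives a value to.
KNOWN = {"lt": "<", "amp": "&"}

# Assumption: a replacement is "&", a run of characters other than "&" and ";", then ";".
REPLACEMENT = re.compile(r"&([^&;]*);")

def decode(text, literal_reading):
    """Expand replacements; literal_reading picks the interpretation for unknown names."""
    def value(match):
        name = match.group(1)
        if name in KNOWN:
            return KNOWN[name]
        # Reading by the example: the whole "&name;" sequence is kept.
        # Reading by the rule: the value is just the name.
        return match.group(0) if literal_reading else name
    return REPLACEMENT.sub(value, text)

by_rule = decode("a &U2; b", literal_reading=False)     # "a U2 b"
by_example = decode("a &U2; b", literal_reading=True)   # "a &U2; b"
```

Under the rule as worded, the ampersand and semicolon disappear; under the example, the text passes through unchanged. Only the second reading matches the four-character sequence the specification describes.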
(Another observation on this rule: it means that the set of names that
is recognized is frozen for all time; it can never be extended.)
I think you would have to write < as &&;lt; If you believe the
example rather than the rule above is correct, you could also write it
as &<; or as <
Either way, the thousands of poor users who are already badly confused
about entity references are going to become even more confused.
Michael Kay