   RE: [xml-dev] UTF-8+names

> Interesting idea and a neat hack. If I'm reading this right, though, 
> it would require writing &lt; in XML as &&lt; and so forth for other 
> genuine entity and character references. 

Actually it says:

In UTF-8+names, the sequence consisting of an "&", a character string,
and a ";" is called a "replacement". The characters contained between
the "&" and the ";" are called the "replacement name" and the Unicode
character sequence which is represented is called the "replacement
value."

and then says:

For replacements whose names are not given a replacement value by this
specification, the replacement value is identical to the replacement
name. For example, the replacement "&U2;" represents the Unicode
character sequence of length 4 containing the characters U+0026
AMPERSAND, U+0055 LATIN CAPITAL LETTER U, U+0032 DIGIT TWO, and U+003B
SEMICOLON.

The two sentences here are in conflict. The rule tells you that the
replacement value for &lt; is "LT", while the example suggests it is
"&lt;".
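To make the conflict concrete, here is a minimal sketch of a decoder
that can follow either reading. It is not taken from the UTF-8+names
draft: the function name decode, the regular expression, and the
one-entry table KNOWN (with the made-up name "eacute") are assumptions
for illustration only.

```python
import re

# Hypothetical table of names that the specification gives a value.
# The single entry is made up for illustration; it is not from the draft.
KNOWN = {"eacute": "\u00e9"}

# A replacement: "&", a character string, ";". Assumption: the string is
# non-empty and contains neither "&" nor ";".
REPLACEMENT = re.compile(r"&([^&;]+);")

def decode(text, follow_rule):
    """Expand replacements in text.

    follow_rule=True  -> unknown names expand to the name itself (the rule).
    follow_rule=False -> unknown names are left unchanged (the example).
    """
    def expand(match):
        name = match.group(1)
        if name in KNOWN:
            return KNOWN[name]
        return name if follow_rule else match.group(0)
    return REPLACEMENT.sub(expand, text)

print(decode("a &U2; b", follow_rule=True))   # rule: a U2 b
print(decode("a &U2; b", follow_rule=False))  # example: a &U2; b
```

Under the rule the ampersand and semicolon disappear from the output;
under the example the text passes through untouched. The two readings
cannot both be right.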

(Another observation on this rule: it means that the set of names that
is recognized is frozen for all time; it can never be extended.)
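A short sketch of why extension is a problem, under stated
assumptions: the tables V1 and V2 are hypothetical editions of the
name list, the addition of "lt" is invented for the illustration, and
the decoder follows the example's reading (unknown replacements pass
through unchanged). Text that already has a defined meaning under the
first table silently changes meaning under the second.

```python
import re

# A replacement: "&", a non-empty string without "&" or ";", then ";".
REPLACEMENT = re.compile(r"&([^&;]+);")

def decode(text, table):
    # Unknown names pass through unchanged (the example's reading).
    return REPLACEMENT.sub(lambda m: table.get(m.group(1), m.group(0)), text)

V1 = {}            # hypothetical first edition: "lt" has no assigned value
V2 = {"lt": "<"}   # hypothetical later edition that adds "lt"

doc = "x &lt; y"
print(decode(doc, V1))  # x &lt; y
print(decode(doc, V2))  # x < y
```

The same argument holds under the rule's reading; only the first
output differs.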

I think you would have to write &lt; as &&;lt;. If you believe the
example rather than the rule above is correct, you could also write it
as &&lt;; or as &lt;.

Either way, the thousands of poor users who are already badly confused
about entity references are going to become even more confused.

Michael Kay





 


Copyright 2001 XML.org. This site is hosted by OASIS