xml-dev - RE: [xml-dev] UTF-8+names

To: "'Elliotte Rusty Harold'" <elharo@metalab.unc.edu>,"'Tim Bray'" <tbray@textuality.com>,"'XML Dev'" <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] UTF-8+names
From: "Michael Kay" <michael.h.kay@ntlworld.com>
Date: Sun, 19 Oct 2003 08:20:47 +0100
Importance: Normal
In-reply-to: <p06002001bbb7431b0313@[192.168.254.4]>
Reply-to: <michael.h.kay@ntlworld.com>

> Interesting idea and a neat hack. If I'm reading this right, though, 
> it would require writing &lt; in XML as &&lt; and so forth for other 
> genuine entity and character references. 

Actually it says:

In UTF-8+names, the sequence consisting of an "&", a character string,
and a ";" is called a "replacement". The characters contained between
the "&" and the ";" are called the "replacement name" and the Unicode
character sequence which is represented is called the "replacement
value."

and then says:

For replacements whose names are not given a replacement value by this
specification, the replacement value is identical to the replacement
name. For example, the replacement "&U2;" represents the Unicode
character sequence of length 4 containing the characters U+0026
AMPERSAND, U+0055 LATIN CAPITAL LETTER U, U+0032 DIGIT TWO, and U+003B
SEMICOLON.

The rule and its example are in conflict. The rule tells you that the
replacement value for &LT; is "LT", while the example suggests it is
"&LT;".

(Another observation on this rule: it means that the set of recognized
names is frozen for all time; it can never be extended.)

I think you would have to write &lt; as &&;lt;. If you believe the
example rather than the rule above is correct, you could also write it
as &&#x3c;; or as &#x3C;

Either way, the thousands of poor users who are already badly confused
about entity references are going to become even more confused.

Michael Kay
