Re: [xml-dev] [Summary] UTF-8 Question: e with acute accent should require two bytes, right?

From: richard@inf.ed.ac.uk (Richard Tobin)
To: xml-dev@lists.xml.org
Date: Fri, 28 Sep 2007 21:24:48 +0100 (BST)

In article <003e01c801fc$9df86ff0$8901a8c0@aldebaran> you write:

>It is not correct to say that a Unicode character can be either an "ASCII
>character" or a "non-ASCII character".  It is better to say that some
>Unicode characters (those with codes below 128) have a corresponding
>character in ASCII.

On what do you base this assertion?  Why do you think the ASCII
characters are not the same characters that appear in Unicode?  Surely
ASCII and Unicode are sets and encodings of pre-existing characters.
Would you say that 2007 and MMVII are different numbers?

>For example, your sentence "Most of its characters are ASCII, but there is
>one non-ASCII character, the é character" is not correct.  A more precise
>statement would be, "Most of its characters have a corresponding character
>in the ASCII character set, but there is one character, the é character
>(officially named "LATIN SMALL LETTER E WITH ACUTE", code 233), which has no
>corresponding character in ASCII."

That seems like needless obscurity.
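
For anyone who wants to check the underlying facts from the thread, here is a minimal Python sketch. It is only an illustration: the helper name `describe` is made up for this example, and the "is ASCII" test simply assumes the "code below 128" criterion from the quoted text.

```python
import unicodedata

def describe(ch):
    """Return (code point, official name, UTF-8 bytes, code-below-128 flag)."""
    code = ord(ch)
    # The flag uses the "codes below 128" criterion from the quoted text.
    return code, unicodedata.name(ch), ch.encode("utf-8"), code < 128

e_plain = describe("e")        # plain letter, code below 128
e_acute = describe("\u00e9")   # e with acute accent, code 233

print(e_plain)
print(e_acute)
```

Under these assumptions the plain "e" comes out as a single UTF-8 byte, identical to its ASCII byte, while the accented character (code 233) comes out as two bytes, which is the point of the subject line.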

-- Richard
-- 
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Follow-Ups:
- RE: [xml-dev] [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Alessandro Triglia" <sandro@mclink.it>

References:
- RE: [xml-dev] [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Alessandro Triglia" <sandro@mclink.it>
