Re: [xml-dev] [Summary] UTF-8 Question: e with acute accent should require two bytes, right?

From: richard@inf.ed.ac.uk (Richard Tobin)
To: xml-dev@lists.xml.org
Date: Sun, 30 Sep 2007 00:00:04 +0100 (BST)

In article <004101c802c8$34fd7640$654d410a@aldebaran> you write:

>I don't see what "UTF-8 character" could mean other than a "(Unicode)
>character encoded in UTF-8".

It can also mean a character in the UTF-8 repertoire.  However, the
UTF-8 repertoire is the same as that of Unicode, so it may well be
that people will usually use "Unicode character" for that meaning and
reserve "UTF-8 character" for "character encoded in UTF-8".

That doesn't apply to ASCII - it's not just an encoding of some other
character set - and I would guess that "ASCII character" more often
means one in the ASCII repertoire.
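
To make the two readings concrete, here is a small Python sketch (the
helper names utf8_bytes and in_ascii_repertoire are made up for
illustration, not taken from any library or from this thread).  It
separates "how is this character encoded in UTF-8" from "is this
character in the ASCII repertoire", using e with acute accent (U+00E9)
from the subject line.

```python
# Illustrative sketch only: both helper names are hypothetical.

def utf8_bytes(ch: str) -> bytes:
    """Return the bytes of a character when encoded in UTF-8."""
    return ch.encode("utf-8")


def in_ascii_repertoire(ch: str) -> bool:
    """Return True if the character is one of the 128 ASCII characters."""
    return ord(ch) < 128


e_acute = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Encoding question: U+00E9 takes two bytes in UTF-8 (0xC3 0xA9).
print(utf8_bytes(e_acute))
print(len(utf8_bytes(e_acute)))

# Repertoire question: U+00E9 is a Unicode character but not an ASCII one.
print(in_ascii_repertoire(e_acute))

# For characters that are in the ASCII repertoire, the UTF-8 bytes are
# identical to the ASCII bytes.
print(utf8_bytes("e") == "e".encode("ascii"))
```

The first two prints answer the encoding question and the third
answers the repertoire question; the last shows why the distinction
rarely matters for plain ASCII text.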

There is no universally agreed precise terminology for these
things.  But using cumbersome phrases to try to achieve precision in
a document for beginners (which is what I take it we were discussing)
is likely to be confusing instead.

-- Richard
-- 
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

References:
- RE: [xml-dev] [Summary] UTF-8 Question: e with acute accent should require two bytes, right?
  - From: "Alessandro Triglia" <sandro@mclink.it>