Re: [xml-dev] UTF-8 Question: e with acute accent should require two bytes, right?
- From: Philippe Poulard <philippe.poulard@sophia.inria.fr>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Fri, 28 Sep 2007 17:24:53 +0200
Costello, Roger L. wrote:
> Hi Folks,
>
> Consider this element:
>
> <title>My Resumé</title>
>
> Notice: é (the character "e" with an acute accent). It is U+00E9.
>
> Since its code point is U+0080 or above, it requires more than one
> byte.
>
> Hex E9 = Decimal 233. This has the binary: 11101001
>
> I believe that it is encoded in UTF-8 as two bytes:
>
> 11000011 10101001
>
> These bytes correspond to hex C3 and hex A9.
>
> Thus, é should be encoded in UTF-8 as:
>
> C3A9
>
> The code points of the other characters (My Resum) are all less than
> U+0080, and so the UTF-8 encoding of each of those characters should be
> only one byte.
>
> So, this is what I believe should be the bytes:
>
> M y    R  e s  u m  é
> 4D79 2052 6573 756D C3A9
>
> Do you agree?
If I trust this converter...
http://people.w3.org/rishida/scripts/uniview/conversion.php
...yes!
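
The arithmetic above can also be checked by hand. What follows is only a
minimal Python sketch, not something from the original thread; the function
name utf8_encode_two_byte is made up for illustration. It packs a code point
in the range U+0080..U+07FF into the two-byte pattern 110xxxxx 10xxxxxx and
compares the result with Python's built-in UTF-8 encoder.

```python
def utf8_encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes.

    Hypothetical helper for illustration only; real code should simply
    call str.encode("utf-8").
    """
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte UTF-8 covers U+0080..U+07FF only")
    lead = 0b11000000 | (code_point >> 6)           # 110xxxxx: upper 5 bits
    trail = 0b10000000 | (code_point & 0b00111111)  # 10xxxxxx: lower 6 bits
    return bytes([lead, trail])


# U+00E9 is written as an escape so the example does not depend on how
# this source file itself is saved.
manual = utf8_encode_two_byte(0xE9)
builtin = "My Resum\u00e9".encode("utf-8")

print(manual.hex().upper())   # C3A9
print(builtin.hex().upper())  # 4D7920526573756DC3A9
```

Both results agree with the bytes worked out in the question: 0xE9 shifted
right by six bits leaves binary 11, giving the lead byte C3, and the low six
bits 101001 give the trail byte A9.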
>
> However, when I view the bytes in my hex editor I get this:
>
> M y    R  e s  u m  é
> 4D79 2052 6573 756D E9
>
> Notice that é uses only one byte.
>
> Something is wrong. Here's what I think may be wrong:
> - the editor that I am using to display the hex values is displaying
> the code points and not the hex values. However, I have now tried two
> editors, and they both display the same thing (E9). So perhaps the
> editor isn't the problem. Perhaps I'm the problem, and am
> misunderstanding something. Help!
Either E9 stands for the code point U+00E9, or your editor doesn't save in
UTF-8 but rather in iso-8859-x or windows-cpxxx or some other encoding that
encodes é as the single byte E9.
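
One way to tell the two cases apart is to look at the raw bytes outside the
editor. The following is a rough Python sketch under a few assumptions:
iso-8859-1 and cp1252 are picked only as examples of single-byte encodings,
and looks_like_utf8 is a hypothetical helper, not a library function.

```python
text = "My Resum\u00e9"  # the element content from the question

# Encode the same text three ways and show the resulting bytes in hex.
for encoding in ("utf-8", "iso-8859-1", "cp1252"):
    print(encoding, text.encode(encoding).hex().upper())


def looks_like_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


as_seen_in_hex_editor = bytes.fromhex("4D7920526573756DE9")
as_expected_for_utf8 = bytes.fromhex("4D7920526573756DC3A9")

print(looks_like_utf8(as_seen_in_hex_editor))  # False
print(looks_like_utf8(as_expected_for_utf8))   # True
```

A lone E9 byte is not valid UTF-8, because E9 announces a three-byte sequence
and no continuation bytes follow. So if E9 really is the byte on disk, the
file was saved in a single-byte encoding. In a real check you would read the
file with open(path, "rb") instead of using the hard-coded byte strings.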
--
Regards,
///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !