xml-dev - Re: [xml-dev] Use of UTF-8 and UTF-16

To: xml-dev@lists.xml.org
Subject: Re: [xml-dev] Use of UTF-8 and UTF-16
From: richard@inf.ed.ac.uk (Richard Tobin)
Date: Wed, 2 Nov 2005 16:34:34 +0000 (GMT)
Cc:
In-reply-to: <4368C783.3070008@sophia.inria.fr>
Organization: HCRC, University of Edinburgh

In article <4368C783.3070008@sophia.inria.fr> you write:
>UTF-8 uses 6 bytes for ISO/IEC 10646
>UTF-8 uses 4 bytes for Unicode

UTF-8 as originally defined needs up to 6 bytes to represent code
points up to 2^31-1, but the Unicode codespace only goes up to
U+10FFFF, so at most 4 bytes are needed for Unicode characters.
U+10FFFF is (presumably not coincidentally) the limit of what UTF-16
can represent using surrogates.
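
The arithmetic behind both limits can be checked with a short sketch.
This is only an illustration: the function names utf8_length and
max_surrogate_code_point are hypothetical, and the byte-length table
assumes the original 31-bit UTF-8 layout, not the 4-byte form now
restricted to the Unicode codespace.

```python
def utf8_length(code_point: int) -> int:
    """Bytes needed under the original 31-bit UTF-8 layout (1 to 6 bytes)."""
    if code_point < 0 or code_point > 0x7FFFFFFF:
        raise ValueError("code point outside the 31-bit range")
    # Largest code point representable in 1, 2, 3, 4, 5 and 6 bytes.
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for length, limit in enumerate(limits, start=1):
        if code_point <= limit:
            return length
    raise AssertionError("unreachable: range was checked above")


def max_surrogate_code_point() -> int:
    """Largest code point reachable with a UTF-16 surrogate pair."""
    high, low = 0xDBFF, 0xDFFF  # last high surrogate, last low surrogate
    # Each surrogate carries 10 bits; pairs start at 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    print(hex(max_surrogate_code_point()))
    print(utf8_length(0x10FFFF))
    print(utf8_length(0x7FFFFFFF))
```

The surrogate limit falls inside the 4-byte UTF-8 range (up to
0x1FFFFF), which is why the 5- and 6-byte forms are never needed for
Unicode characters.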

-- Richard

References:
- Re: [xml-dev] Use of UTF-8 and UTF-16
  - From: Philippe Poulard <Philippe.Poulard@sophia.inria.fr>
