OASIS Mailing List Archives





   Re: [xml-dev] Use of UTF-8 and UTF-16


In article <4368C783.3070008@sophia.inria.fr> you write:
>UTF-8 uses 6 bytes for ISO/IEC 10646
>UTF-8 uses 4 bytes for Unicode

UTF-8 as originally defined needs up to 6 bytes to represent code
points up to 2^31-1 (the ISO/IEC 10646 range), but the Unicode
codespace only goes up to U+10FFFF, so at most 4 bytes are needed for
Unicode characters.  U+10FFFF is (presumably not coincidentally) the
limit of what UTF-16 can represent using surrogates.
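
The arithmetic behind those two limits can be checked with a small
sketch. The function names here (utf8_length, utf16_surrogate_max) are
made up for illustration, and the table of limits assumes the original
1-to-6-byte UTF-8 scheme rather than the later form restricted to 4
bytes:

```python
# Largest code point representable with 1..6 bytes in the original
# UTF-8 scheme (7, 11, 16, 21, 26 and 31 payload bits respectively).
UTF8_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]


def utf8_length(cp: int) -> int:
    """Return how many bytes the original UTF-8 scheme needs for cp."""
    if cp < 0 or cp > UTF8_LIMITS[-1]:
        raise ValueError("code point outside 0..2^31-1")
    for nbytes, limit in enumerate(UTF8_LIMITS, start=1):
        if cp <= limit:
            return nbytes
    raise AssertionError("unreachable")


def utf16_surrogate_max() -> int:
    """Highest code point reachable with one UTF-16 surrogate pair."""
    # 1024 high surrogates x 1024 low surrogates give 2^20 code points,
    # which start right after the 16-bit range at 0x10000.
    return 0x10000 + 0x400 * 0x400 - 1


print(utf8_length(0x7FFFFFFF))      # 6
print(utf8_length(0x10FFFF))        # 4
print(hex(utf16_surrogate_max()))   # 0x10ffff
```

Python's own encoder agrees for the top of the Unicode range:
len(chr(0x10FFFF).encode("utf-8")) is 4.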

-- Richard


