   Re: [xml-dev] ANN: Gorille 0.3

At 1:29 PM -0500 1/10/02, Simon St.Laurent wrote:

>Surrogate pairs are very tricky critters that seem to me to require 
>substantially more programming care than any other aspect of 
>Unicode, and I suspect that developers will be cursing them for a 
>long time to come.

You're only having trouble because Java's char type is brain-damaged 
in that a Java char actually represents a UTF-16 code unit rather 
than a Unicode character. If Java's char type were four bytes instead 
of two, or an object instead of a primitive type, none of this would 
be bothering you. Surrogate pairs are one of the things a good class 
library should hide from you.
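
To make the complaint concrete, here is a minimal sketch of the mismatch between chars and characters. It assumes a Java release with the code point methods on String and Character (added in Java 5, so later than this message); the class name SurrogateDemo and the sample string are made up for illustration.

```java
public class SurrogateDemo {
    // Counts Unicode characters (code points) rather than 16-bit chars.
    public static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Combines a high/low surrogate pair by hand into a single code point.
    // Assumes the caller has already checked that both chars are surrogates.
    public static int combine(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        // 'a', then U+1D50A encoded as the pair D835 DD0A, then 'b'
        String s = "a\uD835\uDD0Ab";
        System.out.println(s.length());          // 4 chars (UTF-16 code units)
        System.out.println(countCodePoints(s));  // 3 Unicode characters
        System.out.println(Integer.toHexString(combine('\uD835', '\uDD0A'))); // 1d50a
    }
}
```

The hand-rolled combine() is only there to show the arithmetic; Character.toCodePoint does the same job, which is the sort of hiding a class library ought to do for you.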

It could be worse, though. You could be using C, and trying to decode 
UTF-8. :-)

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |


