
RE: Java/Unicode brain damage



At 10:14 AM -0700 7/26/01, Benjamin Franz wrote:


>I'm being dense today. When you say 'UTF-16 units' do you mean that in
>Java a single character in the surrogate ranges may consist of (correctly
>IMHO) a _complete_ 32-bit surrogate pair or (dain bramagedly) of the
>individual 'halves' of the pair (thus making a single character into two
>individual 'units' of 16 bits each)?

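The latter. A Java char is a single 16-bit UTF-16 code unit, so a character encoded as a surrogate pair occupies two chars.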

>If the latter, then Java's handling of
>Unicode is broken-as-designed and must be fixed (most likely via
>deprecation of the existing String in favor of a completely new string
>type for the sake of backwards compatibility with already deployed apps).
>

It's worse. It's not just the String class. It's the char primitive data type, which is much harder to change precisely because it's not a class.
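
To make the point concrete, here is a minimal sketch of how a single character outside the Basic Multilingual Plane looks to String and char. The class name SurrogateDemo and the helpers isHighSurrogate, isLowSurrogate, and countCharacters are hypothetical names made up for this illustration, not library methods; the only library calls used are String.length() and String.charAt().

```java
public class SurrogateDemo {
    // U+10000, the first character beyond the Basic Multilingual Plane,
    // written here as its UTF-16 surrogate pair (D800 followed by DC00).
    static final String SUPPLEMENTARY = "\uD800\uDC00";

    // Hypothetical helper: true if c is in the high (leading) surrogate range.
    static boolean isHighSurrogate(char c) {
        return c >= 0xD800 && c <= 0xDBFF;
    }

    // Hypothetical helper: true if c is in the low (trailing) surrogate range.
    static boolean isLowSurrogate(char c) {
        return c >= 0xDC00 && c <= 0xDFFF;
    }

    // Hypothetical helper: counts characters, treating each well-formed
    // surrogate pair as one character instead of two chars.
    static int countCharacters(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            count++;
            boolean pair = isHighSurrogate(s.charAt(i))
                    && i + 1 < s.length()
                    && isLowSurrogate(s.charAt(i + 1));
            if (pair) {
                i++; // skip the trailing half of the pair
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // length() counts 16-bit units, so one character reports as 2.
        System.out.println(SUPPLEMENTARY.length());
        // Each charAt() call returns only half of the character.
        System.out.println(Integer.toHexString(SUPPLEMENTARY.charAt(0)));
        System.out.println(Integer.toHexString(SUPPLEMENTARY.charAt(1)));
        // Pair-aware counting reports 1.
        System.out.println(countCharacters(SUPPLEMENTARY));
    }
}
```

Any code that indexes, truncates, or reverses a String one char at a time can split such a pair, which is why the problem sits in the primitive type and not only in the String class.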

In 20/20 hindsight, there probably never should have been a char type in the first place, and all APIs should have been designed to work with String and Character objects instead.
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+ 
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      | 
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
+----------------------------------+---------------------------------+