RE: Java/Unicode brain damage
- From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- To: xml-dev@lists.xml.org
- Date: Thu, 26 Jul 2001 13:35:09 -0400
At 10:14 AM -0700 7/26/01, Benjamin Franz wrote:
>I'm being dense today. When you say 'UTF-16 units' do you mean that in
>Java a single character in the surrogate ranges may consist of (correctly
>IMHO) a _complete_ 32-bit surrogate pair or (dain bramagedly) of the
>individual 'halves' of the pair (thus making a single character into two
>individual 'units' of 16-bits each)?
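The latter. A Java char is a single 16-bit UTF-16 unit, so a character in the surrogate ranges occupies two of them.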
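To make that concrete, here is a minimal sketch using only String.length() and charAt(). The class name SurrogateDemo, its helper methods, and the choice of U+10348 (GOTHIC LETTER HWAIR) as the sample character are assumptions made for illustration; none of them come from this thread.

```java
public class SurrogateDemo {
    // U+10348 GOTHIC LETTER HWAIR, spelled as its UTF-16 surrogate pair.
    public static final String HWAIR = "\uD800\uDF48";

    // True if c is the first (high) half of a surrogate pair.
    public static boolean isHighSurrogate(char c) {
        return c >= 0xD800 && c <= 0xDBFF;
    }

    // True if c is the second (low) half of a surrogate pair.
    public static boolean isLowSurrogate(char c) {
        return c >= 0xDC00 && c <= 0xDFFF;
    }

    public static void main(String[] args) {
        // One character to a reader, but two char units to Java.
        System.out.println("length = " + HWAIR.length());
        for (int i = 0; i < HWAIR.length(); i++) {
            char c = HWAIR.charAt(i);
            System.out.println("charAt(" + i + ") = 0x" + Integer.toHexString(c)
                + (isHighSurrogate(c) ? " (high surrogate)" : "")
                + (isLowSurrogate(c) ? " (low surrogate)" : ""));
        }
    }
}
```

Any code that walks a String one char at a time sees the two halves separately, so it has to check for surrogates itself if it wants to treat the pair as one character.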
>If the latter, then Java's handling of
>Unicode is broken-as-designed and must be fixed (most likely via
>deprecation of the existing String in favor of a completely new string
>type for the sake of backwards compatibility with already deployed apps).
>
It's worse. It's not just the String class. It's the char primitive data type, which is much harder to change precisely because it's not a class.
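A rough sketch of why the primitive is the sticking point: a char is 16 bits wide, so a code point above U+FFFF only fits in an int, and narrowing it back to char silently discards the high bits. The class CodePointDemo and its toCodePoint helper are hypothetical names chosen for this example; the arithmetic is the standard UTF-16 surrogate decoding.

```java
public class CodePointDemo {
    // Combine the two halves of a surrogate pair into one code point.
    // The result needs an int because it is always above 0xFFFF.
    public static int toCodePoint(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        // The surrogate pair for U+10348 GOTHIC LETTER HWAIR.
        int hwair = toCodePoint((char) 0xD800, (char) 0xDF48);
        System.out.println("code point = 0x" + Integer.toHexString(hwair));

        // Narrowing to char keeps only the low 16 bits.
        char truncated = (char) hwair;
        System.out.println("as char    = 0x" + Integer.toHexString(truncated));
    }
}
```

Since char can never widen without breaking existing code, any API that takes or returns a single char cannot carry such a character in one value.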
In 20/20 hindsight, there probably never should have been a char type in the first place, and all APIs should have been designed to work with String and Character objects instead.
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.ibiblio.org/xml/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/ |
+----------------------------------+---------------------------------+