RE: Java/Unicode brain damage

At 10:14 AM -0700 7/26/01, Benjamin Franz wrote:

>I'm being dense today. When you say 'UTF-16 units' do you mean that in
>Java a single character in the surrogate ranges may consist of (correctly
>IMHO) a _complete_ 32-bit surrogate pair or (dain bramagedly) of the
>individual 'halfs' of the pair (thus making a single character into two
>individual 'units' of 16-bits each)?

The latter.
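
To make that concrete, here is a minimal sketch. The class name, the constant, and the two helper methods are made up for illustration, and U+1D11E (MUSICAL SYMBOL G CLEF) is just a convenient example of a character outside the 16-bit range. It uses only String.length() and charAt():

```java
class SurrogateDemo {
    // One character, U+1D11E, spelled as its two UTF-16 code units.
    static final String G_CLEF = "\uD834\uDD1E";

    // Hypothetical helpers: plain range checks on the surrogate blocks.
    static boolean isHighSurrogate(char c) {
        return c >= '\uD800' && c <= '\uDBFF';
    }

    static boolean isLowSurrogate(char c) {
        return c >= '\uDC00' && c <= '\uDFFF';
    }

    public static void main(String[] args) {
        // length() counts 16-bit units, not characters.
        System.out.println("length = " + G_CLEF.length());

        // charAt() hands back each half of the pair on its own.
        for (int i = 0; i < G_CLEF.length(); i++) {
            char c = G_CLEF.charAt(i);
            System.out.println(i + ": 0x" + Integer.toHexString(c)
                    + " high=" + isHighSurrogate(c)
                    + " low=" + isLowSurrogate(c));
        }
    }
}
```

The string holds one character as far as Unicode is concerned, but Java reports a length of 2 and indexes the two halves separately.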

>If the latter, the Java's handling of
>Unicode is broken-as-designed and must be fixed (most likely via
>deprecation of the existing String in favor of a completely new string
>type for the sake of backwards compatibility with already deployed apps).

It's worse. It's not just the String class. It's the char primitive data type, which is much harder to change precisely because it's not a class.
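
A rough sketch of why the primitive is the sticking point: a char is exactly 16 bits wide, so a value above 0xFFFF simply does not fit in one. The class and method names here are invented for the example, and U+1D11E is again just a sample code point.

```java
class CharWidthDemo {
    // Hypothetical helper: force a code point into a single char.
    // The narrowing cast keeps only the low 16 bits.
    static char narrow(int codePoint) {
        return (char) codePoint;
    }

    public static void main(String[] args) {
        int gClef = 0x1D11E;      // needs 17 bits
        char c = narrow(gClef);   // the top bit is silently dropped

        System.out.println("code point: 0x" + Integer.toHexString(gClef));
        System.out.println("as a char:  0x" + Integer.toHexString(c));
    }
}
```

The cast compiles without complaint and yields 0xD11E, an unrelated character. A class could grow a wider internal representation behind its API; a primitive whose width is fixed by the language cannot.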

In 20/20 hindsight, there probably never should have been a char type in the first place, and all APIs should have been designed to work with String and Character objects instead.

