RE: Java/Unicode brain damage



Elliotte Rusty Harold wrote,
> The Java way to handle this is to stop thinking of a Java char as 
> representing a Unicode character. It doesn't. A Java char represents 
> a UTF-16 code point, which may be a surrogate. The public API to 
> java.lang.String is essentially a UTF-16 API. For example, the 
> length() method of a string does not return the number of Unicode 
> characters in the string. Rather it returns the number of UTF-16 
> code points.

This is correct, but not yet officially documented in the Java
Language Specification. It got hammered out during the development
of the java.nio spec.
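
To make the length() point concrete, here is a minimal sketch. The
class and helper names (SurrogateDemo, utf16Length, codePointLength)
are made up for illustration, and it assumes a Java 5 or later
runtime, since String.codePointCount and Character.isHighSurrogate
were only added in that release. It compares what length() reports
against the number of Unicode characters for a string containing one
character outside the Basic Multilingual Plane.

```java
class SurrogateDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int utf16Length(String s) {
        return s.length();
    }

    // Number of Unicode characters; a surrogate pair is counted once.
    // Assumption: Java 5+ runtime, where String.codePointCount exists.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a' followed by U+1D11E (MUSICAL SYMBOL G CLEF), which lies
        // outside the BMP and is written here as a surrogate pair.
        String s = "a\uD834\uDD1E";

        System.out.println(utf16Length(s));      // 3
        System.out.println(codePointLength(s));  // 2

        // The char at index 1 is only half of a character.
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true
    }
}
```

So any code that indexes or counts by char is working in UTF-16 units
and has to deal with surrogate pairs itself if it cares about whole
characters.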

Cheers,


Miles

-- 
Miles Sabin                                     InterX
Internet Systems Architect                      27 Great West Road
+44 (0)20 8817 4030                             Middx, TW8 9AS, UK
msabin@interx.com                               http://www.interx.com/