OASIS Mailing List Archives




RE: Java/Unicode brain damage

Elliotte Rusty Harold wrote,
> The Java way to handle this is to stop thinking of a Java char as 
> representing a Unicode character. It doesn't. A Java char represents 
> a UTF-16 code point, which may be a surrogate. The public API to 
> java.lang.String is essentially a UTF-16 API. For example, the 
> length() method of a string does not return the number of Unicode 
> characters in the string. Rather it returns the number of UTF-16 
> code points.

This is correct, but not yet officially documented in the Java
Language Specification. This reading of char and String was hammered
out during the development of the java.nio spec.
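
To make the point concrete, here is a minimal sketch, not anything taken
from the specs. The class name Utf16LengthDemo and the helper
countCodePoints are made up for illustration. It assumes a later JDK
(Java 5 or newer), where Character.isHighSurrogate,
Character.isLowSurrogate and String.codePointCount exist. Note that
current Unicode terminology calls what length() counts "code units"
rather than "code points".

```java
public class Utf16LengthDemo {

    // Hypothetical helper: counts Unicode characters by treating a high
    // surrogate immediately followed by a low surrogate as one character.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++; // skip the low half of the surrogate pair
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1D11E (MUSICAL SYMBOL G CLEF), which needs
        // the surrogate pair D834 DD1E in UTF-16.
        String s = "a\uD834\uDD1E";

        System.out.println(s.length());                      // 3 chars
        System.out.println(countCodePoints(s));              // 2 characters
        System.out.println(s.codePointCount(0, s.length())); // 2 characters
    }
}
```

The hand-written loop is only there to show where the two counts
diverge; on a JDK that has it, codePointCount does the same job.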



Miles Sabin                                     InterX
Internet Systems Architect                      27 Great West Road
+44 (0)20 8817 4030                             Middx, TW8 9AS, UK
msabin@interx.com                               http://www.interx.com/