When one considers Java's implementation-specific 8-bit external string
encoding, one should keep in mind its purpose and its specified relation
to Java's primitive data representations.
Miles Sabin wrote:
> Elliotte Rusty Harold wrote,
> > At 7:42 AM -0700 4/29/03, Tim Bray wrote:
> > > Really? I just looked at a recent set of Java docs, and it's pretty
> > > clear that a Java char isn't really a character, it's a UTF-16
> > > codepoint, and the semantics of String are wrong for non-BMP
> > > characters, and that the attempt at UTF-8 support remains pretty
> > > laughably nonstandard and wrong. I'd be *delighted* to hear that
> > > I'm looking at wrong/obsolete docs. Pointers anyone? -Tim
> > Unfortunately, you're more than half right. The InputStreamReader and
> > OutputStreamWriter classes do handle UTF-8 correctly. The readUTF and
> > writeUTF methods in DataInputStream/DataOutputStream don't. This
> > wouldn't be a problem if they were simply called readString/
> > writeString instead.
> Yup, that's right ... for all intents and purposes, readUTF and writeUTF
> should be treated as specifying a non-standard encoding solely for the
> use of Java RMI.
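
To make the difference discussed above concrete, here is a minimal sketch that compares the bytes writeUTF emits with standard UTF-8 for two strings: U+0000 and a non-BMP character (U+1D11E, chosen only for illustration). The class and helper names are hypothetical, and the sketch assumes a JDK recent enough to have java.nio.charset.StandardCharsets (Java 7 or later), which postdates this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class ModifiedUtf8Demo {

    // Bytes that DataOutputStream.writeUTF emits for s, with the leading
    // 2-byte length prefix stripped off.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Standard UTF-8 bytes, as the charset encoders produce them.
    static byte[] standardUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Space-separated uppercase hex, e.g. "C0 80".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nul = "\u0000";        // U+0000
        String clef = "\uD834\uDD1E"; // U+1D11E as a surrogate pair

        // One character, but two Java chars (UTF-16 code units).
        System.out.println("chars: " + clef.length()
                + ", code points: " + clef.codePointCount(0, clef.length()));

        System.out.println("U+0000  standard: " + hex(standardUtf8Bytes(nul)));
        System.out.println("U+0000  writeUTF: " + hex(writeUtfBytes(nul)));
        System.out.println("U+1D11E standard: " + hex(standardUtf8Bytes(clef)));
        System.out.println("U+1D11E writeUTF: " + hex(writeUtfBytes(clef)));
    }
}
```

In the writeUTF output, U+0000 appears as the two bytes C0 80 rather than a single 00, and the non-BMP character appears as two separately encoded three-byte surrogates rather than one four-byte sequence. A strict UTF-8 decoder should reject both forms, which is why the format is best treated as private to DataInput/DataOutput.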