CONSTANT_Utf8_info [Re: [xml-dev] XML 1.1 grinds to halt?]


when one considers java's implementation-specific 8-bit external string
encoding, one should keep its purpose[1] and the specified relation to java's
primitive data representations[2] in mind.

[1] http://java.sun.com/j2se/1.4.1/docs/api/java/io/DataInputStream.html
[2] http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#20080


Miles Sabin wrote:
> 
> Elliotte Rusty Harold wrote,
> > At 7:42 AM -0700 4/29/03, Tim Bray wrote:
> > > Really?  I just looked at a recent set of Java docs, and it's pretty
> > > clear that a Java char isn't really a character, it's a UTF-16
> > > codepoint, and the semantics of String are wrong for non-BMP
> > > characters, and that the attempt at UTF-8 support remains pretty
> > > laughably nonstandard and wrong.  I'd be *delighted* to hear that
> > > I'm looking at wrong/obsolete docs.  Pointers anyone? -Tim
> >
> > Unfortunately, you're more than half right. The InputStreamReader and
> > OutputStreamWriter classes do handle UTF-8 correctly. The readUTF and
> > writeUTF methods in DataInputStream/DataOutputStream don't. This
> > wouldn't be a problem if they were simply called readString/
> > writeString instead.
> 
> Yup, that's right ... for all intents and purposes, readUTF and writeUTF
> should be treated as specifying a non-standard encoding solely for the
> use of Java RMI.
>
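
To make the difference discussed above concrete, here is a minimal sketch comparing the bytes that DataOutputStream.writeUTF produces with standard UTF-8 for the same string. The class and helper names (ModifiedUtf8Demo, writeUtfBytes, standardUtf8Bytes, hex) are hypothetical, and the code assumes a JDK newer than the 1.4.1 release referenced in this thread (it uses StandardCharsets and try-with-resources). It relies on the documented behaviour of writeUTF: a two-byte length prefix, U+0000 written as the two bytes C0 80, and each surrogate of a non-BMP character written separately as three bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative, not from the thread.
public class ModifiedUtf8Demo {

    // Bytes written by DataOutputStream.writeUTF:
    // a 2-byte big-endian length, then the string in Java's modified UTF-8.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes of the same string in standard UTF-8 (no length prefix).
    static byte[] standardUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String nul = "\0";            // U+0000
        String clef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF (non-BMP)

        // One character, but two Java chars (a surrogate pair).
        System.out.println(clef.length());                         // 2
        System.out.println(clef.codePointCount(0, clef.length())); // 1

        System.out.println(hex(writeUtfBytes(nul)));      // 00 02 C0 80
        System.out.println(hex(standardUtf8Bytes(nul)));  // 00

        System.out.println(hex(writeUtfBytes(clef)));     // 00 06 ED A0 B4 ED B4 9E
        System.out.println(hex(standardUtf8Bytes(clef))); // F0 9D 84 9E
    }
}
```

A conforming UTF-8 decoder is expected to reject both C0 80 and the encoded surrogates, which is why writeUTF output should only be read back with readUTF, while text meant for interchange goes through OutputStreamWriter or an explicit UTF-8 charset.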

Copyright 2001 XML.org. This site is hosted by OASIS