OASIS Mailing List Archives

   Re: [xml-dev] Re: [Sax-devel] SAX2 r2 ... last call!

On Thu, 2002-01-10 at 16:24, David Brownell wrote:
> > It might be worth noting the current discussion on xml-dev (or content
> > thereof) regarding surrogate pairs, as SAX relies on the Java char and
> > String constructs throughout.
> 
> I'll catch up on that, but my advice on that point is unlikely to
> change.  As I've pointed out in an upcoming O'Reilly book
> (you might have heard about it, called "SAX2" ... ;-) surrogate
> pairs aren't the only place that a Java "char" doesn't match
> a "character" ... there are also composed characters to
> worry about, even in the absence of surrogate pairs.

Sure thing, all advertising for our joint projects aside...

I just suspect the point's worth making a little more strongly, as so
many of us have been brainwashed to think Java char=Unicode character. 
Surrogate pairs whacked me a lot harder over the head than I thought,
and Java doesn't seem to take note.

> Point is that anyone working at the "character" level MUST
> NOT ASSUME that such characters consist only of a single
> Java "char" value.  And that'd be true even if "char" were
> to make an incompatible change, and acquire a few extra
> bits at the left so that surrogates could in some cases be
> eliminated.

So could the paragraph above appear in the documentation somewhere?  I
think that would take care of all my concerns.
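
To make the quoted point concrete, here is a minimal sketch of both cases:
a surrogate pair and a composed character.  It assumes a Java 5 or later
runtime (String.codePointCount postdates this thread), and the class and
method names are only illustrative, not anything from SAX.

```java
public class CharVsCharacter {
    // Counts Unicode code points rather than Java char values.
    // (Hypothetical helper name; it simply wraps String.codePointCount.)
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF: one code point, stored as a
        // surrogate pair, so two Java chars.
        String clef = "\uD834\uDD1E";

        // "e" followed by U+0301 COMBINING ACUTE ACCENT: one character to
        // a reader, but two code points and two Java chars.
        String eAcute = "e\u0301";

        System.out.println(clef.length());       // 2 chars
        System.out.println(codePoints(clef));    // 1 code point
        System.out.println(eAcute.length());     // 2 chars
        System.out.println(codePoints(eAcute));  // 2 code points
    }
}
```

Note that counting code points fixes only the surrogate case; the composed
character still comes out as two, which is why the warning holds even
without surrogate pairs.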
 
-- 
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com

Copyright 2001 XML.org. This site is hosted by OASIS