   Re: [xml-dev] Re: [Sax-devel] SAX2 r2 ... last call!

> I just suspect the point's worth making a little more strongly, as so
> many of us have been brainwashed to think Java char=Unicode character. 
> Surrogate pairs whacked me a lot harder over the head than I thought,
> and Java doesn't seem to take note.

True for most folk.  XML made me get my hands dirty with
I18N stuff, and that one took a while for me to grok.  I don't
think it'll be intuitive to most folk, who've rarely had to look
at such I18N issues.

> > Point is that anyone working at the "character" level MUST
> > NOT ASSUME that such characters consist only of a single
> > Java "char" value.  And that'd be true even if "char" were
> > to make an incompatible change, and acquire a few extra
> > bits at the left so that surrogates could in some cases be
> > eliminated.
> So could the paragraph above appear in the documentation somewhere?  I
> think that would take care of all my concerns.

Yes, I was thinking of doing that.  After I imbibe the other thread
a bit more deeply, to make sure I pick up any other details.  That
should make it into the SAX2 r2 ContentHandler docs, and maybe
also LexicalHandler.comment() if I get ambitious.
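
To make the surrogate-pair point concrete, here is a minimal sketch rather than anything taken from the SAX distribution. It assumes a buffer shaped like the one passed to ContentHandler.characters(ch, start, length), and it uses the Character.isHighSurrogate/isLowSurrogate helpers, which only exist in later Java releases (5 and up). The class and method names (SurrogateDemo, countCodePoints) are made up for illustration.

```java
public class SurrogateDemo {
    // Counts Unicode characters (code points) in a slice of a char buffer,
    // treating a high surrogate followed by a low surrogate as ONE character.
    // The (ch, start, length) shape mirrors ContentHandler.characters().
    public static int countCodePoints(char[] ch, int start, int length) {
        int count = 0;
        int end = start + length;
        for (int i = start; i < end; i++) {
            count++;
            // Skip the second half of a well-formed surrogate pair.
            if (Character.isHighSurrogate(ch[i])
                    && i + 1 < end
                    && Character.isLowSurrogate(ch[i + 1])) {
                i++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a', then U+1D11E (MUSICAL SYMBOL G CLEF, outside the 16-bit range), then 'b'.
        char[] buf = "a\uD834\uDD1Eb".toCharArray();
        System.out.println("char values: " + buf.length);
        System.out.println("characters:  " + countCodePoints(buf, 0, buf.length));
    }
}
```

The two numbers differ for this input: the buffer holds four "char" values but only three characters, which is exactly the assumption that code working at the "character" level must not get wrong.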

- Dave



Copyright 2001 XML.org. This site is hosted by OASIS