On Thu, 2002-01-10 at 14:23, John Cowan wrote:
> Simon St.Laurent wrote:
> > Surrogate pairs are very tricky critters that seem to me to require
> > substantially more programming care than any other aspect of Unicode,
> > and I suspect that developers will be cursing them for a long time to come.
>
> When you are using a language that hard-codes "char" to mean "16 bits",
> then yes.
It beats 8 bits, certainly, but...
Just out of curiosity, which languages support 32-bit characters? (I tend to
live in Java and various scripting languages.)
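
To make the 16-bit "char" problem concrete, here is a minimal Java sketch of the extra care surrogate pairs demand. The class and method names (SurrogateDemo, countCodePoints, isWellFormed) are hypothetical and purely illustrative. The code relies on String.codePointAt, Character.charCount, Character.isHighSurrogate and Character.isLowSurrogate, which only arrived in later Java releases (Java 5 onward), so it assumes a newer JDK than anything available when this message was written.

```java
final class SurrogateDemo {

    // Counts Unicode code points rather than 16-bit char units.
    // A character outside the Basic Multilingual Plane occupies two
    // chars (a surrogate pair) but is counted once here.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advances by 1 or 2 chars
            count++;
        }
        return count;
    }

    // Returns true only if every high surrogate is immediately followed
    // by a low surrogate and no low surrogate appears on its own.
    static boolean isWellFormed(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length()
                        || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false; // unpaired high surrogate
                }
                i++; // skip the low half of the pair
            } else if (Character.isLowSurrogate(c)) {
                return false; // low surrogate with no preceding high
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String s = "a\uD800\uDC00b"; // 'a', U+10000, 'b'
        System.out.println(s.length());
        System.out.println(countCodePoints(s));
        System.out.println(isWellFormed(s));
    }
}
```

The point of the sketch is that String.length() reports char units, so any code that indexes, truncates or validates text one char at a time has to treat surrogate halves specially.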
> > The testing I've been able to perform so far is pretty crude stuff. If
> > anyone with more experience in Unicode or better tools for creating test
> > documents has time to explore this work, I'd greatly appreciate it. As
> > XML 1.0 parsers already perform some of this testing, creating tests
> > that go outside of those bounds and reach gorille (not just the parser)
> > is tricky.
>
>
> IIRC, Aelfred (not Aelfred2) doesn't actually check these things.
>
> A JAXP wrapper for it might be useful.
Yet another good project. I can probably just plug it into the SAX2
adapter classes, when I find more time for this.
--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com