> Elliotte Rusty Harold wrote:
>
> > It could be worse, though. You could be using C, and trying to decode
> > UTF-8. :-)
>
>
> ?? It's about 10 lines of code, and has been written lots of
> times now. Last time I needed it I couldn't find one with the
> exact buffer interface I needed so I coded it up from scratch
> sometime in the course of an afternoon and it worked first time.
> The spec is hardly unclear. And it's a set of shift/mask
> operations that are processor-friendly. You need to use a
> loop iterator rather than a for (i = 0; string[i]; i++) idiom,
> big deal.
I was about to post the same reaction.
I do want to add, though, that the last time I needed to code UTF-8 routines in C, I found the UTF-8 illustration in Tony Graham's book so lucid as to make implementation a tea-time task.
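For anyone who has not seen it written out, here is a minimal sketch of the shift/mask decoding discussed above. It is not the code from my project or from the book; the names utf8_decode and utf8_count are made up for illustration, and the buffer interface (pointer plus length) is just one assumption about what a caller might want. It also rejects overlong forms, surrogates, and values above U+10FFFF, which a stricter reading of the spec calls for.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point from s[0..len).
 * Returns the number of bytes consumed (1 to 4), or 0 if the
 * input is empty, truncated, or malformed. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs n bytes. */
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t n, i;
    uint32_t c;

    if (len == 0)
        return 0;

    /* The lead byte tells us the sequence length and holds the top bits. */
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        n = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        n = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        n = 4; c = s[0] & 0x07;
    } else {
        return 0;               /* stray continuation or invalid lead byte */
    }

    if (len < n)
        return 0;               /* truncated sequence */

    /* Each continuation byte is 10xxxxxx and contributes six bits. */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (c < min_cp[n])
        return 0;               /* overlong encoding */
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;               /* out of range or surrogate */

    *cp = c;
    return n;
}

/* Hypothetical helper: count code points in a buffer by stepping with
 * the decoder instead of indexing byte by byte.
 * Returns (size_t)-1 on malformed input. */
size_t utf8_count(const unsigned char *s, size_t len)
{
    size_t pos = 0, count = 0;

    while (pos < len) {
        uint32_t cp;
        size_t n = utf8_decode(s + pos, len - pos, &cp);
        if (n == 0)
            return (size_t)-1;
        pos += n;
        count++;
    }
    return count;
}
```

The loop in utf8_count is the "loop iterator" point from the quoted message: the position advances by however many bytes the decoder consumed, rather than by one.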
And BTW, I wish you had not brought up wchar_t, because that dredges up memories of all the horror in wchar.h, and the last thing I need is a case of saffron hives.
--
Uche Ogbuji Principal Consultant
uche.ogbuji@fourthought.com +1 303 583 9900 x 101
Fourthought, Inc. http://Fourthought.com
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management