On restriction to UTF-8 (UTF-16 too if we insist, but really, do folks store *files* as UTF-16?):
Yes. And you would too if you had to deal with DBCS locales and didn't work for a DASD company ;)
Is this really a problem for non-Western languages? My impression was that Unicode encoded them fine. I admit it's been many years since I did i18n for a living (back then it was all SJIS and EUC), but I would've thought CJK folks were much happier to have put that all behind them.
It does encode them fine, but that doesn't mean folks universally embrace it. There is a lot of inertia in JIS, EUC, BIG5, etc. and the forces are as much social as technological (Unicode also introduces its own quirks, and many like their older, warm and fuzzy quirks better). And we all know better than to try to sledgehammer social issues with technological "fixes", right?
And yet I disagree with Amy's skepticism about restricting to UTF-8 & UTF-16. The reason is that in this case I do not think we'd be the ones wielding the hammer. I suspect that Web conventions are already clearing the path for Unicode, but there is precious little evidence behind that suspicion, which is why I'd love to hear from someone who has been in these wars first-hand. Is the tide turning, or is Unicode likely to continue to be a non-starter in some of the world's major locales for the foreseeable future?