jcowan wrote:
> I have argued privately that wchar_t is in fact the Right Thing here
> despite its variability in size (UTF-32 on Unix platforms,
Depends on the flavor of Unix; I've seen Unices with
sizeof(wchar_t) == 1, 2, and 4.
Also, few of them even state explicitly what encoding(s) are
used for multibyte or wide characters. EUC and Shift-JIS
are the only multibyte encodings I've seen explicitly mentioned;
most manpages don't even say that much.
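
Since the documentation rarely says, one way to find out is to ask the running system. The following is a minimal sketch, assuming a POSIX platform that provides nl_langinfo(CODESET); the helper names current_codeset and report_wide_setup are made up for illustration. Note that the codeset names the multibyte encoding of the current locale; it does not by itself say how wchar_t values are encoded.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: name of the current locale's multibyte codeset. */
static const char *current_codeset(void)
{
    /* Adopt LANG / LC_* from the environment; if this fails,
       the default "C" locale simply stays in effect. */
    setlocale(LC_ALL, "");
    const char *name = nl_langinfo(CODESET);
    assert(name != NULL);   /* nl_langinfo returns a string, possibly empty */
    return name;
}

/* Hypothetical helper: print what this platform and locale report. */
static void report_wide_setup(FILE *out)
{
    fprintf(out, "sizeof(wchar_t) = %zu\n", sizeof(wchar_t));
    fprintf(out, "codeset         = %s\n", current_codeset());
    /* MB_CUR_MAX depends on the locale, so read it after setlocale(). */
    fprintf(out, "MB_CUR_MAX      = %zu\n", (size_t)MB_CUR_MAX);
}
```

Calling report_wide_setup(stdout) under different LANG settings should show the codeset and MB_CUR_MAX changing while sizeof(wchar_t) stays fixed for a given compiler.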
> UTF-16 on
> Windows), because it makes genx compatible with both standardized and
> non-standardized facilities, most especially "..."L strings. Some
> conditional logic will be needed to interpret the input as UTF-16 or
> UTF-32, which can be based on sizeof(wchar_t).
It might not be either one. On most Unices, the interpretation
of wchar_t characters depends on how the LANG and LC_* environment
variables are set (determined at runtime!).
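
A more cautious version of the sizeof(wchar_t) test might look like the sketch below. It relies on the C99 macro __STDC_ISO_10646__, which an implementation defines when wchar_t values are ISO/IEC 10646 code points regardless of locale, and on _WIN32 for the Windows case. The enum and function names are hypothetical, not part of genx. When neither macro is defined, the sketch gives up rather than guessing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of what a wchar_t string contains. */
enum wide_kind { WIDE_UNKNOWN, WIDE_UTF16, WIDE_UTF32 };

static enum wide_kind guess_wide_kind(void)
{
#if defined(__STDC_ISO_10646__)
    /* wchar_t holds ISO 10646 code points in every locale. */
    if (sizeof(wchar_t) >= 4)
        return WIDE_UTF32;
    /* A narrower wchar_t covers only part of the code space;
       this sketch treats that case as unknown. */
    return WIDE_UNKNOWN;
#elif defined(_WIN32)
    /* Windows documents wchar_t as a 16-bit UTF-16 code unit. */
    assert(sizeof(wchar_t) == 2);
    return WIDE_UTF16;
#else
    /* Interpretation may depend on the locale chosen at runtime. */
    return WIDE_UNKNOWN;
#endif
}
```

The point of the WIDE_UNKNOWN result is that the caller has to handle it somehow (reject wide input, or convert through the multibyte functions), instead of assuming UTF-16 or UTF-32 from the size alone.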
> Hypothetical platforms
> where sizeof(wchar_t) == 1 can be neglected.
These aren't hypothetical -- I've seen one! It's now defunct,
but still... (sizeof(wchar_t) == 1 is actually a pretty good
design choice for certain types of systems.)
--Joe English
jenglish@flightlab.com