> > C and C++ on the Windows platform *are* UTF-16 centric. If you put
> > a Gothic character into an L"..." string, for example
>
> So you're saying that it would be satisfactory for genx to infer that if
>
>     sizeof(wchar_t) == 2
>
> then the values are UTF-16 code units? -Tim
Just as long as you are aware that a wchar_t on these platforms will not
necessarily map directly to a full Unicode character: characters outside
the BMP take two wchar_t units (a surrogate pair). So string manipulation
routines that deal in lengths and character offsets will count 16-bit code
units rather than characters, which gives incorrect results when surrogate
pairs are involved. However, it is the status quo in most of the world
other than Java, so presumably the developer would be aware of this.
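
To make that concrete, here is a minimal sketch (my own, not from genx),
assuming a Windows-style toolchain where sizeof(wchar_t) == 2. The Gothic
letter mentioned above (U+10330) lies outside the BMP, so it is stored as
the surrogate pair 0xD800 0xDF30, and wcslen reports two units for one
character:

    /* Assumes a toolchain where wchar_t is a 16-bit UTF-16 code unit. */
    #include <stdio.h>
    #include <wchar.h>

    int main(void)
    {
        const wchar_t *ahsa = L"\U00010330";  /* one character, two code units */

        /* wcslen counts wchar_t units, not characters:
         * with 16-bit wchar_t this prints 2, not 1. */
        printf("sizeof(wchar_t) = %u\n", (unsigned)sizeof(wchar_t));
        printf("wcslen          = %u\n", (unsigned)wcslen(ahsa));
        return 0;
    }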