OASIS Mailing List Archives

 



 


 

   Re: [xml-dev] Genx



jcowan wrote:

> I have argued privately that wchar_t is in fact the Right Thing here
> despite its variability in size (UTF-32 on Unix platforms,

Depends on the flavor of Unix; I've seen Unices with
sizeof(wchar_t) == 1, 2, and 4.
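A minimal sketch of how one might inspect this on a given platform. The helper names (wchar_bits, wchar_is_iso10646, report_wchar) are hypothetical, made up for illustration. The __STDC_ISO_10646__ macro is the C99 way for an implementation to promise that wchar_t values are ISO 10646 code points; where it is not defined, the size alone says nothing about the encoding.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Width of wchar_t in bits on this platform (hypothetical helper). */
static unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Returns 1 if the implementation promises ISO 10646 values in
 * wchar_t (C99 __STDC_ISO_10646__), 0 if it makes no such promise. */
static int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary; the result varies by platform. */
static void report_wchar(FILE *out)
{
    fprintf(out, "wchar_t: %u bits, ISO 10646 promised: %s\n",
            wchar_bits(), wchar_is_iso10646() ? "yes" : "no");
}
```

Calling report_wchar(stdout) from main() prints the summary for the platform at hand.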

Also, few of them even state explicitly what encoding(s) are 
used for multibyte or wide characters.  EUC and Shift-JIS 
are the only multibyte encodings I've seen explicitly mentioned;
most manpages don't even say that much.

> UTF-16 on
> Windows), because it makes genx compatible with both standardized and
> non-standardized facilities, most especially "..."L strings.  Some
> conditional logic will be needed to interpret the input as UTF-16 or
> UTF-32, which can be based on sizeof(wchar_t).

It might not be either one.  On most Unices, the interpretation
of wchar_t characters depends on how the LANG and LC_* environment
variables are set (determined at runtime!).
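To illustrate the runtime dependence, here is a rough sketch using the standard setlocale() and mbstowcs() calls. The helper decode_in_locale is a hypothetical name, and which locale names are installed is an assumption that varies from system to system.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string to wide characters
 * under the named LC_CTYPE locale, then restore the previous locale.
 * Returns the number of wide characters written, or (size_t)-1 if the
 * locale is unavailable or the bytes are invalid in that locale. */
static size_t decode_in_locale(const char *locale, const char *bytes,
                               wchar_t *out, size_t cap)
{
    const char *cur = setlocale(LC_CTYPE, NULL);
    char *saved = NULL;
    size_t n;

    /* Copy the current locale name; a later setlocale() call may
     * overwrite the string that cur points to. */
    if (cur != NULL) {
        saved = malloc(strlen(cur) + 1);
        if (saved != NULL)
            strcpy(saved, cur);
    }

    if (setlocale(LC_CTYPE, locale) == NULL) {
        free(saved);
        return (size_t)-1;
    }

    n = mbstowcs(out, bytes, cap);

    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return n;
}
```

Passing the same non-ASCII bytes with two different locale names (say "C" and an EUC or Shift-JIS locale, if one is installed) may yield different wchar_t values, or a conversion error in one and not the other.  Passing "" as the locale name picks up LANG and LC_* from the environment.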


> Hypothetical platforms
> where sizeof(wchar_t) == 1 can be neglected.

These aren't hypothetical -- I've seen one!  It's now defunct,
but still...  (sizeof(wchar_t) == 1 is actually a pretty good
design choice for certain types of systems.)


--Joe English

  jenglish@flightlab.com




 


Copyright 2001 XML.org. This site is hosted by OASIS