   Re: [xml-dev] Genx

Joe English scripsit:

> The 'codePoint' typedef may be problematic:
> 
>     // Unicode code points (4-byte int on most systems)
>     typedef wchar_t codePoint;
> 
> The C standard makes no useful guarantees about
> the size or interpretation of 'wchar_t'.  On some
> systems it's identical to plain 'char', and even
> on systems where it's big enough to hold all of
> Unicode, there's no guarantee about what encoding
> the wcs* and *wcs functions use.  wchar_t should
> not be used in programs that are meant to generate 
> portable data and be portable themselves; you just 
> don't know what you're going to get.

I have argued privately that wchar_t is in fact the Right Thing here
despite its variability in size (typically UTF-32 on Unix platforms,
UTF-16 on Windows), because it makes genx compatible with both
standardized and non-standardized facilities, most especially L"..."
wide string literals.  Some conditional logic will be needed to
interpret the input as UTF-16 or UTF-32, which can be based on
sizeof(wchar_t).  Hypothetical platforms where sizeof(wchar_t) == 1
can be neglected.
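
To make the conditional logic concrete, here is a minimal sketch of the
kind of thing I have in mind.  It is not genx code: the function name
next_code_point and the -1 error convention are invented for
illustration, and it assumes that a 2-byte wchar_t means UTF-16 and
anything wider means UTF-32, which the C standard does not promise.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of genx.  Reads one Unicode code point
 * from the wchar_t string at *s and advances *s past it.  Returns the
 * code point, or -1 for a malformed unit (lone surrogate or a value
 * beyond U+10FFFF).  The caller is expected to test for the
 * terminating L'\0' before calling. */
long next_code_point(const wchar_t **s)
{
    unsigned long c = (unsigned long)**s;
    (*s)++;

    if (sizeof(wchar_t) == 2) {
        /* Assumption: 2-byte wchar_t holds UTF-16 code units. */
        c &= 0xFFFFUL;                      /* guard against a signed wchar_t */
        if (c >= 0xD800UL && c <= 0xDBFFUL) {
            /* High surrogate: a low surrogate must follow. */
            unsigned long lo = (unsigned long)**s & 0xFFFFUL;
            if (lo < 0xDC00UL || lo > 0xDFFFUL)
                return -1;                  /* unpaired high surrogate */
            (*s)++;
            return (long)(0x10000UL + ((c - 0xD800UL) << 10)
                                    + (lo - 0xDC00UL));
        }
        if (c >= 0xDC00UL && c <= 0xDFFFUL)
            return -1;                      /* unpaired low surrogate */
        return (long)c;
    }

    /* Assumption: a wider wchar_t holds UTF-32 code points directly. */
    if (c > 0x10FFFFUL || (c >= 0xD800UL && c <= 0xDFFFUL))
        return -1;
    return (long)c;
}
```

The sizeof test is a compile-time constant, so a decent compiler will
discard whichever branch does not apply on the platform at hand.  A
caller would walk an L"..." literal by calling this in a loop until
*s points at the terminating L'\0'.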

-- 
He made the Legislature meet at one-horse       John Cowan
tank-towns out in the alfalfa belt, so that     jcowan@reutershealth.com
hardly nobody could get there and most of       http://www.reutershealth.com
the leaders would stay home and let him go      http://www.ccil.org/~cowan
to work and do things as he pleased.    --Mencken, _Declaration of Independence_

Copyright 2001 XML.org. This site is hosted by OASIS