OASIS Mailing List Archives

   RE: [xml-dev] Genx


> > C and C++ on the Windows platform *are* UTF-16 centric.  If you put
> > a Gothic character into a "..."L string, for example
> 
> So you're saying that it would be satisfactory for genx to infer that if
>
>     sizeof(wchar_t) == 2
> 
> then the values are UTF16 coded units? -Tim

Just as long as you are aware that a single wchar_t on these platforms
does not necessarily hold a complete character: a code point outside the
Basic Multilingual Plane is encoded in UTF-16 as a surrogate pair of two
16-bit code units.  String manipulation routines that use lengths and
character offsets therefore count 16-bit code units rather than
characters, which gives incorrect results for text that contains
surrogate pairs.  However, that is the status quo in most of the world
other than Java, so presumably the developer would be aware of this.
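
To make the difference between code units and characters concrete, here
is a minimal sketch in C.  It is not part of genx; the helper name
utf16_count_code_points is hypothetical, and it takes uint16_t rather
than wchar_t so that it behaves the same on platforms where wchar_t is
wider than 16 bits.  It also assumes that an unpaired surrogate should
be counted as one (ill-formed) character rather than rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if u is a high (lead) surrogate, U+D800..U+DBFF. */
static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

/* True if u is a low (trail) surrogate, U+DC00..U+DFFF. */
static int is_low_surrogate(uint16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

/* Hypothetical helper: count the code points encoded by n_units UTF-16
   code units.  A well-formed surrogate pair counts once; an unpaired
   surrogate is counted as one (ill-formed) code point. */
static size_t utf16_count_code_points(const uint16_t *s, size_t n_units)
{
    size_t count = 0;
    size_t i = 0;

    assert(s != NULL || n_units == 0);

    while (i < n_units) {
        if (is_high_surrogate(s[i]) && i + 1 < n_units &&
            is_low_surrogate(s[i + 1]))
            i += 2;   /* surrogate pair: two code units, one code point */
        else
            i += 1;   /* BMP character or unpaired surrogate */
        count++;
    }
    return count;
}
```

For the buffer { 0x0041, 0xD800, 0xDF30, 0x0042 }, which encodes "A",
the Gothic letter U+10330, and "B", a length routine based on 16-bit
units reports 4, while this helper reports 3.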






 


Copyright 2001 XML.org. This site is hosted by OASIS