xml-dev - Re: [xml-dev] Genx


To: 'xml-dev@lists.xml.org ' <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] Genx
From: Tim Bray <tbray@textuality.com>
Date: Wed, 21 Jan 2004 13:07:58 -0800
In-reply-to: <20040121195737.GH30165@skunk.reutershealth.com>
References: <C85FBC6B-4B14-11D8-905C-000A95A51C9E@textuality.com> <200401211840.i0LIeLo07627@dragon.office.flightlab.com> <20040121195737.GH30165@skunk.reutershealth.com>

On Jan 21, 2004, at 11:57 AM, jcowan@reutershealth.com wrote:

>> The 'codePoint' typedef may be problematic:
>>
>>     // Unicode code points (4-byte int on most systems)
>>     typedef wchar_t codePoint;
>>
>> ...
> I have argued privately that wchar_t is in fact the Right Thing here
> despite its variability in size (UTF-32 on Unix platforms, UTF-16 on
> Windows), because it makes genx compatible with both standardized and
> non-standardized facilities, most especially "..."L strings.  Some
> conditional logic will be needed to interpret the input as UTF-16 or
> UTF-32, which can be based on sizeof(wchar_t).  Hypothetical platforms
> where sizeof(wchar_t) == 1 can be neglected.

Almost.  How about we leave it as wchar_t, but *not* UTF-16, so a value
that's in a surrogate block is an error.  Then we change the name from
codePoint (which could be interpreted as meaning "UTF-16 Code Point") to
something more explicit like

numericValueCorrespondingToAUnicodeCharacterAsInUPlusFourHexDigitsIsThatClear

John Cowan has suggested that "codeUnit" might be a good name; I'd be
inclined to "uniChar".  Any other ideas?
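
To make the proposal concrete, here is a rough sketch of what the check
might look like.  The typedef name uniChar and the helper isValidUniChar
are hypothetical, used only for illustration; neither is part of genx,
and the name is still under discussion.  The sketch treats each wchar_t
as a plain character number, so anything in the surrogate block
(U+D800 to U+DFFF) or above U+10FFFF is rejected rather than
interpreted as UTF-16.

```c
#include <wchar.h>

/* Hypothetical name; the thread has not settled on one. */
typedef wchar_t uniChar;

/*
 * Returns 1 if c is usable as a Unicode character number, 0 otherwise.
 * Assumption for this sketch: each value stands for one character, so
 * surrogates are errors instead of halves of a UTF-16 pair.
 */
static int isValidUniChar(uniChar c)
{
    /* Widen through unsigned long so a negative wchar_t (where the
       type is signed) becomes a large value and fails the range test. */
    unsigned long v = (unsigned long) c;

    if (v >= 0xD800UL && v <= 0xDFFFUL)
        return 0;               /* surrogate block: an error here */
    if (v > 0x10FFFFUL)
        return 0;               /* beyond the Unicode range */
    return 1;
}
```

On a platform where wchar_t is 16 bits wide, this would limit input to
the Basic Multilingual Plane, which is the trade-off the proposal
accepts.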

If someone wants to put a generic UTF-16 processor on top of genx, that
would be fine.  I don't see the demand for supporting it at the input
end of genx because the UTF-16-centric languages like Java and C# have
decent XML-writing software already. -Tim
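
For illustration only, a UTF-16 layer of the kind mentioned above could
combine surrogate pairs into single character numbers before handing
them on.  The function utf16DecodeOne below is a hypothetical helper,
not a genx API, and the sketch assumes the code units arrive as
unsigned short values holding 16 bits each.

```c
#include <stddef.h>

/*
 * Decodes one character from the UTF-16 code units in[0..len).
 * On success stores the character number in *out and returns the
 * number of units consumed (1 or 2).  Returns 0 on empty input, an
 * unpaired surrogate, or a truncated pair.
 */
static size_t utf16DecodeOne(const unsigned short *in, size_t len,
                             unsigned long *out)
{
    unsigned long hi, lo;

    if (len == 0)
        return 0;
    hi = in[0];
    if (hi < 0xD800UL || hi > 0xDFFFUL) {
        *out = hi;              /* ordinary BMP character */
        return 1;
    }
    if (hi > 0xDBFFUL || len < 2)
        return 0;               /* lone low surrogate or truncated pair */
    lo = in[1];
    if (lo < 0xDC00UL || lo > 0xDFFFUL)
        return 0;               /* high surrogate without a low one */
    *out = 0x10000UL + ((hi - 0xD800UL) << 10) + (lo - 0xDC00UL);
    return 2;
}
```

A caller would loop over its buffer, advancing by the returned count and
passing each decoded value to whatever single-character entry point the
library ends up exposing.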

Follow-Ups:
- Re: [xml-dev] Genx
  - From: Marc-Antoine Parent <maparent@acm.org>
- Re: [xml-dev] Genx
  - From: jcowan@reutershealth.com
- Re: [xml-dev] Genx
  - From: Rich Salz <rsalz@datapower.com>

References:
- Genx
  - From: Tim Bray <tbray@textuality.com>
- Re: [xml-dev] Genx
  - From: Joe English <jenglish@flightlab.com>
- Re: [xml-dev] Genx
  - From: jcowan@reutershealth.com
