   Re: [xml-dev] Unicode and XML (was Re: [xml-dev] Remembering the origina


On Sun, Feb 16, 2003 at 09:54:32AM -0800, Tim Bray wrote:
> Well, XML1.1 is moving in that direction.  Even given that, I think that 
> XML 1.0's approach, with a big table right in the spec saying "here are 
> the legal characters", was probably correct; I (and I'm sure many other 
> programmers) ran a perl script over the spec to extract the char parsing 
> tables.   -Tim

 I used vi regexps directly, and recorded those in the C source file :-) !

 :1,$ s/\[#x\([0-9A-Z]*\)-#x\([0-9A-Z]*\)\]/     (((c) >= 0x\1) \&\& ((c) <= 0x\2)) ||/
 :1,$ s/#x\([0-9A-Z]*\)/     ((c) == 0x\1) ||/

 Of course, the result was later modified a bit to speed up the test.
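
 To illustrate what the two substitutions above produce, here is a
sketch in C based on the Char production of XML 1.0. The first macro is
roughly what comes out once the leftover "|" separators and the final
"||" are cleaned up by hand (note that :s without the g flag only
rewrites the first match on each line, so this assumes one alternative
per line). The second function shows one possible way such a test could
be sped up; it is an assumption for illustration, not necessarily the
change that was actually made. The names IS_XML_CHAR_GENERATED and
is_xml_char_fast are hypothetical.

```c
#include <stdbool.h>

/*
 * Hand-cleaned output of the two vi substitutions applied to:
 *   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
 *          | [#x10000-#x10FFFF]
 * Hypothetical macro name; (c) is expected to be a code point value.
 */
#define IS_XML_CHAR_GENERATED(c) (                      \
     ((c) == 0x9) ||                                    \
     ((c) == 0xA) ||                                    \
     ((c) == 0xD) ||                                    \
     (((c) >= 0x20) && ((c) <= 0xD7FF)) ||              \
     (((c) >= 0xE000) && ((c) <= 0xFFFD)) ||            \
     (((c) >= 0x10000) && ((c) <= 0x10FFFF)))

/*
 * One possible speed-up (an assumption, not the original change):
 * handle code points below 0x100 first, since they dominate most
 * documents, then fall through to the remaining ranges.
 */
static bool is_xml_char_fast(unsigned int c)
{
    if (c < 0x100) {
        /* Everything from 0x20 up is legal here; below that only
         * tab, line feed and carriage return are allowed. */
        return c >= 0x20 || c == 0x9 || c == 0xA || c == 0xD;
    }
    return (c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}
```

 Both forms accept the same set of code points; the second only changes
the order in which the comparisons are made.
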
In an attempt to turn a useless post into a useful one: has anyone
tried to implement the character normalization checking of XML 1.1?
 I looked at the ICU sample code a few months ago and it simply scared
me, mostly because of how complex and costly at runtime that code seemed.


Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

