OASIS Mailing List Archives

 



Re: Gag me with a blunt …



At 14:10 16/03/01, you wrote:

>Hmmm, I wonder if current perl includes U+0085 in what
>matches \s?  Etc.....


Eventually, \s is expected to match all Unicode separator characters
when searching UTF-8 strings (Camel III, p. 168).

Perl 5.6.0 doesn't include U+0085 in \s yet.


>Also, unlike (almost?) all the other XML errata, changing this
>would actively break pretty well every deployed piece of XML
>software in the world.  -Tim


This is not an error in the XML 1.0 spec, IMHO. Apparently, U+0085 was
assigned in Unicode 3.0, and XML 1.0 is based on Unicode 2.0.

XML 1.0 could not possibly comply in 1998 with a standard published in 2000.

The difficult question is whether every change in Unicode should trigger an
immediate XML revision. IBM thinks it should.

Unfortunately, if U+0085 is included as whitespace in the XML spec, it won't be
XML 1.0 anymore.
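
To make the mismatch concrete, here is a rough sketch in Python 3 (not Perl 5.6.0, whose \s behaves differently as noted above). It compares the XML 1.0 whitespace production S ::= (#x20 | #x9 | #xD | #xA)+ with a Unicode-aware \s. The helper names are mine, purely for illustration, and Python's re module only stands in for "some regex engine that already treats U+0085 as whitespace".

```python
import re

# XML 1.0 production [3]: S ::= (#x20 | #x9 | #xD | #xA)+
XML10_S = re.compile(r"[\x20\x09\x0D\x0A]+")

NEL = "\u0085"  # the character under discussion


def is_xml10_whitespace(text: str) -> bool:
    """True if text consists only of XML 1.0 whitespace (hypothetical helper)."""
    return XML10_S.fullmatch(text) is not None


def is_unicode_regex_space(text: str) -> bool:
    """True if Python's Unicode-aware \\s matches all of text (hypothetical helper)."""
    return re.fullmatch(r"\s+", text) is not None


if __name__ == "__main__":
    print("XML 1.0 S matches U+0085:  ", is_xml10_whitespace(NEL))
    print("Unicode \\s matches U+0085:", is_unicode_regex_space(NEL))
```

An XML 1.0 processor following the S production rejects U+0085 as whitespace, so accepting it would change which documents are well-formed.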