OASIS Mailing List Archives

 



Re: Gag me with a blunt …



At 01:46 PM 16/03/01 +0700, James Clark wrote:
>> Has anyone seen this thing?
>>   http://www.w3.org/TR/newline
>> I have a horrid suspicion that it's actually correct.
>
>I'm not convinced.  The XML spec says that Unicode character #x85 is not
>a whitespace character.  It appears from the Note that EBCDIC text
>files on IBM mainframes represent newline by a byte with code 0x85. The
>solution appears obvious to me: the EBCDIC encoding table used by the
>XML parser should map byte 0x85 to Unicode character 0xA.
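A rough Python sketch of the idea in the quoted paragraph: the decoding layer, not the XML grammar, turns the mainframe newline into U+000A. The function name and the default code page (cp037) are assumptions chosen for illustration; which byte a given EBCDIC code page decodes to U+0085 depends on that code page, so the sketch folds the character after decoding rather than patching a byte table.

```python
NEL = "\u0085"  # Unicode NEXT LINE, what an EBCDIC newline usually decodes to
LF = "\u000A"   # LINE FEED, which XML already treats as whitespace


def decode_mainframe_text(data: bytes, encoding: str = "cp037") -> str:
    """Decode EBCDIC bytes and fold NEL into LF before any XML parsing.

    Hypothetical helper: 'cp037' is only an assumed default code page.
    """
    text = data.decode(encoding)
    # The parser downstream never sees U+0085, so the XML spec and
    # deployed software stay untouched.
    return text.replace(NEL, LF)


if __name__ == "__main__":
    raw = "<doc>\u0085</doc>".encode("cp037")
    print(repr(decode_mainframe_text(raw)))
```

The trade-off is that a U+0085 that was meant literally is lost, which seems acceptable for text files whose only use of it is as a line terminator.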

This feels much better.  And upon reflection, the thought of 
XML files which have been through a mainframe starting to
percolate around the system with U+0085 embedded inside
start tags makes me nervous; I can see a lot of people
sitting in front of Windows and Unix boxes looking baffled
because their existing program broke in response to a
human-invisible stimulus.

Hmmm, I wonder if current Perl includes U+0085 in what
matches \s?  Etc...
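This does not answer the Perl question, but the same mismatch is easy to probe in any regex engine. A small Python 3 sketch, assuming str (Unicode) patterns, compares the XML 1.0 whitespace production S (#x20, #x9, #xD, #xA) with what the engine's \s class accepts; the helper names are made up for illustration.

```python
import re

# XML 1.0 production S: only these four characters count as whitespace.
XML_WHITESPACE = frozenset("\u0020\u0009\u000D\u000A")


def is_xml_whitespace(ch: str) -> bool:
    """True if ch is whitespace according to the XML 1.0 S production."""
    return ch in XML_WHITESPACE


def matches_regex_space(ch: str) -> bool:
    """True if the regex engine's \\s class accepts ch (str pattern)."""
    return re.fullmatch(r"\s", ch) is not None


if __name__ == "__main__":
    for ch in ("\u0020", "\u000A", "\u0085"):
        print(f"U+{ord(ch):04X}",
              "xml:", is_xml_whitespace(ch),
              "regex:", matches_regex_space(ch))
```

Any character where the two columns disagree is a candidate for the "human-invisible stimulus" problem: tools built on \s and tools built on the XML grammar would tokenize the same file differently.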

Also, unlike (almost?) all the other XML errata, changing this 
would actively break pretty well every deployed piece of XML
software in the world.  -Tim