> sense is it brain damaged for an EBCDIC editor to insert NEL at the end of a
> line? The XML 1.1 proposal to treat CR, LF, CRLF, and NEL equally on input
> and in interpretation, as Unicode prescribes, seems quite sensible to me.
> Nevertheless, I will very happily concede this whole point about XML 1.1 and
> Unicode NEL if someone can explain why mainframe/EBCDIC conventions used for
> 50 years are somehow less "standard" than Unix/DOS/Windows conventions used
> for 30 years.
But any file in an EBCDIC encoding has to carry an encoding declaration
anyway, and the parser, if it understands EBCDIC at all, has to map
everything to Unicode. So nothing stops the EBCDIC new-line character
being used as white space in EBCDIC-encoded XML 1.0 files, so long as
the parser maps NEL to #xA. It is unnatural to allow #x85 as white space
in XML since (currently at least) it isn't, as far as I know, an
end-of-line character in any ASCII/Unicode-based system.
So it is completely unlike the situation with #xA and #xD.
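
To make the mapping concrete, here is a minimal sketch of what such a
parser front end could do. It assumes Python's built-in cp037 codec
stands in for "EBCDIC" (other EBCDIC code pages lay out the control
characters differently), and the function name is made up for
illustration; it is not taken from any real parser.

```python
def decode_ebcdic_xml10(raw: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes and fold NEL into LF (hypothetical helper).

    Assumption: the codec maps the EBCDIC new-line byte to U+0085,
    as Python's cp037 does for byte 0x15.
    """
    text = raw.decode(codec)
    # After this step an XML 1.0 parser only ever sees #xA as the line end.
    return text.replace("\x85", "\n")


if __name__ == "__main__":
    sample = "<a>\x85</a>".encode("cp037")
    print(repr(decode_ebcdic_xml10(sample)))
```

With the mapping done at the decoding stage, nothing in the XML 1.0
white space rules needs to change.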
At least NEL was flagged as being considered in the original
requirements doc and takes only two bytes (C2 85) in UTF-8. The effects
of allowing #x2028 are worse, and no real justification has been offered
as to why anyone thought to add it.
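
For reference, the encoded sizes of the characters under discussion can
be checked directly. This is only a quick sketch; the helper name is
invented for the example.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes a single character takes in UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for name, ch in [("LF", "\n"), ("CR", "\r"),
                     ("NEL", "\x85"), ("LS", "\u2028")]:
        print(name, "U+%04X" % ord(ch), utf8_len(ch), "byte(s)")
```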