   RE: [xml-dev] XML and mainframes, yet again (was RE: [xml-dev] So me com


At 10:42 AM -0500 12/15/01, Champion, Mike wrote:

>Unicode seems to be the operative standard here and I just don't follow the
>argument that NEL is not a "standard" newline character.  The whole point of
>XML 1.1, I thought, was to defer decisions on character semantics to
>Unicode.  I don't claim to know much about this subject, but
>http://www.unicode.org/unicode/reports/tr13/ says quite plainly: "Even if
>you know which characters represents NLF on your particular platform, on
>input and in interpretation, treat CR, LF, CRLF, and NEL the same."  In what
>sense is it brain damaged for an EBCDIC editor to insert NEL at the end of a
>line?  The XML 1.1 proposal to treat CR, LF, CRLF, and NEL equally on input
>and in interpretation, as Unicode prescribes, seems quite sensible to me.
>Nevertheless, I will very happily concede this whole point about XML 1.1 and
>Unicode NEL if someone can explain why mainframe/EBCDIC conventions used for
>50 years are somehow less "standard" than Unix/DOS/Windows conventions used
>for 30 years.

It's a standard for the only true reason anything ever is a standard: 
because people are using it; and it's more standard because many, 
many more people are using it. In 2001 the number of people actually 
using mainframe editors that write NELs at the end of lines is 
minuscule and dropping. The number of people using text editors that 
don't handle NEL is still growing with the sales of personal 
computers of many types.

Show of hands: is anyone here actually using an IBM mainframe editor? 
I haven't worked with IBM mainframes in ten years, but even then it 
was unusual to actually use a terminal editor on the mainframe 
instead of writing on a PC and uploading the file. Mainframes have a 
lot of users still, but almost all of those users interface through 
the Web, through client software running on PCs, through mounted file 
systems that make the server look like a local disk, or through some 
other means that doesn't actually involve logging in and typing.

>Like most of us above the age of 35 or so, I have unpleasant memories of the
>days when the capital of the Evil Empire was Armonk NY rather than Redmond
>WA. If this were just something that IBM and only IBM had to fix, I wouldn't
>shed a single tear or write a line of sympathy.    If the argument is really
>about making the perpetrators pay for their brain-deadedness, consider this:
>even if IBM did "fix" their software, it would be a massive expense and
>hassle for their tens of thousands of mainframe customers to update their
>tools and software.  This must be several orders of magnitude more expensive
>for the mainframe world than for the relative handful of XML tool vendors to
>update, which they will be doing anyway if XML 1.1 comes out.

If it's a legacy system, then it isn't generating XML and it doesn't 
need to be fixed. But really, this is a speck vs. a log problem. To 
save these "tens of thousands of customers" from the expense of 
updating their tools and software, you're willing to impose much 
larger costs on the hundreds of thousands of other people working 
with XML today.

>I hope I'm missing something obvious here: I'm going to have a hard time
>explaining to folks that XML is standards-based, language-neutral,
>platform-neutral, and vendor-neutral ... but that some standards and
>platforms and vendors are more neutral than others.

Of course some standards and platforms and vendors are more neutral 
than others. Windows has hundreds of millions of users. Like it or 
not, it's far more standard than the Mac or Unix or VM/CMS. We can 
make XML work better on all platforms, but it's ridiculous to make it 
work better on IBM mainframes at the cost of making it work worse on 
Windows, the Mac, and Unix (which is what XML 1.1 proposes doing).

ASCII works everywhere except IBM mainframes. It's a lot more 
standard and more platform and vendor-neutral than EBCDIC. Latin-1 
works on Windows and Unix. It's less platform-neutral than ASCII but 
more platform and vendor-neutral than EBCDIC. It's more 
language-neutral than ASCII. Unicode is perhaps the most language- and 
platform-neutral set of all, but when it goes beyond specifying 
character code points to actually defining the semantics of 
characters, it begins to lose some of its platform-neutrality. This 
isn't just an issue for line breaks, but also for other parts of 
Unicode like right-to-left and left-to-right markers. It's one of the 
nastier parts of Unicode.

XML should work with the standard semantics for each character. The 
standard understanding of NEL is (in rough order of actual usage):

*  The three-dot ellipsis
*  A missing glyph box
*  Latin capital letter O with diaeresis
*  Many other characters

and somewhere way, way down the list, it's understood to mean a line break.
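
As a rough illustration of the list above, the following Python sketch decodes 
the single byte 0x85 under three single-byte encodings. The choice of codecs 
(cp1252 for Windows, mac_roman for the classic Mac, latin-1 for ISO 8859-1) is 
an assumption made for this example, as is the helper name describe; the point 
is only that the same byte yields three different characters.

```python
import unicodedata

RAW = b"\x85"  # the byte that ISO 8859-1 and Unicode read as NEL (U+0085)

def describe(encoding):
    """Decode the byte under one encoding and report the resulting character."""
    ch = RAW.decode(encoding)
    # Control characters have no Unicode name, so supply a placeholder.
    name = unicodedata.name(ch, "<control, no glyph>")
    return f"U+{ord(ch):04X} {name}"

# Assumed mapping of platforms to codecs, for illustration only.
readings = {enc: describe(enc) for enc in ("cp1252", "mac_roman", "latin-1")}

for enc, reading in readings.items():
    print(f"{enc:10} {reading}")
```

Under these assumptions the Windows code page gives the ellipsis, the Mac 
encoding gives O with diaeresis, and only Latin-1 gives the C1 control that 
most fonts have no glyph for.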

It doesn't matter what the Unicode spec or any other spec says should 
be done with this character. What matters is what software actually 
does with it. In the case of NEL, unlike, say, the letter A or a 
linefeed, there is quite a bit of disagreement about what software 
actually does with it, which is a very good reason not to use it at all, 
especially when we already have two perfectly good characters that 
everyone already agrees do mean a line break.
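
The disagreement can be seen even within one language's standard library. The 
sketch below is in Python, chosen only for illustration, and the sample string 
and variable names are made up for this example. It splits the same text three 
ways: str.splitlines() counts U+0085 as a line boundary, while a plain split on 
linefeed and the io module's universal-newlines mode do not.

```python
import io

# Hypothetical sample: one NEL-terminated line, one LF-terminated line.
text = "first\u0085second\nthird"

# str.splitlines() follows the Unicode view and breaks on U+0085.
unicode_aware = text.splitlines()

# Splitting on LF alone leaves the NEL embedded in the first line.
newline_only = text.split("\n")

# Universal-newlines mode recognizes only \n, \r, and \r\n.
file_style = io.StringIO(text, newline=None).readlines()

print(unicode_aware)
print(newline_only)
print(file_style)
```

The first call reports three lines and the other two report two, so a file 
that uses NEL as its line terminator would be read differently depending on 
which routine a given tool happens to call.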

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |


Copyright 2001 XML.org. This site is hosted by OASIS