Re: XML Blueberry
- From: Amy Lewis <firstname.lastname@example.org>
- To: email@example.com
- Date: Fri, 22 Jun 2001 07:22:56 -0400
On Fri, Jun 22, 2001 at 10:17:28AM +0100, David Carlisle wrote:
>] The major problem is that when
>] it appears in a tag, e.g.
>] <t a1="1"
>] (where there's a NEL after the "1") then the XML processor will kick
>] this out. -Tim
>Do any files really use NEL that are encoded in utf-8 or utf-16 (or
>utf-8 subsets like ascii that don't need to be declared)?
>If all the files using NEL start
><?xml version="1.0" encoding="some-flavour-of-ebcdic"?>
>Then can't NEL be mapped to #10 (or #13) in the non-normative support
>for the ebcdic related encodings? This wouldn't require any change to XML.
At a guess, this is a new software problem rather than an old software
problem. Remember that IBM is a *big* advocate of Java, across
platforms. There's an outstanding chance that System.out.println() and
System.getProperty("line.separator") supply NEL, a perfectly valid (and,
until now, probably uncontroversial) choice for line ending. It isn't
the NVT line-ending, but Unix broke that first, and Apple broke it
differently, for much the same reason that IBM settled on NEL--why use
two characters to represent one thing? (the network virtual terminal
uses CR/LF for backward compatibility with old teletypes, after all ...
all of these control characters are a pain).
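
A quick way to test that guess on any given JVM is to dump the code points of the platform line separator. This is only a sketch: the names LineSeparatorDump and describe are made up for illustration, and I haven't run it on a mainframe JVM, so what it prints there is still a guess.

```java
public class LineSeparatorDump {
    // Render each char of s as U+XXXX, separated by single spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The same property that println() uses to terminate a line.
        String sep = System.getProperty("line.separator");
        System.out.println("line.separator = " + describe(sep));
    }
}
```

If the guess is right, a JVM that uses NEL would report U+0085 here, where Unix reports U+000A and the NVT convention would be U+000D U+000A.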
Time for the shocker. Why not just remove the concept of line endings
from XML? Whitespace == whitespace (per the Unicode definition, and let
something like Java's Character.isWhitespace() actually work). Rather
than focussing exclusively on the IBM choice of NEL ... hmmm. Well,
no, I suppose not. CR, LF, NEL, and TAB aren't space characters per
Unicode (they are control characters, general category Cc); might be
able to do it by defining the S production to be
Unicode space + Unicode layout control, although that may be a slightly
wider net to cast.
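
To make the comparison concrete, here is a small sketch that sets the four characters of the XML 1.0 S production (#x20, #x9, #xD, #xA) next to NEL and asks Java how it classifies each one. The class name WhitespaceCheck and the helper isXmlS are hypothetical. As far as I can tell from the Character documentation, isWhitespace() accepts TAB, LF and CR but not NEL, and isSpaceChar() rejects all four control characters, so neither method casts the wider net on its own.

```java
public class WhitespaceCheck {
    // XML 1.0 production S allows only #x20, #x9, #xD and #xA.
    static boolean isXmlS(char c) {
        return c == 0x20 || c == 0x9 || c == 0xD || c == 0xA;
    }

    public static void main(String[] args) {
        // SPACE, TAB, CR, LF, and NEL (0x85).
        char[] probes = {' ', '\t', '\r', '\n', (char) 0x85};
        for (char c : probes) {
            System.out.println(String.format(
                "U+%04X xmlS=%b isWhitespace=%b isSpaceChar=%b control=%b",
                (int) c,
                isXmlS(c),
                Character.isWhitespace(c),
                Character.isSpaceChar(c),
                Character.getType(c) == Character.CONTROL));
        }
    }
}
```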
Note that I'm not a mainframe person, so I'm only guessing that the NEL
issue is new-software-related. Seems reasonable, all things considered.
Amelia A. Lewis firstname.lastname@example.org email@example.com