Elliotte Rusty Harold scripsit:
> ASCII works everywhere except IBM mainframes. It's a lot more
> standard and more platform and vendor-neutral than EBCDIC.
EBCDIC vs. ASCII is perfectly irrelevant to this discussion: mainframes
can work with ASCII files as well as EBCDIC files, but in either
case the NEL character (encoded as hex 85 in ASCII-based encodings such
as ISO 8859-1, or hex 15 in EBCDIC) is the native line delimiter.
> XML should work with the standard semantics for each character. The
> standard understanding of NEL is (in rough order of actual usage):
> * The three-dot ellipsis
> * A missing glyph box
> * Latin capital letter O with diaeresis
> * Many other characters
Not at all. The *character* #x85 means either a line break or nothing at
all. The *hex byte* 85 has the multiple meanings you mention, because
it encodes the character #x2026 in your first case, and the character
#xD6 in your third case.
In order to understand the issues, it's *critical* not to mix up
characters and bytes.
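
To make the distinction concrete, here is a minimal sketch in Python.
The choice of codecs is an assumption made for illustration: the
discussion above does not name the encodings behind the "ellipsis" and
"O with diaeresis" readings, so Windows-1252 and MacRoman are used here
as likely candidates, along with Python's cp037 codec as one EBCDIC
code page. The helper name decode_single_byte is hypothetical.

```python
# Illustrative sketch: one byte value, several characters, depending on
# which encoding is assumed. Codec names are those of Python's stdlib.

def decode_single_byte(byte_value, encoding):
    """Decode one byte under the given encoding; return its code point."""
    return ord(bytes([byte_value]).decode(encoding))

# Byte 0x85 read under three ASCII-based encodings (assumed examples).
BYTE_85_MEANINGS = {
    "latin-1": decode_single_byte(0x85, "latin-1"),      # character #x85 (NEL)
    "cp1252": decode_single_byte(0x85, "cp1252"),        # character #x2026
    "mac_roman": decode_single_byte(0x85, "mac_roman"),  # character #xD6
}

# Byte 0x15 read under EBCDIC code page 037 (assumed example code page).
NEL_FROM_EBCDIC = decode_single_byte(0x15, "cp037")

if __name__ == "__main__":
    for name, code_point in BYTE_85_MEANINGS.items():
        print(f"byte 0x85 as {name}: U+{code_point:04X}")
    print(f"byte 0x15 as cp037: U+{NEL_FROM_EBCDIC:04X}")
```

The byte is the same in the first three cases; only the character it
encodes differs. Conversely, the character #x85 is the same whether it
arrives as byte 85 or as byte 15.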
John Cowan http://www.ccil.org/~cowan email@example.com
Please leave your values | Check your assumptions. In fact,
at the front desk. | check your assumptions at the door.
--sign in Paris hotel | --Miles Vorkosigan