Re: Gag me with a blunt …
- From: John Cowan <cowan@mercury.ccil.org>
- To: Susan Malaika <malaika@us.ibm.com>
- Date: Mon, 19 Mar 2001 08:09:12 -0500 (EST)
Susan Malaika scripsit:
> OS390 [NEL] is like Apple Macintosh [CR] which is like UNIX [LF] so
> why not have the same Unicode representation for all 3? :-)
Round-trippability. Unix and MS-DOS both use ASCII, but have different
line-end conventions. We do not want to have separate conversion
tables (to and from Unicode) for different operating systems' notion
of plain text. (It is bad enough that we have to handle different
charsets and code pages, but at least those can be OS-independent.)
So there is a Unicode equivalent for every control character
(33 in C0, 32 in C1, 64 in EBCDIC), and line-end differences
have to be adjusted above that level.
I.e. at the XML level.
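
A minimal sketch of that layering, in Python rather than anything from the
original thread. It assumes Python's built-in "ascii" and "cp037" (an EBCDIC
code page) codecs, which as far as I know map EBCDIC NEL (0x15) to U+0085.
The helper normalize_line_ends and the sample data are hypothetical names
for illustration. The charset step stays one-to-one, so it round-trips, and
line ends are folded only in a separate step above it:

```python
def normalize_line_ends(text: str) -> str:
    """Fold CR LF, lone CR, and NEL (U+0085) into LF, above the charset layer."""
    return text.replace("\r\n", "\n").replace("\r", "\n").replace("\x85", "\n")


# Hypothetical sample data: the same two lines under three line-end conventions.
samples = {
    "ascii-unix": ("ascii", b"one\ntwo\n"),
    "ascii-dos": ("ascii", b"one\r\ntwo\r\n"),
    "ebcdic": ("cp037", "one\x85two\x85".encode("cp037")),
}

for name, (codec, raw) in samples.items():
    text = raw.decode(codec)          # one table per charset, no OS knowledge
    assert text.encode(codec) == raw  # round trip restores the original bytes
    print(name, repr(normalize_line_ends(text)))
```

Folding the line ends inside the decoder instead would make the three inputs
indistinguishable after decoding, and the original bytes could no longer be
recovered by encoding again.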
--
John Cowan cowan@ccil.org
One art/there is/no less/no more/All things/to do/with sparks/galore
--Douglas Hofstadter