Michael Kay scripsit:
> You could do it without changing the definition of well-formedness by
> saying that the set of characters considered to be whitespace, and
> normalized as such, is a property of the encoding.
Fine and dandy for EBCDIC, but not so good for Latin-1 as used on mainframes,
where 0x85 = NEL.
Actually, not so good for EBCDIC either, because it means that each of the
dozens of EBCDIC code pages has to exist in two flavors, a native flavor
where 0x15 encodes U+0085, and an XML flavor where 0x15 encodes U+000A.
This is more or less what FTP software has to do, and it's ugly.
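
To make the "two flavors" point concrete, here is a minimal sketch in Python. It assumes Python's bundled cp037 codec as a stand-in for "an EBCDIC code page" and relies on that codec decoding 0x15 to U+0085 and 0x25 to U+000A. The names decode_native and decode_xml_flavor are hypothetical, invented for this illustration. They are not part of any XML processor or of the proposal being discussed.

```python
# Sketch only: one EBCDIC code page (cp037, chosen as an example) in two flavors.
NATIVE_CODEC = "cp037"

# In the hypothetical "XML flavor", byte 0x15 is treated like byte 0x25,
# which cp037 decodes to U+000A (LINE FEED).
_XML_FLAVOR_TABLE = bytes.maketrans(b"\x15", b"\x25")


def decode_native(data: bytes) -> str:
    """Native flavor: 0x15 decodes to U+0085 (NEL)."""
    return data.decode(NATIVE_CODEC)


def decode_xml_flavor(data: bytes) -> str:
    """Hypothetical XML flavor: 0x15 decodes to U+000A instead."""
    return data.translate(_XML_FLAVOR_TABLE).decode(NATIVE_CODEC)


# 0x81 and 0x82 are "a" and "b" in cp037; 0x15 is the mainframe line end.
sample = b"\x81\x15\x82"
native_text = decode_native(sample)
xml_text = decode_xml_flavor(sample)

# The Latin-1 case from above: byte 0x85 is NEL there as well.
latin1_text = b"\x85".decode("latin-1")
```

The same three bytes yield two different character sequences depending on which flavor is chosen, and the duplication would have to be repeated for every EBCDIC code page.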
--
John Cowan <jcowan@reutershealth.com>
http://www.reutershealth.com http://www.ccil.org/~cowan
.e'osai ko sarji la lojban.
Please support Lojban! http://www.lojban.org