David Brownell wrote:
> Not if you go by what most systems do with those codes; what I've
> seen in practice is that those codes will map to U+0080..U+009F.
In fact, no. Most systems are Windows systems, and map byte 0x80
to U+20AC (the euro sign), and so on.
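
A minimal sketch of the two mappings for byte 0x80, assuming Python's built-in "cp1252" and "latin-1" codecs as stand-ins for "what Windows does" and "what a zero-extending system does". The variable names are illustrative only.

```python
# One byte, decoded two ways.
raw = bytes([0x80])

# Windows code page 1252 assigns a graphic character to 0x80.
as_cp1252 = raw.decode("cp1252")

# Python's latin-1 codec zero-extends the byte to a code point.
as_latin1 = raw.decode("latin-1")

print(f"cp1252:  U+{ord(as_cp1252):04X}")
print(f"latin-1: U+{ord(as_latin1):04X}")
```

Only 0x80 is used here because a few bytes in the 0x80..0x9F range are undefined in Python's cp1252 table and raise an error on decode.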
> Some ISO-8859-1 spec addendum would be interesting, since
> that's where those were defined (prior to importing to Unicode).
8859-x no more defines the control characters than Unicode does.
The relevant spec is ISO 6429:1992, ISO-IR-77. You can see a slightly
older version of this at
http://www.itscj.ipsj.or.jp/ISO-IR/077.pdf
In principle, one of the other half-dozen ISO-IR C1 control character
sets might be in use in Unicode plain text, but they are all even more
special-purpose than 6429.
> The rule of thumb being that one adds a high order zero byte to
> the ISO-8859-1 code points ("bytes") and gets Unicode.
True as far as it goes, but that does not mean that U+0080 through
U+009F are in ISO-8859-1.
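
A rough illustration of the rule of thumb, again assuming Python's "latin-1" codec. The codec zero-extends all 256 byte values, including 0x80..0x9F; that is a choice made by the codec, not evidence about what the 8859-1 graphic set contains. The second check uses the standard unicodedata module to show that Unicode classes U+0080..U+009F as control characters.

```python
import unicodedata

# Zero-extension: byte value b becomes the code point with the same number.
all_bytes = bytes(range(256))
zero_extended = "".join(chr(b) for b in all_bytes)
same_as_codec = all_bytes.decode("latin-1") == zero_extended

# Unicode general categories of the code points U+0080..U+009F.
c1_categories = {unicodedata.category(chr(cp)) for cp in range(0x80, 0xA0)}

print(same_as_codec)
print(c1_categories)
```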
--
Not to perambulate || John Cowan <jcowan@reutershealth.com>
the corridors || http://www.reutershealth.com
during the hours of repose || http://www.ccil.org/~cowan
in the boots of ascension. \\ Sign in Austrian ski-resort hotel