Alessandro Triglia scripsit:
> Therefore at the very heart of your proposal is a re-interpretation trick of
> bit patterns between UTF-8 on one side and UTF-8+names on the other side.
Absolutely. I didn't say it wasn't a hack; it is a hack. I merely said
that it was a hack useful not only to people using 8-bit
character sets. Even if you are doing Ethiopian, and Unicode is the
only coded character set you'll ever have, names are still Good Things.
> Indeed, if one uses UTF-8+names just as an encoding of Unicode (with no
> re-interpretation trick), no human user will ever see those things.
> All that humans will see is some displayable form of the NON-BREAK SPACE
> character, which happened to be encoded as 0x26 0x6E 0x62 0x73 0x70 0x3B
> rather than as 0xNN1 0xNN2 (the two bit patterns being equivalent).
Absolutely. Which is why I'm not worried about how to serialize internal
Unicode as UTF-8+names; no program but an editor (which always has
special considerations of how faithful it needs to be to the input)
has to concern itself with that.
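
To make the reading direction concrete, here is a minimal Python sketch, not a definitive implementation. It assumes a UTF-8+names reader first decodes the bytes as ordinary UTF-8 and then maps each `&name;` to its character. The function name `decode_utf8_names`, the tiny `NAMES` table, and the choice to leave unknown names untouched are all hypothetical illustrations, not part of any specification.

```python
import re

# Hypothetical, deliberately tiny name table (assumption for illustration);
# a real table would be far larger.
NAMES = {"nbsp": "\u00A0", "eacute": "\u00E9"}

# Matches references of the form &name; (assumed syntax).
_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_names(data: bytes) -> str:
    """Decode bytes as plain UTF-8, then replace each known &name; with its character."""
    text = data.decode("utf-8")

    def replace(match: re.Match) -> str:
        # Assumption: unknown names are passed through unchanged.
        return NAMES.get(match.group(1), match.group(0))

    return _REF.sub(replace, text)


# The six bytes quoted above are just the ASCII spelling of "&nbsp;".
named = bytes([0x26, 0x6E, 0x62, 0x73, 0x70, 0x3B])

# Plain UTF-8 encodes U+00A0 NO-BREAK SPACE in two bytes.
plain = "\u00A0".encode("utf-8")

# Both byte sequences yield the same character under this sketch.
same = decode_utf8_names(named) == decode_utf8_names(plain)
```

Only the decoding direction is sketched here; nothing in it decides when a program writing Unicode back out should emit a name rather than the plain UTF-8 bytes.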
--
John Cowan <jcowan@reutershealth.com> http://www.ccil.org/~cowan
Do NOT stray from the path! --Gandalf