Alessandro Triglia scripsit:
> Therefore at the very heart of your proposal is a re-interpretation trick of
> bit patterns between UTF-8 on one side and UTF-8+names on the other side.
Absolutely. I didn't say it wasn't a hack; it is a hack. I merely said
that it was a hack useful not only to people using 8-bit character
sets. Even if you are doing Ethiopian, and Unicode is the only coded
character set you'll ever have, names are still Good Things.
> Indeed, if one uses UTF-8+names just as an encoding of Unicode (with no
> re-interpretation trick), no human user will ever see those things.
> All that humans will see is some displayable form of the NON-BREAK SPACE
> character, which happened to be encoded as 0x26 0x6E 0x62 0x73 0x70 0x3B
> rather than as 0xNN1 0xNN2 (the two bit patterns being equivalent).
Absolutely. Which is why I'm not worried about how to serialize internal
Unicode as UTF-8+names; no program but an editor (which always has
special considerations of how faithful it needs to be to the input)
has to concern itself with that.
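
To make the decoding direction concrete, here is a rough Python sketch, not a definition of UTF-8+names. The two-entry name table, the function name, and the choice to pass unknown names through untouched are all assumptions made for illustration; a real decoder would carry whatever fixed name list the encoding specifies.

```python
import re

# Hypothetical, deliberately tiny name table (assumption for illustration).
NAMES = {
    "nbsp": "\u00A0",    # NO-BREAK SPACE
    "eacute": "\u00E9",  # LATIN SMALL LETTER E WITH ACUTE
}

# Matches the byte pattern "&name;" once the input has been decoded as UTF-8.
_NAME_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_plus_names(data: bytes) -> str:
    """Decode bytes as UTF-8, then replace known &name; sequences.

    Unknown names are left as-is (an assumption, not part of any spec).
    """
    text = data.decode("utf-8")

    def expand(match: re.Match) -> str:
        return NAMES.get(match.group(1), match.group(0))

    return _NAME_REF.sub(expand, text)


# The six bytes quoted above (0x26 0x6E 0x62 0x73 0x70 0x3B) spell "&nbsp;".
named = bytes([0x26, 0x6E, 0x62, 0x73, 0x70, 0x3B])
# The ordinary UTF-8 encoding of the same character.
plain = "\u00A0".encode("utf-8")

# Both byte sequences decode to the same single character.
assert decode_utf8_plus_names(named) == decode_utf8_plus_names(plain)
```

Only this direction is sketched because, as said above, going from internal Unicode back to names is the editor's problem and nobody else's.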
Do NOT stray from the path! John Cowan <email@example.com>