Alessandro Triglia wrote:
> Indeed, if one uses UTF-8+names just as an encoding of Unicode (with no
> re-interpretation trick), no human user will ever see those things.
> All that humans will see is some displayable form of the NON-BREAK SPACE
> character, which happened to be encoded as 0x26 0x6E 0x62 0x73 0x70 0x3B
> rather than as 0xNN1 0xNN2 (the two bit patterns being equivalent).
I had to read this a couple of times but now I get it. For most
encodings of Unicode I know of, if you're editing a text file, any
characters that can be displayed are displayed as themselves, not as the
underlying UTF-8 bit patterns or whatever. Characters that *can't* be
displayed show up as diamonds or squiggles or boxes. +names is
different in that sometimes a human might want to work with the encoding
not the actual Unicode characters, purely because &Conint; might look
better in your file than the surface integral (U+222F) that your screen
can't display. On the other hand, since basically every screen in the
world can now display ü, you'd rather see that than &uuml;.
Bottom line: in some applications this would be convenient. Others not.
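
To make the "equivalent bit patterns" point concrete, here is a rough
Python sketch of what a +names decoder might do. It is only an
illustration, not the proposal itself: the function name
decode_utf8_plus_names is made up, Python's HTML5 name table stands in
for the ISO entity sets, and leaving the five XML predefined names
untouched is an assumption of the sketch.

```python
import re
from html.entities import html5  # stand-in name table; maps "nbsp;" -> "\xa0", etc.

# Assumption of this sketch: leave XML's predefined names for the XML parser.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_NAME = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_plus_names(data: bytes) -> str:
    """Decode bytes as UTF-8, then expand &name; sequences to characters."""
    text = data.decode("utf-8")

    def expand(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        # Unknown names are passed through unchanged rather than rejected.
        return html5.get(name + ";", match.group(0))

    return _NAME.sub(expand, text)


# 0x26 0x6E 0x62 0x73 0x70 0x3B is the ASCII spelling "&nbsp;";
# 0xC2 0xA0 is the plain UTF-8 encoding of U+00A0 NO-BREAK SPACE.
named = bytes([0x26, 0x6E, 0x62, 0x73, 0x70, 0x3B])
plain = bytes([0xC2, 0xA0])
same = decode_utf8_plus_names(named) == decode_utf8_plus_names(plain)
```

In this sketch both byte sequences decode to the same single character,
so nothing downstream of the decoder can tell which spelling was in the
file. Whether a human editing the raw file sees the name or the
character is then up to the editor.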
> I am not actually proposing to add this macro functionality to Unicode, but
> I am saying that there are two places where the initial problem can be
> addressed: either at the XML level or at the Unicode level (which involves
> the displayable form). Not at the encoding level.
Bear in mind that the initial problem was the ongoing clamor from
communities of people who really want to use the ISO entity sets but
don't want to use DTDs. So far, the standards community has failed to
come up with an option that is attractive to them. +names is just a
trial balloon. My intuition disagrees with yours; the encoding level
feels like an appropriate approach to this problem.
-Tim