Tim Bray wrote:
> James Clark wrote:
>
> > But with +names you don't want to work at the encoding level. For
> > example, if you have a ü in your text file, that will be two bytes in
> > UTF-8+names, but you would want to work with it as a single character.
> > To edit a UTF-8+names text file, you need to make your text editor
> > treat it as if it were encoded in UTF-8. In other words, to make
> > things work you have to edit it in the wrong encoding. This will be
> > extremely confusing to users.
>
> I'm not sure I agree. In UTF-8+names, ü could show up either as itself
> or as &uuml;
The point is how you make it show up as &uuml;.
You normally don't see the bits and bytes of a character's encoding on the screen; you see some display form of the encoded character. & u u m l ; is the UTF-8 reinterpretation of a UTF-8+names bit pattern. If this reinterpretation doesn't take place, the human user will not see & u u m l ; on her screen; she will see some display form of the LATIN SMALL LETTER U WITH DIAERESIS character.
Editors would have to be modified, and in ways that affect their processing model, to be able to handle this.
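
To make the two readings concrete, here is a rough Python sketch. It assumes, purely for illustration, that a UTF-8+names decoder is an ordinary UTF-8 decoder followed by expansion of `&name;` sequences; the function name `decode_utf8_names` is hypothetical, and Python's HTML 4 entity table stands in for whatever name set the proposal would actually define.

```python
import re
from html.entities import name2codepoint  # HTML 4 entity names -> code points

# Assumption: a "name" is an ASCII letter followed by ASCII letters/digits,
# written between '&' and ';'. This is an illustrative stand-in only.
_NAME = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_names(data: bytes) -> str:
    """Hypothetical UTF-8+names decoder: UTF-8 decode, then expand names."""
    text = data.decode("utf-8")

    def expand(match: re.Match) -> str:
        codepoint = name2codepoint.get(match.group(1))
        # Unknown names are left untouched rather than guessed at.
        return chr(codepoint) if codepoint is not None else match.group(0)

    return _NAME.sub(expand, text)


raw = b"f&uuml;r"                    # the same bytes on disk in both cases

as_utf8 = raw.decode("utf-8")        # what a plain UTF-8 editor works with
as_names = decode_utf8_names(raw)    # what a UTF-8+names-aware editor would see

print(len(as_utf8), as_utf8)
print(len(as_names), as_names)
```

Under these assumptions the plain UTF-8 reading keeps the six characters of the name visible, while the +names reading collapses them into a single character. A user can therefore only type or see the name form if the editor decodes the file as plain UTF-8.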
Alessandro