xml-dev - Re: [xml-dev] UTF-8+names

To: Alessandro Triglia <sandro@mclink.it>
Subject: Re: [xml-dev] UTF-8+names
From: John Cowan <cowan@mercury.ccil.org>
Date: Sun, 19 Oct 2003 14:30:12 -0400
Cc: xml-dev@lists.xml.org
In-reply-to: <000c01c3966d$7fbb9f70$42a7c044@aldebaran>
References: <20031019172255.GJ20059@mercury.ccil.org> <000c01c3966d$7fbb9f70$42a7c044@aldebaran>
User-agent: Mutt/1.3.28i

Alessandro Triglia scripsit:

> Therefore at the very heart of your proposal is a re-interpretation trick of
> bit patterns between UTF-8 on one side and UTF-8+names on the other side.

Absolutely.  I didn't say it wasn't a hack; it is a hack.  I merely said
that it was a hack that wasn't only useful for people using 8-bit
character sets.  Even if you are doing Ethiopic, and Unicode is the
only coded character set you'll ever have, names are still Good Things.

> Indeed, if one uses UTF-8+names just as an encoding of Unicode (with no
> re-interpretation trick), no human user will ever see those  &nbsp;  things.
> All that humans will see is some displayable form of the  NON-BREAK SPACE
> character, which happened to be encoded as  0x26 0x6E 0x62 0x73 0x70 0x3B
> rather than as  0xNN1 0xNN2 (the two bit patterns being equivalent).  

Absolutely.  Which is why I'm not worried about how to serialize internal
Unicode as UTF-8+names; no program but an editor (which always has
special considerations of how faithful it needs to be to the input)
has to concern itself with that.
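
To make the decode-only direction concrete, here is a minimal sketch in
Python.  It is not taken from any UTF-8+names specification.  It assumes
that the names are the HTML entity names found in Python's standard
html.entities table, and that the XML predefined entities (amp, lt, gt,
quot, apos) are left for the XML parser to handle.  The function name
decode_utf8_plus_names is hypothetical.

```python
import re
from html.entities import name2codepoint

# Assumption of this sketch: references to the XML predefined entities
# are not expanded here, so that markup still reaches the XML parser intact.
_XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# A named reference: "&", a name, then ";".
_NAME_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_plus_names(data: bytes) -> str:
    """Decode bytes as UTF-8, then replace known &name; sequences."""
    text = data.decode("utf-8")

    def expand(match: re.Match) -> str:
        name = match.group(1)
        if name in _XML_PREDEFINED:
            return match.group(0)  # leave for the XML parser
        codepoint = name2codepoint.get(name)
        if codepoint is None:
            return match.group(0)  # unknown name: pass through unchanged
        return chr(codepoint)

    return _NAME_REF.sub(expand, text)


# The six bytes quoted above spell "&nbsp;" in ASCII.
named = bytes([0x26, 0x6E, 0x62, 0x73, 0x70, 0x3B])
# The ordinary UTF-8 encoding of U+00A0 NO-BREAK SPACE (bytes C2 A0).
plain = "\u00a0".encode("utf-8")

# Both byte sequences decode to the same single character.
assert decode_utf8_plus_names(named) == decode_utf8_plus_names(plain) == "\u00a0"
```

Only a decoder is sketched here.  Going the other way would mean choosing
which characters to write as names, which is the question only an editor
has to answer.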

-- 
Do NOT stray from the path!             John Cowan <jcowan@reutershealth.com>
        --Gandalf                       http://www.ccil.org/~cowan
