xml-dev - Re: [xml-dev] UTF-8+names

To: James Clark <jjc@jclark.com>
Subject: Re: [xml-dev] UTF-8+names
From: Tim Bray <tbray@textuality.com>
Date: Sun, 19 Oct 2003 21:02:34 -0700
Cc: xml-dev@lists.xml.org
In-reply-to: <1066620398.21207.112.camel@rambutan.bkk.thaiopensource.com>
References: <000c01c3966d$7fbb9f70$42a7c044@aldebaran> <3F92E11F.4050409@textuality.com> <1066620398.21207.112.camel@rambutan.bkk.thaiopensource.com>
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.5) Gecko/20031007

James Clark wrote:

> But with +names you don't want to work at the encoding level.  For
> example, if you have a ü in your text file, that will be two bytes in
> UTF-8+names, but you would want to work with it as a single character.
> To edit a UTF-8+names text file, you need to make your text editor treat
> it as if it were encoded in UTF-8. In other words, to make things work
> you have to edit it in the wrong encoding.  This will be extremely
> confusing to users.

I'm not sure I agree.  In UTF-8+names, ü could show up either as itself
or as &uuml;. If you have gear that can handle it as itself, just put it
in that way and you're fine.  For things like &Conint;, which you're
very unlikely to be able to edit in place, leave it as &Conint; in the
edit window.
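
To make the "either as itself or as a name" point concrete, here is a rough Python sketch of what a +names-style decoder and encoder might look like. It is not taken from any spec text: the function names, the `keep_literal` parameter, and the two-entry name table are all hypothetical, and it assumes that unknown names (including the predefined XML ones such as &amp;) are simply passed through untouched.

```python
import re

# Hypothetical, deliberately tiny name table; a real one would carry
# the full entity sets under discussion.
NAMES = {
    "uuml": "\u00fc",    # LATIN SMALL LETTER U WITH DIAERESIS
    "Conint": "\u222f",  # SURFACE INTEGRAL (double contour integral)
}

_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_names(data: bytes) -> str:
    """Decode UTF-8, then expand any &name; found in the table.

    Names that are not in the table are left exactly as written.
    """
    text = data.decode("utf-8")
    return _REF.sub(lambda m: NAMES.get(m.group(1), m.group(0)), text)


def encode_utf8_names(text: str, keep_literal: frozenset = frozenset()) -> bytes:
    """Write named characters as &name; unless listed in keep_literal.

    keep_literal models "the gear can handle it as itself": those
    characters are emitted as ordinary UTF-8.
    """
    reverse = {ch: name for name, ch in NAMES.items()}
    out = "".join(
        ch if ch in keep_literal or ch not in reverse else f"&{reverse[ch]};"
        for ch in text
    )
    return out.encode("utf-8")
```

With this sketch, an editor that can type ü but not the integral sign would call encode_utf8_names(text, keep_literal=frozenset("ü")), so ü stays a literal character in the file while the integral is written out as &Conint;. Both spellings of ü decode to the same character.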

> 1. General publishing. This community wants the HTML entity sets.  I
> think the problem here is a software/education problem which is
> decreasing all the time.  Almost all modern systems have fonts that can
> display almost all the characters in these entity sets. The desktop
> environments that I'm familiar with all offer a character map applet
> which is sufficient (albeit not very efficient) for entry of characters
> for which you have fonts. The quality of Unicode support offered by standard
> text editors is improving all the time.

I basically agree.  Lots of people out there don't; last time I checked,
the HTML Working Group was still officially outraged that the schema
meta-designers of the world (e.g. you :)) were steadfastly refusing to
go near entities.  But given that I recently figured out how to do this
stuff in Emacs, I suspect the days when everyone will have the necessary
gear are coming fast.

> 2. Math.  I think math users have special requirements.  

I know the idea of using elements has been proposed a couple of times,
but I don't know how it was received.  It's certainly the case that
I've had many arguments where I've been telling people to get past
character entities and get with the Unicode program, and the argument
always ends up with "what about the poor Math people?"

My considered opinion on this has for a long time been that the best
thing to do about the entity problem is *nothing*: keep the pressure on
the technology to get smarter.  Unfortunately, my saying that hasn't
made the people who say they need this stop saying so.  The +names
proposal is the only thing I could think of that had the remotest hope
of giving them what they want and actually getting implemented.  I
suspect that if +names doesn't pick up traction, what they're going to
get is nothing.  Which may not be a bad outcome.

-- 
Cheers, Tim Bray (http://www.tbray.org/ongoing/)
