- To: "'xml-dev'" <xml-dev@lists.xml.org>
- Subject: RE: [xml-dev] Character Entities: An XML Core WG View
- From: "Jelks Cabaniss" <jelks@jelks.nu>
- Date: Fri, 1 Nov 2002 01:31:50 -0500
- Importance: Normal
- In-reply-to: <3DC206EA.9050408@textuality.com>
Tim Bray wrote:
> I see your point, but there are all these people out there
> who keep saying they want a way to give funny characters
> human-readable names and don't want to use elements because
> they think structure and content are different. No matter
> how many times they are told that they shouldn't really need
> the names and that if they did they should use elements,
> they keep refusing to take our word for this, so we're gonna
> have to do something. Sigh.
> The WG's approach does at least have the virtue that it works with
> existing software.
Indeed.
> I despise entities in general more and more with each passing year,
> but it's pretty clearly character entities that are the bit that
> just won't go away; I seem to recall weeping with James Clark over
> this into our 18th or 19th glasses of red wine at the last XML
> conference.
Because they don't round trip after parsing? Or because of having to
expand the entities before you can use them?
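To make the round-trip concern concrete, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree as the parser. The document string and the helper name round_trip are made up for illustration; other parsers and serializers may behave differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: declares &Tse; in the internal subset and uses it once.
SOURCE = '<!DOCTYPE doc [<!ENTITY Tse "&#x426;">]><doc>&Tse;</doc>'


def round_trip(xml_text: str) -> str:
    """Parse the text, then serialize the tree back to a Unicode string."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


result = round_trip(SOURCE)
print(result)
```

With this parser the reference is expanded during parsing, so the serialized output carries the literal character and neither the name Tse nor the DOCTYPE that declared it.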
> I know I don't when I'm in rdhead or oweenie mode - &#xBABE; does the
> job fine -
It does, but &#xnnn;'s scattered throughout a document are hard to
proof. That's the only reason people want names (and not as
elements!:).
> but people who want to edit XML by hand really want to be able to use
> &euro; and the like.
Yes. In fifteen or so years, when purely ASCII/ANSI/ISO-* editors are
history, I doubt if anyone will care, but I don't see the point in axing
the internal subset at this point in time. I'm not sure I see the point
of axing it in the future either.
> Once again, sigh. I haven't seen a better idea, but one would be
> welcome. Hmm, has anyone suggested
>
> &#uCYRILLIC-CAPITAL-LETTER-TSE; (aka Ц) or
> &#uPARTIAL-DIFFERENTIAL; (aka ∂)
Again, why exactly -- except for "round-tripping" -- is a huge built-in
Unicode character reference database (that changes with every rev of
Unicode) better than having the convenience of being able to declare
&Tse; and the few others you might want in the internal subset?
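The two approaches can be compared side by side in a rough sketch. It assumes Python's unicodedata module as a stand-in for the "built-in database" and reads the hyphens in the proposed &#uNAME; syntax as spaces; both function names are hypothetical, and nothing here is an implementation of any actual proposal.

```python
import unicodedata
import xml.etree.ElementTree as ET


def by_unicode_name(hyphenated: str) -> str:
    # Assumption: hyphens in the proposed syntax stand in for spaces.
    # The lookup depends on the Unicode version bundled with the runtime.
    return unicodedata.lookup(hyphenated.replace("-", " "))


def by_internal_subset(name: str, codepoint_hex: str) -> str:
    # Declare one short name in the internal subset and let the parser expand it.
    doc = f'<!DOCTYPE d [<!ENTITY {name} "&#x{codepoint_hex};">]><d>&{name};</d>'
    return ET.fromstring(doc).text


print(unicodedata.unidata_version)
print(by_unicode_name("CYRILLIC-CAPITAL-LETTER-TSE") == by_internal_subset("Tse", "426"))
```

The first function only works if the processor ships a name table matching the Unicode revision the author had in mind; the second needs nothing beyond what an XML 1.0 parser that reads the internal subset already does.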
/Jelks