xml-dev - Re: [xml-dev] Character Entities: An XML Core WG View

To: davidc@nag.co.uk (David Carlisle)
Subject: Re: [xml-dev] Character Entities: An XML Core WG View
From: John Cowan <jcowan@reutershealth.com>
Date: Fri, 1 Nov 2002 08:54:13 -0500 (EST)
Cc: jcowan@reutershealth.com, xml-dev@lists.xml.org
In-reply-to: <200211010922.JAA20767@penguin.nag.co.uk> from "David Carlisle" at Nov 01, 2002 09:22:22 AM

David Carlisle scripsit:

> Comments on "Character Entities: An XML Core WG View"

As the principal author of that document, I am responding here solely
in my own person, and do *not* represent the Core WG in any way.

> Acknowledging the usability problems with the current mechanisms and
> investigating the possibilities for alternatives should not commit
> anyone to adding any mechanism in a future XML 2 (if there were ever to
> be such a version).

There is no reason why other people cannot explore such mechanisms; the Core
WG is simply recording its current view, that there is no compelling need
for *it* to do so.

> If you are using an XML application that forbids (at the application
> level) the use of <!DOCTYPE, the fact that this is allowed by the XML
> spec does not really help.

My personal view is that that is a defect in SOAP, and it is up to the
SOAP WG to solve the problem for themselves, if in fact they believe there
is a problem at all.  Most of our arguments are addressed to the needs of
document authoring, whereas SOAP messages will usually be generated
automatically.

> However, given the pressure from some quarters to move from
> DTDs to schema languages of one sort or another, this is likely to become
> more rather than less common.

W3C XML Schema and its competitors do not attempt to do everything that
DTDs can do, and for some purposes DTDs remain indispensable.

> As noted above, this facility may not be available at all. Even when it
> is, it is only barely usable for hand authored documents (which as you
> comment in the introduction is a main use case for entities of this
> form). The idea that every time you use a character by name you have to
> (a) know the required definition and (b) go up to the top of the
> document to add the entity declaration, has severe usability problems.

Telephone dials have severe usability problems: the numbers are in an
illogical order, and they require us to maintain private databases mapping
our friends, relations, and business partners to these arbitrary digit
strings (length <= 15).  Nevertheless they are used by millions daily.
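
The mechanism at issue can be sketched as follows.  This is a minimal
illustration, not taken from the Core WG document: the entity name
"eacute", the sample documents, and the helper parse_text are assumptions
chosen for the example.  It uses Python's standard xml.etree.ElementTree
parser to show that a named character works only when the document itself
declares it in the internal DTD subset.

```python
import xml.etree.ElementTree as ET

# Document that declares the entity in its internal DTD subset,
# at the top of the document, before using it by name.
WITH_DECL = """<!DOCTYPE p [
  <!ENTITY eacute "&#xE9;">
]>
<p>caf&eacute;</p>"""

# The same content with no declaration: the name is undefined.
WITHOUT_DECL = "<p>caf&eacute;</p>"


def parse_text(doc):
    """Return the root element's text, or None if the parse fails."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        # Raised here because &eacute; was never declared.
        return None


if __name__ == "__main__":
    print(parse_text(WITH_DECL))     # text with the entity expanded
    print(parse_text(WITHOUT_DECL))  # None
```

An application profile that forbids <!DOCTYPE leaves only the second form,
where the author must fall back on a numeric reference such as &#xE9;.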

> >  Most character names have already been standardized by
> >  ISO, and these names should be and are used wherever possible. 
> 
> "most" characters have not had names standardised by ISO (or anyone
> else) unless you are thinking solely of characters used in common
> European languages.

I said that most names were standardized, not that most characters have
standardized names.

> Also XHTML is incompatible with the usual ISO definitions
> (asymp and circ for example) which causes some problems for MathML which
> tries to be in agreement with both.

That is a genuine problem which needs to be fixed by work such as Rick
Jelliffe is doing.  There is no reason why the same name should refer to
different characters in different contexts (though there may be occasional
reasons to preserve more than one name for a character).

> In addition Unicode/ISO chose not to support the full set of characters 
> that have ISO entity names even in the additions in Unicode 3.x, 

Details would be very useful.

> DocBook for example maps several characters to #FFFD
> http://www.oasis-open.org/docbook/specs/wd-docbook-xmlcharent-0.3.html#d0e184
> whereas MathML attempts to map all of these names to some more (or
> less) suitable character.

I'm looking into this now.

> I would agree with this, but it is worth noting that W3C I18N group
> takes a very hard line against any public use of the PUA.

That's because private agreements don't scale on the Web.  Not all uses
of XML are Webby, though.
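
As a rough sketch of how a publishing pipeline might flag such characters
before a document leaves a private setting (the function name
private_use_chars is hypothetical, and relying on the Unicode general
category "Co" is an assumption about how one would choose to detect them):

```python
import unicodedata


def private_use_chars(text):
    """Return the characters in text whose general category is Co
    (private use), in the order they appear."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]


if __name__ == "__main__":
    sample = "a\ue000b"  # U+E000 is the first code point of the BMP PUA
    for ch in private_use_chars(sample):
        print("private-use character: U+%04X" % ord(ch))
```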

-- 
Some people open all the Windows;       John Cowan
wise wives welcome the spring           jcowan@reutershealth.com
by moving the Unix.                     http://www.reutershealth.com
  --ad for Unix Book Units (U.K.)       http://www.ccil.org/~cowan
        (see http://cm.bell-labs.com/cm/cs/who/dmr/unix3image.gif)
