xml-dev - Re: [xml-dev] Character Entities: An XML Core WG View


To: David Carlisle <davidc@nag.co.uk>, xml-dev@lists.xml.org
Subject: Re: [xml-dev] Character Entities: An XML Core WG View
From: Paul Prescod <paul@prescod.net>
Date: Fri, 01 Nov 2002 03:58:05 -0700
Cc: jcowan@reutershealth.com
References: <200210302155.QAA19553@mail2.reutershealth.com> <200211010922.JAA20767@penguin.nag.co.uk>
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.1) Gecko/20020826

David Carlisle wrote:
> ...
> 
> "most" characters have not had names standardised by ISO (or anyone
> else) unless you are thinking solely of characters used in common
> European languages.
> 
> Also XHTML is incompatible with the usual ISO definitions
> (asymp and circ for example) which causes some problems for MathML which
> tries to be in agreement with both.

and

Rick Jelliffe wrote:
> 
> Now to do this requires an agreement on what the best mappings for
> entities to Unicode strings are.  I have been involved in a project to do just this,
> for the last few months, with the intent of taking it to ISO: the task
> mainly involves cross-checking DocBook's mappings with W3C
> MathML's mappings, and then going through issues from other sources.  
> XML-DEV-ers may be interested in the status of this.

Now that Unicode gives English names to all characters, couldn't we say 
that all pre-Unicode names (SGML/ISO, XHTML/MathML/W3C, DocBook/OASIS, 
etc.) are legacy names which over the long run could be replaced by 
entity names based directly upon Unicode names?

> The only approach that I have seen that makes sense is to build in
> a fixed standard set of characters into XML, with known mappings.
> Then, for some open-source mapping libraries to be made, so that
> developers can trivially add the mapping to their weeny parsers.

The Unicode name database is essentially open source and ships with some 
programming languages. Admittedly the names are verbose, but short forms 
are what internal entities are good for. For the occasional "funny" 
character I would actually prefer a long-but-descriptive name to the 
short-but-cryptic ones SGML tradition prefers. When I need to use one 
over and over then I'll make an internal entity for it.
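
As a sketch of what I mean, assuming Python's unicodedata module as the 
name database: look a character up by its Unicode name and emit an 
internal entity declaration with whatever short name you like. The 
helper name and the short name "almost" are made up for this example.

```python
import unicodedata

def internal_entity(short_name, unicode_name):
    """Build an internal entity declaration for a character named by Unicode.

    short_name is whatever local abbreviation the author chooses;
    unicode_name must be an exact Unicode character name.
    """
    # unicodedata.lookup raises KeyError if the name is unknown
    ch = unicodedata.lookup(unicode_name)
    # Use a numeric character reference so the declaration stays ASCII
    return '<!ENTITY %s "&#x%04X;">' % (short_name, ord(ch))

print(internal_entity("almost", "ALMOST EQUAL TO"))
# <!ENTITY almost "&#x2248;">
```

The declaration goes in the document's internal subset, so the short 
name is local to the document and nobody has to agree on it globally.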

  Paul Prescod

Follow-Ups:
- Re: [xml-dev] Character Entities: An XML Core WG View
  - From: John Cowan <jcowan@reutershealth.com>
- Re: [xml-dev] Character Entities: An XML Core WG View
  - From: "Rick Jelliffe" <ricko@allette.com.au>
- Re: [xml-dev] Character Entities: An XML Core WG View
  - From: David Carlisle <davidc@nag.co.uk>

References:
- Re: [xml-dev] Character Entities: An XML Core WG View
  - From: David Carlisle <davidc@nag.co.uk>
