David Carlisle wrote:
> "most" characters have not had names standardised by ISO (or anyone
> else) unless you are thinking solely of characters used in common
> European languages.
> Also XHTML is incompatible with the usual ISO definitions
> (asymp and circ for example) which causes some problems for MathML which
> tries to be in agreement with both.
Rick Jelliffe wrote:
> Now to do this requires an agreement on what the best mappings for
> entities to Unicode strings are. I have been involved in a project to do just this,
> for the last few months, with the intent of taking it to ISO: the task
> mainly involves cross-checking DocBook's mappings with W3C
> MathML's mappings, and then going through issues from other sources.
> XML-DEV-ers may be interested in the status of this.
Now that Unicode gives English names to all characters, couldn't we say
that all pre-Unicode names (SGML/ISO, XHTML/MathML/W3C, DocBook/OASIS
etc.) are legacy names which over the long run could be replaced by
entity names directly based upon Unicode names?
> The only approach that I have seen that makes sense is to build in
> a fixed standard set of characters into XML, with known mappings.
> Then, for some open-source mapping libraries to be made, so that
> developers can trivially add the mapping to their weeny parsers.
The Unicode name database is essentially open source and ships with some
programming languages. Admittedly the names are verbose but short-forms
are what internal entities are good for. For the occasional "funny"
character I would actually prefer a long-but-clear name to the
short-but-cryptic ones SGML tradition prefers. When I need to use one
over and over then I'll make an internal entity for it.
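
A rough sketch of the idea, assuming Python's standard unicodedata module
as the name database. The helper names (entity_name, char_from_entity_name,
entity_declaration) and the lowercase-with-underscores spelling are my own
illustration, not part of any standard or existing proposal:

```python
import unicodedata
from typing import Optional


def entity_name(char: str) -> str:
    """Derive an entity name from the character's Unicode name.

    Spaces become underscores so the result is a legal XML name;
    hyphens already present in the Unicode name are kept as-is.
    Raises ValueError for characters that have no name.
    """
    return unicodedata.name(char).lower().replace(" ", "_")


def char_from_entity_name(name: str) -> str:
    """Reverse the mapping: look the character up by its Unicode name."""
    return unicodedata.lookup(name.replace("_", " ").upper())


def entity_declaration(char: str, short_name: Optional[str] = None) -> str:
    """Build an internal entity declaration for the character.

    With no short_name the long Unicode-derived name is used; pass a
    short_name for characters that are needed over and over.
    """
    name = short_name or entity_name(char)
    return f'<!ENTITY {name} "&#x{ord(char):04X};">'


if __name__ == "__main__":
    almost_equal = "\u2248"  # ALMOST EQUAL TO
    print(entity_declaration(almost_equal))
    print(entity_declaration(almost_equal, short_name="approx"))
```

The long form is unambiguous because it round-trips through the name
database; the short form is a local convenience declared per document,
so it never needs to be standardised.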