xml-dev - Re: ISO 10646 vs. Unicode in XML specs

From: John Cowan <cowan@locke.ccil.org>
To: abrahams@acm.org
Date: Sun, 30 Jul 2000 23:47:44 -0400 (EDT)

On Fri, 28 Jul 2000, Paul W. Abrahams wrote:

> Of course, Unicode characters and ISO/IEC 10646 characters
> are essentially, or perhaps exactly, in 1-1 correspondence.

They are exactly in correspondence.  No character is entered into
either standard until the character *and* its code value are
agreed upon by both the Unicode Consortium and the ISO WG.

> The differences between the two standards vis-a-vis XML are
> very subtle.   So what puzzles me is this: why do some
> mentions of characters refer to ISO/IEC 10646 while others
> refer to Unicode?   Putting it another way: were I writing
> an XML-related spec, how would I decide which standard to
> refer to?   Why don't all specs refer to Unicode rather than
> to ISO/IEC 10646?

ISO/IEC 10646 is an international standard, whereas Unicode is an
industry consortium standard, like XML itself.  International standards
have wider acceptance and better stability in general, though in this
particular case the distinction is moot.  Therefore, the international
standard is referred to when possible.

However, distinctions between different groups of characters, such as
alphabetic characters vs. ideographic ones, letters in general vs.
symbols, and compatibility characters vs. the rest are made only by
Unicode.  Abstractly considered, Unicode builds on the 10646 core
by adding many character properties; 10646 has only a few (diacritic
vs. base is one).  Therefore, in order to import these properties it
is necessary to use Unicode.
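
A minimal sketch of what such properties look like in practice, assuming
Python's standard unicodedata module, which exposes data from the Unicode
Character Database.  The helper name describe() and the sample characters
are illustrative choices only, not anything taken from the XML specs.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode-defined properties of a single character."""
    decomposition = unicodedata.decomposition(ch)
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        # General category: Lu/Ll = cased letter, Lo = other letter
        # (includes ideographs), Sm = math symbol, Mn = nonspacing mark.
        "category": unicodedata.category(ch),
        # Canonical combining class; 0 for ordinary base characters.
        "combining": unicodedata.combining(ch),
        # A tagged decomposition such as "<compat> 0066 0069" marks a
        # compatibility character.
        "compatibility": decomposition.startswith("<"),
    }


# Hypothetical sample: a letter, an ideograph, a symbol, a combining
# diacritic, and a compatibility ligature.
samples = ["A", "\u4e2d", "+", "\u0301", "\ufb01"]
for ch in samples:
    print(f"U+{ord(ch):04X}", describe(ch))
```

The code point values alone are what the two standards share; the
category, combining class, and decomposition data queried here are the
kind of information that comes from the Unicode side.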

Disclaimer:  I wasn't there, and I am *not* speaking for the W3C Core WG.

-- 
John Cowan                                   cowan@ccil.org
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant
le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux,
de rapport nyait pas.               -- Jacques Lacan, "L'Etourdit"

Follow-Ups:
- Re: ISO 10646 vs. Unicode in XML specs
  - From: Andrew Cassin <acassin@cs.mu.OZ.AU>

References:
- ISO 10646 vs. Unicode in XML specs
  - From: "Paul W. Abrahams" <abrahams@valinet.com>
