At 2003-02-16 10:35, Mike Champion wrote:
>On Sun, 16 Feb 2003 09:19:32 -0800, Tim Bray <firstname.lastname@example.org> wrote:
>>Fair enough, but if you remove all the unicode-character apparatus from
>>XML 1.0 you probably cut that in half. Which is one of the only
>>important *technical* differences between XML and SGML - SGML was really
>>underspecified on what a "character" was. At the end of the day XML's
>>main technical contribution may turn out to have been that it dragged
>>Unicode into the mainstream.
>Stupid question: Why couldn't XML incorporate Unicode by reference rather
>than spending half of the spec defining the "unicode-character apparatus"?
Because Unicode did not define a Name -- and what the Unicode spec
did say about identifiers is not normative and is not defined
formally enough to write a parser from. (One result
is that those like Tim Bray who wanted to just extract the
relevant character lists from the Unicode character set property
tables succeeded in moving what XML 1.0 says a little bit
away from what little the Unicode spec did say about
identifiers.) There is also the fact that the Unicode
Standard keeps changing.