Re: XML Blueberry
- From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- To: xml-dev@lists.xml.org
- Date: Thu, 21 Jun 2001 18:33:48 -0400

At 5:16 PM -0400 6/21/01, John Cowan wrote:
>But in full generality that means each document has to bear the
>version number of Unicode that it assumes (for its names only, of
>course, not for its character content). That means a series of
>updates to parsers are required, and control of the schedule is
>lost from W3C to Unicode. Is this a Good Thing or a Bad Thing?
>
That implies that it's one or the other. Clearly it's both, and what's
needed is a rational comparison of the real advantages and
disadvantages of both approaches. How big are the benefits gained by
tying XML names to Unicode versions? How big are the disadvantages?

I don't think the potential benefits outweigh the disadvantages.
Splitting XML into multiple incompatible implementations strikes me
as a very bad thing. And make no mistake, this is just the first
step, not the last. Unicode's got at least two more major iterations
left in it that will force changes in XML parsers if we tie XML that
closely to Unicode. It's not just blueberry but raspberry and
blackberry too, and maybe other flavors!

What do we get in exchange for imposing major transition costs on
developers around the world? Tags can now be written in a few extra
scripts. And note that I say scripts, not languages. Many of the
languages listed in the reqs have well-established traditions in
other scripts as well. For instance, Mongolian can be written in
Cyrillic. (Blame the Soviet Union for that, but it is a plausible
workaround for tag names in Mongolian.)

The only scripts that seem to really call out for native markup that
don't have it already are Khmer (a.k.a. Cambodian) and Ethiopic (used
for Amharic and related languages). I can at least imagine someone
wanting to mark
up some text in one of those scripts while not being able to do so in
a Latin, Cyrillic, or Chinese-based script. However, I can't promise
you that any such person actually exists. Before I agree to the
forced obsolescence of all existing XML systems and software, I want
to know that there is a real need for this now, not just that it
would be a cool thing to have if somebody wants it someday.

As a demonstration, I'd want to see at an absolute minimum that it
was possible to use a computer in such a language (e.g. Amharic,
Tigre, Khmer) without also having some competence in a more prevalent
script like Latin or Cyrillic. I'd also want it demonstrated that
this was done via a different character encoding, and not merely by a
font mapping to some ASCII superset. (This is how the limited
Ethiopic software I've actually seen has all worked.)

If you can show that, then we can reasonably compare the costs and
benefits of this proposal. But if you can't show that, then you're
asking to impose very real costs on XML developers around the world
for what may well prove to be an illusory benefit.
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible (IDG Books, 1999) |
| http://metalab.unc.edu/xml/books/bible/ |
| http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://metalab.unc.edu/javafaq/ |
| Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/ |
+----------------------------------+---------------------------------+