xml-dev - Re: [xml-dev] Processing XML 1.1 documents with XML Schema 1.0 processor


To: xml-dev@lists.xml.org
Subject: Re: [xml-dev] Processing XML 1.1 documents with XML Schema 1.0 processors
From: Henri Sivonen <hsivonen@iki.fi>
Date: Sat, 14 May 2005 00:26:11 +0300
In-reply-to: <428502B7.9020404@expway.fr>
References: <20050513105342.E0F5F3F43DC@gwparis.dyomedea.com> <1115982172.15341.95.camel@localhost.localdomain> <428502B7.9020404@expway.fr>

On May 13, 2005, at 22:40, Robin Berjon wrote:

> Yes this may break software that is making stupid assumptions about 
> the content of certain tokens, but such software was written based on 
> a misunderstanding of text and deserves to break (and then to be shot 
> in the kneecaps, tied to a horse and dragged all around town, dipped 
> in boiling lead, dismembered piece by piece with a rusty spoon, and 
> finally dumped in a ditch to agonize).

> How can XML be the universal data format without the ability to handle 
> universal text?

I can't use spaces in element names. My mother tongue uses spaces. I am 
being oppressed!

Being able to carry content in any language and being able to use 
anything in element names are two totally different things. The first 
one is crucial. The latter is not. In fact, the world keeps turning 
with XHTML, DocBook, SVG, OOo XML, Atom etc. using English-based ASCII 
element names. The point is that the content can be in any language. I 
think i18n political correctness goes overboard when interoperability 
is sacrificed in order to change the characters allowed in 
programmer-visible identifiers.

My mother tongue is not ASCII-safe. It also isn't invariant under 
canonical decomposition. When I design an XML vocabulary, I use 
English-based ASCII element and attribute names. I don't want to ever 
spend a single minute debugging an app, because someone was being 
politically correct and used umlauts in element names and then the app 
expected the decomposed form but the document had them in the 
precomposed form (or vice versa).
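
To make the failure mode concrete, here is a minimal Python sketch of the mismatch described above. The element name "pää" is a hypothetical example chosen for illustration, not taken from any real vocabulary, and the standard-library parser stands in for whatever XML stack an app might actually use. The document spells the name in decomposed form (NFD) while the app compares against the precomposed form (NFC); the two look identical on screen but are different code point sequences.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical element name with umlauts, in its two canonical spellings.
precomposed = unicodedata.normalize("NFC", "p\u00e4\u00e4")  # 3 code points
decomposed = unicodedata.normalize("NFD", precomposed)       # 5 code points

# The document uses the decomposed spelling in its element name.
doc = ET.fromstring(f"<{decomposed}>content</{decomposed}>")

# The app expects the precomposed spelling; a plain comparison fails.
names_match = doc.tag == precomposed

# Normalizing both sides to the same form before comparing succeeds.
normalized_match = unicodedata.normalize("NFC", doc.tag) == precomposed

print(len(precomposed), len(decomposed), names_match, normalized_match)
```

The parser reports the name exactly as it was written, so any dispatch on element names has to normalize explicitly or fail silently. With ASCII-only names the question never comes up.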

BTW, is there any actual research about the demand for non-ASCII 
element names? XML 1.0 allows a large chunk of non-ASCII in element 
names. Is any real-world XML vocabulary actually exercising the freedom 
to go beyond ASCII in element and attribute names (except perhaps some 
vocabulary that is only used in Japan)?

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
