OASIS Mailing List Archives





   Re: [xml-dev] Processing XML 1.1 documents with XML Schema 1.0 processor


On May 14, 2005, at 14:15, Rick Jelliffe wrote:

> (I smell a troll.)

It was an honest question. When I look at how software development
works in Finland and compare that with the alleged requirements for
other locales, I can't help but think that perhaps English speakers
are trying to be over-polite at the expense of XML 1.0 compatibility
without the actual demand being there.

> Henri Sivonen wrote:
>> BTW, is there any actual research about the demand for non-ASCII 
>> element names? XML 1.0 allows a large chunk of non-ASCII on element 
>> names. Is any real-world XML vocabulary actually exercising the 
>> freedom to go beyond ASCII in element and attribute names (except 
>> perhaps some vocabulary that is only used in Japan)?
> What the **** does that question mean?

The rationale for migrating to XML 1.1 is that one could use element 
names in languages that XML 1.0 does not allow in element names. XML 
1.0 already allows element names in many non-ASCII languages.

To assess whether this rationale for XML 1.1 makes practical sense, it
would seem natural to observe whether people are actually using the
non-ASCII possibilities of XML 1.0. If research shows that the 
non-ASCII possibilities provided by even XML 1.0 are not actually used 
to a significant extent, why bother with breaking interoperability by 
extending the non-ASCII features?
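
As a rough illustration of the kind of observation meant here, a minimal
sketch in Python could list the element and attribute names in a document
that go beyond ASCII. The function name non_ascii_names and the sample
document are made up for this example, and a real survey would need a
representative corpus of vocabularies rather than a single string.

```python
import xml.etree.ElementTree as ET


def non_ascii_names(xml_text):
    """Return the sorted element and attribute names that contain non-ASCII characters."""
    found = set()
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Check the element name and all of its attribute names.
        for name in [elem.tag, *elem.attrib.keys()]:
            # ElementTree spells namespaced names as "{uri}local"; keep the local part.
            local = name.rsplit("}", 1)[-1]
            if not local.isascii():
                found.add(local)
    return sorted(found)


# Hypothetical sample: Finnish element names, one of them using non-ASCII letters.
sample = (
    "<osoite><katu nimi='Mannerheimintie'/>"
    "<päivämäärä>2005-05-14</päivämäärä></osoite>"
)
print(non_ascii_names(sample))
```

Running something like this over the schemas or instance documents of
published vocabularies would give at least a crude count of how often
the non-ASCII freedom of XML 1.0 is exercised.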

The parenthetical note about Japan was there because someone once gave 
me a Japanese data point. Also, Japanese is different from other 
non-English languages in the sense that you can actually get developer 
docs in Japanese in addition to English. So perhaps one can be a 
programmer who reads Japanese but doesn't read English.

> That element names only used in one country should not be supported in 
> a standard designed to suit the whole world?

No, that was not my point.

> It is a simple fact that ASCII transliterations of many languages, in 
> particular those with tonal pronunciation, homophones and ideographic
> scripts, can frequently be incomprehensible.

That's not the point, either. Finnish uses Latin letters, making dumbing
down to ASCII more feasible, but we still use English-based XML element 
names and Java variable names even though both allow Finnish letters.

> (Add to this that there are regional concepts (e.g. in addresses) for 
> which there may be no English analog.) The most direct way of putting 
> the question is "Why should W3C put out a standard that arbitrarily 
> makes things easier for white people than for yellow people?" A space 
> can easily be replaced by a "_": what should the ideograph for a 
> mountain be replaced by: the sound, the meaning, a translation? How 
> does a reader reconstruct the ideograph?

I wasn't suggesting that. I was wondering if the programmers whose 
native language is not English still use English when they write code 
like we do here in Finland.

> XML's name rules are important precisely because they don't adopt the 
> bogus minimalist approach. I am not saying that anyone who wants 
> ASCII-only markup is a greedy, lazy, selfish, unjust, uncaring, 
> clock-back-turning, unpragmatic racist or Western supremacist;

Speaking of unpragmatism, I think it would be cruel to make me use the 
Finnish keyboard layout to type the markup and programming 
language-significant punctuation that has been optimized for the U.S. 
keyboard. But instead of suggesting that everyone should be able to 
redefine their non-terminals as in SGML, I press command-option-space 
to cycle keyboard layouts.

> But ISO standards like SGML must support International requirements, 
> and W3C profiles like XML must support world-wide adoption.

Sure, but considering that XML 1.0 is what it is, is the 
interoperability trouble caused by fixing it really worth it?

> A less inflammatory response is that the importance of names in markup 
> is not that they are easy to write, but that they are meaningful to 
> read. The better analogy to make isn't the inconvenience of making you
> write ASCII, but the inconvenience if you had to write using, say,
> Greek characters. You probably could do it, but it would add a layer
> of inconvenience that would probably make you avoid using the 
> technology where you had a choice.

Writing Finnish and programming punctuation (;{}[]<>/\=) at the same 
time is inconvenient given the usual input methods. I'd imagine the 
inconvenience with non-Latin writing to be even greater.

Henri Sivonen



Copyright 2001 XML.org. This site is hosted by OASIS