OASIS Mailing List Archives


Re: [xml-dev] xml over http - RFC 3023

Andrew Welch wrote:
> Hi all,
> There's a very good article here about the problem of reading feeds
> from all over the world in different encodings:
> http://www.xml.com/pub/a/2004/07/21/dive.html
That article, when it came out, was a little irritating, because it
claimed to have discovered that, behind the mechanisms that we had put
into place in XML to work around the crapulous problems in the
internet/MIME/HTTP specs, there was a problem.

It is like seeing a plaster cast on a broken arm and saying "I have 
discovered that arm is broken!". Or, in the way the heading was worded, 
saying "I have discovered that plaster casts do not prevent broken arms: 
look underneath it, the arm is broken!"

The out-of-band signalling of character encoding is a fundamentally 
broken idea, because there are no mechanisms for programs which generate 
data to memoize the character encoding used that can then feed the rest 
of the food-chain. It was workable before the WWW and outside of East 
Asia, but as soon as UTF-8 came along it was impractical even for the 
West: this was obvious by the mid-90s. So what does a standards group do 
when the official standards are broken and there is little hope of 
fixing them? It creatively ignores them. Ignoring dumb standards is a 

So XML got the XML header in the full knowledge that many (most) web
systems that used text/* implemented the ASCII default by being 8-bit
clean and non-transcoding, which leaves the XML file uncorrupted and the
XML header in full play.
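
To make "the XML header in full play" concrete, here is a minimal sketch of a receiver that gets the bytes through an 8-bit clean channel and reads the encoding from the document itself. The function name declared_encoding and the UTF-8 fallback are illustrative assumptions, and the sketch only covers ASCII-compatible encodings (it does not attempt UTF-16 or EBCDIC detection, which a real XML parser handles).

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start
# of the byte stream. Only works for ASCII-compatible encodings.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, else `default`.

    Hypothetical helper for illustration, not a full implementation of
    the XML specification's encoding detection.
    """
    if raw.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        raw = raw[3:]
    match = _DECL.match(raw)
    return match.group(1).decode("ascii") if match else default

# A Latin-1 document that travelled untouched through an 8-bit clean system.
doc = '<?xml version="1.0" encoding="ISO-8859-1"?><p>caf\u00e9</p>'.encode("iso-8859-1")
text = doc.decode(declared_encoding(doc))
```

As long as nothing in between transcodes the bytes, the declaration alone is enough to decode the document correctly.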

> At the moment it all seems pretty complicated... especially
> considering XML was designed for the web.  The problem of parsing
> feeds from all over the world must have been tackled a few times over by
> now?
It is not complicated. Use application/xml.
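
Why application/xml helps can be sketched as follows, going by my reading of RFC 3023: a charset parameter wins when it is present; without one, text/xml falls back to US-ASCII, while application/xml defers to the document's own BOM or XML declaration. The function name encoding_source and the "xml-declaration" marker are hypothetical, chosen only for this illustration.

```python
from email.message import Message

def encoding_source(content_type_header: str) -> str:
    """Say where a receiver should take the character encoding from.

    Returns the charset parameter if one is given, "us-ascii" for a bare
    text/xml, and the marker "xml-declaration" when the document itself
    (BOM or XML header) is authoritative. Illustrative sketch only.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    charset = msg.get_param("charset")
    if charset:
        # Out-of-band label overrides whatever the document says.
        return str(charset)
    if msg.get_content_type() == "text/xml":
        # No charset on text/xml: the MIME default applies.
        return "us-ascii"
    # application/xml without charset: the XML header stays in play.
    return "xml-declaration"

print(encoding_source("application/xml"))
print(encoding_source("text/xml"))
```

Sending application/xml with no charset parameter is the case where the XML header is the only thing a receiver needs to consult.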

If you do find intermediate web systems that implement the ASCII default
or the ISO 8859-1 default as anything other than 8-bit clean for text/xml,
submit a bug report.

If you find systems that accept text/xml but not application/xml then
find some way to discreetly help the developers out of their
embarrassing bozo-the-clown moment.

Rick Jelliffe


Copyright 1993-2007 XML.org. This site is hosted by OASIS