Re: [xml-dev] xml over http - RFC 3023
- From: Rick Jelliffe <rjelliffe@allette.com.au>
- To: xml-dev <xml-dev@lists.xml.org>
- Date: Mon, 01 Dec 2008 18:46:40 +1100
Andrew Welch wrote:
> Hi all,
>
> There's a very good article here about the problem of reading feeds
> from all over the world in different encodings:
>
> http://www.xml.com/pub/a/2004/07/21/dive.html
>
That article, when it came out, was a little irritating, because it
claimed to have discovered that, behind the mechanisms that we had put
into place in XML to work around the crapulous problems in the
internet/MIME/HTTP specs, there was a problem.
It is like seeing a plaster cast on a broken arm and saying "I have
discovered that arm is broken!". Or, in the way the heading was worded,
saying "I have discovered that plaster casts do not prevent broken arms:
look underneath it, the arm is broken!"
The out-of-band signalling of character encoding is a fundamentally
broken idea, because there are no mechanisms for programs which generate
data to memoize the character encoding used that can then feed the rest
of the food-chain. It was workable before the WWW and outside of East
Asia, but as soon as UTF-8 came along it was impractical even for the
West: this was obvious by the mid-90s. So what does a standards group do
when the official standards are broken and there is little hope of
fixing them? It creatively ignores them. Ignoring dumb standards is a
virtue.
So XML got the XML header in the full knowledge that many (most) web
systems that used text/* implemented the ASCII default by being 8-bit
clean and non-transcoding, which leaves the XML file uncorrupted and the
XML header in full play.
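
A minimal sketch of that point, assuming Python's standard-library parser stands in for the receiving end. The sample document and variable names are made up for illustration. The bytes go through untouched, as an 8-bit clean transport would leave them, and the parser takes the encoding from the XML header alone:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the XML header names its own encoding in-band.
text = "<?xml version='1.0' encoding='iso-8859-1'?><p>caf\u00e9</p>"

# The bytes as they would travel over an 8-bit clean, non-transcoding path.
raw = text.encode("iso-8859-1")

# No out-of-band charset is supplied; the parser reads the header.
root = ET.fromstring(raw)
decoded = root.text
```

If an intermediary had transcoded or stripped the high bit, the bytes would no longer match the header and the parse would fail or yield the wrong character.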
> At the moment it all seems pretty complicated... especially
> considering XML was designed for the web. The problem of parsing
> feeds from all over the world must have been tackled a few times over by
> now?
>
It is not complicated. Use application/xml.
If you do find intermediate web systems that implement the ASCII default
or the ISO 8859-1 default as anything other than 8-bit clean for text/xml,
submit a bug report.
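
To illustrate why the media type matters, here is a rough sketch of the RFC 3023 rules as I understand them: an explicit charset parameter wins, text/xml without one defaults to us-ascii, and application/xml without one falls back to the XML header. The function name effective_charset is hypothetical, and the sketch assumes an ASCII-compatible encoding (it does not do UTF-16 or UTF-32 byte-order detection):

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration at the start
# of the body. Simplification: only ASCII-compatible encodings are handled.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)

def effective_charset(content_type: str, body: bytes) -> str:
    """Hypothetical helper: which charset applies under RFC 3023."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset:
        # An explicit charset parameter is authoritative for both types.
        return str(charset).lower()
    if msg.get_content_type() == "text/xml":
        # text/xml with no charset parameter: the XML header is ignored.
        return "us-ascii"
    # application/xml with no charset parameter: use the XML header.
    if body.startswith(b"\xef\xbb\xbf"):
        body = body[3:]  # skip a UTF-8 byte order mark
    match = _DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else "utf-8"

doc = b"<?xml version='1.0' encoding='iso-8859-1'?><a/>"
as_text = effective_charset("text/xml", doc)
as_app = effective_charset("application/xml", doc)
```

With the same bytes, as_text ignores the header while as_app honours it, which is the whole argument for application/xml.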
If you find systems that accept text/xml but not application/xml then
find some way to discreetly help the developers out of their
embarrassing bozo-the-clown moment.
Cheers
Rick Jelliffe