Re: [xml-dev] Text/xml with omitted charset parameter
- From: Tim Bray <tbray@textuality.com>
- To: xml-dev@lists.xml.org
- Date: Fri, 26 Oct 2001 08:18:56 -0700
At 02:28 PM 26/10/01 +1000, Rick Jelliffe wrote:
> From: "Bjoern Hoehrmann" <derhoermi@gmx.net>
>
>> So, who tells me I
>> am wrong and text/xml documents without charset parameter may still be
>> UTF-8 encoded (and use non-ASCII characters)? ...
>
>The only ways out of encoding hell are:
Actually, XML *improves* the situation. To quote Larry Wall,
"An XML document knows what encoding it's in". So, given the
(not uncommon) scenario of MIME-header breakage, you can often
recover. A decent XML processor, given a stream of bytes and
no other information, almost always does the right thing.
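
To make "does the right thing" concrete, here is a rough sketch of the kind of byte sniffing a processor can do, loosely following the autodetection notes in Appendix F of the XML spec. The function name sniff_xml_encoding and its tables are illustrative assumptions, not any particular parser's API, and EBCDIC and other exotic cases are left out.

```python
import codecs
import re

# Byte-order marks, UTF-32 first so its BOM is not mistaken for UTF-16's.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# What a leading "<" (or "<?") looks like in wide encodings without a BOM.
_PREFIXES = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]

# encoding="..." inside an XML declaration, for ASCII-compatible encodings.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML byte stream from its first bytes.

    Hypothetical helper: an approximation of Appendix F, not a full
    implementation.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for prefix, name in _PREFIXES:
        if data.startswith(prefix):
            return name
    match = _DECL.match(data[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no encoding declaration: the spec's default is UTF-8.
    return "utf-8"
```

The ordering matters: the BOM and the wide-character patterns have to be checked before the declaration can be read at all, because the declaration is itself written in the encoding being guessed.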
Per IETF dogma, the XML spec and the RFC both say that the
charset header is authoritative. Well, yes, except when it
isn't. Software that ignores it when it's demonstrably
wrong is hard to get too angry at.

-Tim
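
One way to read "ignores it when it's demonstrably wrong", sketched under assumptions: trust the header charset first, and fall back to what the document says about itself only when the bytes cannot be decoded under the header's claim. The name decode_xml is hypothetical, the fallback handles only ASCII-compatible encodings, and the us-ascii default for an omitted charset is the RFC 3023 rule for text/xml.

```python
import re
from typing import Optional

# encoding="..." inside an XML declaration, for ASCII-compatible encodings.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def decode_xml(data: bytes, header_charset: Optional[str] = None) -> str:
    """Decode XML bytes, preferring the header unless it cannot be right.

    Hypothetical helper for illustration only.
    """
    # RFC 3023: text/xml with no charset parameter means us-ascii.
    claimed = header_charset or "us-ascii"
    try:
        # The header is authoritative as long as the bytes agree with it.
        return data.decode(claimed)
    except (UnicodeDecodeError, LookupError):
        pass  # undecodable or unknown charset: the header is demonstrably wrong

    # Fall back to the document's own declaration, else the XML default.
    match = _DECL.match(data[:1024])
    fallback = match.group(1).decode("ascii") if match else "utf-8"
    # May still raise if the document's own claim is wrong too.
    return data.decode(fallback)
```

This is deliberately conservative: a header that decodes cleanly is never overridden, even if it disagrees with the declaration, so the sketch departs from the header only in the case where following it is impossible.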