Re: [xml-dev] Associating Style Sheets with XML documents 1.0 (Second Edition)

On Thu, 11 Nov 2010 21:16:55 -0500, Liam R E Quin wrote:
> On Fri, 2010-11-12 at 00:08 +0000, Michael Kay wrote:
>> I thought there was another more technical reason for avoiding text/... 
>> - some theory that if it's text, the carrier is allowed to change its 
>> encoding, whereas if it's application/..., then it isn't.
> Yes.  A proxy can rewrite text/*, e.g. to change line endings, or, more
> insidiously, changing the encoding (and the corresponding MIME header).
> Once the encoding is changed, the XML declaration in the file becomes
> incorrect...
> I don't know how common rewriting proxies are in practice.

For MIME-compliant protocols, particularly in the days of seven-bit 
gateways, there were a fair number.  That was always mostly an issue 
with email.

For HTTP, which is 8-bit clean, and which is not MIME-compliant[1], I 
doubt that it's an issue.
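
To make the quoted failure mode concrete, here is a minimal Python 
sketch of a gateway that re-encodes a text/* body and leaves the XML 
declaration alone.  The transcode() and declared_encoding() helpers and 
the sample document are hypothetical, invented for illustration; they 
are not taken from any real proxy.

```python
import re
from typing import Optional


def declared_encoding(body: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if there is one."""
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    return m.group(1).decode("ascii") if m else None


def transcode(body: bytes, src: str, dst: str) -> bytes:
    """Hypothetical rewriting gateway: re-encode the bytes, touch nothing else."""
    return body.decode(src).encode(dst)


# A small document that is correctly labelled as ISO-8859-1.
doc = '<?xml version="1.0" encoding="ISO-8859-1"?>\n<p>café</p>\n'.encode("iso-8859-1")

# The gateway converts the payload to UTF-8 (and would update the MIME
# charset parameter), but the declaration inside the body is unchanged.
rewritten = transcode(doc, "iso-8859-1", "utf-8")

print(declared_encoding(rewritten))    # still names the old encoding
print(rewritten.decode("iso-8859-1"))  # trusting the declaration garbles the text
print(rewritten.decode("utf-8"))       # trusting the new MIME charset reads correctly
```

A consumer that believes the in-document declaration over the 
transport label will misread every non-ASCII character after such a 
rewrite, which is the argument for application/* over text/*.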

The general rule, for MIME types, is that the text/* hierarchy can be 
represented in 7-bit US-ASCII (and in MIME, must be, if no other 
"charset" or encoding is specified).  MIME also provides 
quoted-printable encoding ("quoted-printable" is rather a misnomer, 
though it wasn't *so* far off back in the day).  Text can also be 
reflowed, and various transforms applied to it (so long as they are, in 
theory, reversible and preserve the seven bits of data).
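
As a rough illustration of such a reversible, seven-bit-safe 
transform, the sketch below round-trips a short string through 
quoted-printable using Python's standard quopri module.  The function 
names and the sample text are my own, chosen for the example.

```python
import quopri


def to_quoted_printable(data: bytes) -> bytes:
    """Encode arbitrary bytes so that only 7-bit octets go on the wire."""
    return quopri.encodestring(data)


def from_quoted_printable(data: bytes) -> bytes:
    """Reverse the transform at the receiving end."""
    return quopri.decodestring(data)


def is_seven_bit(data: bytes) -> bool:
    """True if every octet fits in seven bits."""
    return all(octet < 0x80 for octet in data)


# Sample payload containing non-ASCII characters (UTF-8 encoded).
original = "naïve café, 8-bit text\n".encode("utf-8")
wire = to_quoted_printable(original)

print(is_seven_bit(original))                   # the raw UTF-8 bytes are not 7-bit safe
print(is_seven_bit(wire))                       # the encoded form is
print(from_quoted_printable(wire) == original)  # and the transform is reversible
```

Note that the round trip preserves bytes, not any charset label: a 
gateway that also changes the charset is doing something else again.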

It would have been nice if the HTTP guys, instead of writing something 
that was not-quite-MIME, had defined a not-quite-MIME (or MIME 2.0) 
specification, so that other protocols could use it.  This would 
presumably be signaled by the presence of a header (MIME-Version: 2.0, 
or MIME-Variant: REST, or whatever); this could then indicate that 
Content-Transfer-Encoding is forbidden (and the channel is 8-bit), that 
Content-Length is strongly recommended, that Content-Encoding and 
Transfer-Encoding are usable.  It might, as well, change the meaning of 
"text/*" Content-Type headers to default to UTF-8 in the absence of a 
charset (instead of US-ASCII).  It *would* be awkward to do (just as 
early HTTP use of encoding indicators in headers to indicate the 
encoding of headers was ... uh, a little tricky, say).  Still better 
might be to redefine the NVT for Unicode, perhaps with a "BOM" header 
to indicate header encoding and preferred line-ending.

All of that, though, is deep infrastructure stuff that nobody really 
wants to mess with, and if it were defined, figuring out the path for 
adoption would be *hairy*.  No MIME-compliant protocol processor is 
written to cope with MIME-Version != 1.0.  HTTP couldn't extract its 
MIME-alike bits into a separate spec without a spec refresh.  Backward 
compatibility could be managed, but nobody would be able to use the new 
stuff because all the software out there lacks forward compatibility.  
So ... *shrug*.  8-bit HTTP makes use of MIME types, which are defined 
for 7-bit transports, and the impedance mismatch tends to be visible, 
even if the reasons for it are no longer obvious.

[1] RFC 2616, 19.4.1.

Amelia A. Lewis                    amyzing {at} talsever.com
Better to have thirty minutes of wonderful than a lifetime of nothing
