   Re: multiple encoding specs (Re: IE5.0 does not conform to RFC2376)

  • From: Chris Lilley <chris@w3.org>
  • To: John Cowan <cowan@locke.ccil.org>
  • Date: Sun, 11 Apr 1999 03:15:44 +0200

John Cowan wrote:
> Rick Jelliffe wrote:
> > However, it is all spoiled if there are systems which corrupt the
> > labels: for example by rewriting the charset parameter incorrectly. It
> > is far better to send the XML file without a charset parameter than to
> > send it with a wrong one.

Yes. But even better to send it with a correct one. This is easily done;
just ensure that the server always sends the same charset that the XML
encoding declaration specifies.
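
As a rough illustration of keeping the two labels in step, here is a minimal Python sketch. It assumes the server has the raw bytes of the document to hand; the names declared_encoding and content_type_for are hypothetical, and the pattern only handles ASCII-compatible encodings plus a UTF-16 byte order mark check, so it is not a full implementation of the detection rules in the XML 1.0 Recommendation.

```python
import re

# Encoding pseudo-attribute of an XML declaration (ASCII-compatible bytes only).
_ENCODING_RE = re.compile(
    rb'''^<\?xml\s+version\s*=\s*["'][^"']*["']\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)

def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None if there is none."""
    match = _ENCODING_RE.match(xml_bytes)
    return match.group(1).decode("ascii") if match else None

def content_type_for(xml_bytes):
    """Build a Content-Type value whose charset mirrors the document's own label."""
    if xml_bytes.startswith((b"\xff\xfe", b"\xfe\xff")):
        # A UTF-16 byte order mark: the regex above cannot read such a document.
        return "text/xml; charset=UTF-16"
    encoding = declared_encoding(xml_bytes)
    if encoding is None:
        # Assumption: no declaration and no UTF-16 BOM means UTF-8.
        encoding = "UTF-8"
    return "text/xml; charset=" + encoding
```

For example, content_type_for(b'<?xml version="1.0" encoding="ISO-8859-1"?><doc/>') gives "text/xml; charset=ISO-8859-1", so the header and the declaration cannot disagree.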

> But there's the snag: in text/xml documents, a missing charset parameter
> does not mean "Charset unspecified"; it means "Charset specified
> as US-ASCII".  

This is correct; the RFC does say that. Note that this thread is
primarily about whether the RFC *should* say that or *should* say
something different, something which does not needlessly contradict the
XML 1.0 Recommendation.
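
To make the behaviour under discussion concrete, the following Python sketch shows how a receiver following RFC 2376 would pick the charset from the MIME label alone. The function name effective_charset is hypothetical, and returning None for "defer to the document's own rules" is a convention of this sketch, not something the RFC defines.

```python
def effective_charset(media_type, charset_param=None):
    """Charset a receiver would assume from the MIME headers alone."""
    if charset_param:
        # An explicit charset parameter is taken as authoritative.
        return charset_param
    if media_type.lower() == "text/xml":
        # RFC 2376: a missing parameter on text/xml means US-ASCII.
        return "us-ascii"
    # e.g. application/xml: nothing is implied by the header, so the
    # receiver falls back on the BOM and encoding declaration (None here).
    return None
```

So a document declaring encoding="ISO-8859-1" but served as plain text/xml is treated as US-ASCII, which is exactly the contradiction with the XML 1.0 Recommendation at issue here.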

> There is no way to fail to specify a charset in
> text/* documents, and rightly so, because text without a charset
> is uninterpretable.

This is disingenuous; both clauses are true, but the second one implies
that there is no other method of conveying the information, which,
clearly, there is.

a) There is no way to fail to specify a charset in text/* documents

But it does not have to be explicit. It can be implied. A good way of
formalising that implication would be to refer to the rules in the XML
1.0 Recommendation.

b) text without a charset is uninterpretable.

Also true, but that labelling is already defined in XML and handily
travels with the document instance, so it is not lost as soon as
the document is saved to disk.

> In SGML terms, omitting the charset in text/* documents is a mere
> minimization, whereas in application/* documents it is a true #IMPLIED.

Actually, if you read the XML Recommendation, then unless the charset is
UTF-8 or UTF-16, the charset (encoding declaration) is #REQUIRED.
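
The rule can be stated as a one-line check. This Python sketch assumes no external encoding information (such as a MIME charset) is available and that the actual encoding of the entity is already known; the function name declaration_required is hypothetical.

```python
def declaration_required(actual_encoding):
    """True if an entity stored in this encoding must carry an encoding declaration."""
    # Only UTF-8 and UTF-16 may omit the declaration; names are case-insensitive.
    return actual_encoding.strip().lower() not in ("utf-8", "utf-16")
```

Under this check an ISO-8859-1 or Shift_JIS document always needs its encoding declaration, while a UTF-8 one may leave it out.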



