OASIS Mailing List Archives



Re: [xml-dev] Why is Encoding Metadata (e.g. encoding="UTF-8") put Inside the XML Document?

One thing that your notes don't emphasise is that if the encoding is
specified in the HTTP headers (or similar external metadata), any
encoding specified in the file itself must be ignored.


      There are several reasons that the charset parameter is
      authoritative.  First, some MIME processing engines do transcoding
      of MIME bodies of the top-level media type "text" without
      reference to any of the internal content.  Thus, it is possible
      that some agent might change text/xml; charset="iso-2022-jp" to
      text/xml; charset="utf-8" without modifying the encoding
      declaration of an XML document.  Second, text/xml must be
      compatible with text/plain, since MIME agents that do not
      understand text/xml will fallback to handling it as text/plain.
      If the charset parameter for text/xml were not authoritative, such
      fallback would cause data corruption.
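The transcoding scenario in the quoted text can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample document, the `transcode` helper and the `try_decode` helper are hypothetical, and the "agent" is simulated by re-encoding the bytes without touching the encoding declaration.

```python
# Hypothetical sample document whose declaration says iso-2022-jp.
TEXT = '<?xml version="1.0" encoding="iso-2022-jp"?><p>\u65e5\u672c\u8a9e</p>'


def transcode(body: bytes, src: str, dst: str) -> bytes:
    """Re-encode the bytes from src to dst without looking at the content,
    as a MIME agent handling a generic text/* body might do."""
    return body.decode(src).encode(dst)


def try_decode(body: bytes, charset: str):
    """Decode body with charset; return None if the bytes are not valid."""
    try:
        return body.decode(charset)
    except UnicodeDecodeError:
        return None


original = TEXT.encode("iso-2022-jp")

# The agent rewrites the charset parameter to utf-8 and transcodes the body,
# but the declaration inside the document still says iso-2022-jp.
on_the_wire = transcode(original, "iso-2022-jp", "utf-8")

by_header = try_decode(on_the_wire, "utf-8")             # trust the charset parameter
by_declaration = try_decode(on_the_wire, "iso-2022-jp")  # trust the stale declaration
```

Decoding by the charset parameter recovers the original text, while decoding by the now stale declaration does not.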

The reasons (quoted above) are sound in theory but a real pain in
practice: it is often easy to put the right encoding declaration in
the XML file and hard to get the right encoding specified in the MIME
headers.
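A minimal sketch of the precedence rule, assuming a receiver that is handed the raw Content-Type header value and the body bytes. The function name `effective_encoding` is hypothetical. Falling back to the in-document declaration and then to UTF-8 when no charset parameter is present is an assumption made for illustration, and the declaration sniffing only works for ASCII-compatible encodings.

```python
import re
from email.message import Message

# Matches an encoding pseudo-attribute in an XML declaration at the very
# start of the body. Assumes the body is in an ASCII-compatible encoding.
_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def effective_encoding(content_type, body: bytes) -> str:
    """Pick the encoding to decode body with.

    The external charset parameter wins whenever it is present; the
    declaration inside the document is consulted only when it is absent.
    """
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()  # lower-cased, or None
        if charset:
            return charset
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # assumed default for this sketch
```

For example, `effective_encoding('text/xml; charset="utf-8"', body)` returns the header value even when `body` declares a different encoding, which is exactly the situation that hurts when the header is the part that is hard to get right.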





Copyright 1993-2007 XML.org. This site is hosted by OASIS