   RE: Comparison of URIs: Character encoding.

  • From: Mike Brown <mbrown@corp.webb.net>
  • To: 'Alan Kennedy' <alank@xhaus.com>
  • Date: Mon, 27 Nov 2000 14:28:53 -0700

> What character encoding should I use for encoding and decoding of
> escaped values in URIs?

See http://www.faqs.org/rfcs/rfc2396.html section 2.1
and http://www.faqs.org/rfcs/rfc2616.html section 3.2.3

Basically, RFC 2396 says that %xx represents an ASCII character when xx is
7F or below. For non-ASCII characters the charset is up to the scheme, and
there is no way to specify the charset in the URI itself.
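
To make the ambiguity concrete, here is a minimal Python sketch. The escaped
string is a made-up example, and the two charsets are only guesses a receiver
might try; nothing in the URI itself says which one is right.

```python
from urllib.parse import unquote, unquote_to_bytes

# Hypothetical escaped value; %E9 is above 7F, so its meaning is charset-dependent.
escaped = "caf%E9"

# The only thing the URI really carries is octets.
raw = unquote_to_bytes(escaped)

# Guess 1: the octets are ISO-8859-1, giving "caf" plus e-acute.
as_latin1 = unquote(escaped, encoding="iso-8859-1")

# Guess 2: the octets are UTF-8. A lone E9 is not valid UTF-8, so it
# becomes U+FFFD (the replacement character) under errors="replace".
as_utf8 = unquote(escaped, encoding="utf-8", errors="replace")

# Escapes at 7F or below decode the same way under either guess.
ascii_part = unquote("%41%42")
```

Both guesses start from the same octets; only the charset assumption differs.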

Unfortunately, RFC 2616 (HTTP/1.1) doesn't address the
charset-in-%xx-encoded-URIs issue at all. The section I referred to just
addresses a more basic issue about comparison of ASCII-based URIs.

http://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1 recommends that
HTML user agents, when interpreting URIs that appear in HTML attribute
values, assume UTF-8 by default, falling back on the encoded document's
charset if the URI doesn't resolve. But this only affects URIs that are
referenced in HTML documents.
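
A rough Python sketch of that "UTF-8 first, then the document's charset"
order, as I read it. The function name, the SAFE set, and the example URI are
all my own assumptions, not anything from the HTML spec, and actually trying
each candidate against a server is left to the caller.

```python
from urllib.parse import quote

# Assumption: leave URI delimiters and '%' alone so existing escapes survive.
SAFE = "/:?#[]@!$&'()*+,;=%"

def candidate_uris(href, document_charset):
    """Return escaped forms of href to try in order (hypothetical helper)."""
    # First choice: escape non-ASCII characters as UTF-8 octets.
    candidates = [quote(href, safe=SAFE, encoding="utf-8")]
    try:
        # Fallback: escape using the charset the document was encoded in.
        fallback = quote(href, safe=SAFE, encoding=document_charset)
    except UnicodeEncodeError:
        # Some character cannot be represented in that charset at all.
        return candidates
    if fallback not in candidates:
        candidates.append(fallback)
    return candidates

uris = candidate_uris("http://example.org/caf\u00e9", "iso-8859-1")
```

For a pure-ASCII href the two forms are identical, so only one candidate
comes back.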

Also, HTML user agents tend to submit HTML form data with the URL-encoding
based on the charset of the encoded document, possibly overridden by the
user, so if you're looking at a URI that was generated for an HTML form
submission, you can't assume that the %80-%FF sequences are necessarily
UTF-8.
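
As an illustration of the form-submission case, this Python sketch encodes
the same field value under two document charsets. The field name "q" and the
value are invented for the example; real browsers may behave differently in
detail.

```python
from urllib.parse import urlencode, unquote

# Hypothetical form field with one non-ASCII character (e-acute).
form = {"q": "caf\u00e9"}

# Same form, submitted from pages in two different charsets.
from_latin1_page = urlencode(form, encoding="iso-8859-1")
from_utf8_page = urlencode(form, encoding="utf-8")

# The receiver sees only the octets. The UTF-8 submission also decodes
# without error as ISO-8859-1, just to the wrong text, so a successful
# decode does not prove the guess was right.
wrong_guess = unquote("caf%C3%A9", encoding="iso-8859-1")
```

The two query strings differ even though the user typed the same thing, and
neither one says which charset produced it.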

Section 2.2 of http://www.faqs.org/rfcs/rfc2141.html says that in a URN, the
%xx sequences *definitely* represent UTF-8 octets. However, the urn scheme
has nothing to do with the http scheme.
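
For the URN case, where UTF-8 is fixed, decoding can be strict. A small
Python sketch; the function name and the "urn:example:" strings are
placeholders I made up for illustration.

```python
from urllib.parse import unquote

def decode_urn(urn):
    """Decode %xx escapes as UTF-8, raising if the octets are not valid UTF-8."""
    # errors="strict" turns a bad octet sequence into UnicodeDecodeError
    # instead of silently substituting a replacement character.
    return unquote(urn, encoding="utf-8", errors="strict")

decoded = decode_urn("urn:example:caf%C3%A9")

try:
    decode_urn("urn:example:caf%E9")   # lone E9 is not valid UTF-8
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

The same strictness would be wrong for an http URI, where the charset is not
known in advance.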

That's the best I can do...

   - Mike
Mike J. Brown, software engineer at            My XML/XSL resources:
webb.net in Denver, Colorado, USA              http://skew.org/xml/
