   Comparison of URIs: Character encoding.

  • From: Alan Kennedy <alank@xhaus.com>
  • To: xml-dev@lists.xml.org
  • Date: Sun, 26 Nov 2000 23:06:41 +0000

Hello again,

Another question about identifiers, this time URIs.

I need to compare URIs, both as SYSTEM identifiers and Namespace
identifiers. The question I need to answer is this:-

What character encoding should I use for encoding and decoding of
escaped values in URIs?
For example: if I see "%7e" ("~" in US-ASCII) in a URI, which character
encoding should I use to decode it to a single character for
comparison purposes? What about "%e9" ("e-acute" in "iso-8859-1")?
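
To make the decoding question concrete, here is a minimal sketch using Python's urllib.parse.unquote. The choice of Python and the variable names are assumptions for illustration only; nothing here settles which encoding a URI processor is supposed to use.

```python
from urllib.parse import unquote

# "%7e" is a byte in the US-ASCII range, so ASCII-compatible
# encodings all decode it to the same character.
tilde_latin1 = unquote("%7e", encoding="iso-8859-1")
tilde_utf8 = unquote("%7e", encoding="utf-8")

# "%e9" is the single byte 0xE9. In iso-8859-1 that is e-acute.
# In UTF-8 it is an incomplete sequence, which unquote() turns into
# U+FFFD (the replacement character) under its default errors="replace".
e9_latin1 = unquote("%e9", encoding="iso-8859-1")
e9_utf8 = unquote("%e9", encoding="utf-8")

print(repr(tilde_latin1), repr(tilde_utf8))
print(repr(e9_latin1), repr(e9_utf8))
```

So the "%7e" case compares the same either way, while the result for "%e9" depends entirely on the encoding assumed.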

Another example: if I see a non-US-ASCII character in a URI,
say "ü" ("u-umlaut"), should I escape that as "%fc", as in
"iso-8859-1"? Or should I be using UTF-8?
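
The two candidate escapings can be sketched the same way with Python's urllib.parse.quote. Again this is only an illustration under assumed names; it shows what each encoding would produce, not which one is correct.

```python
from urllib.parse import quote

u_umlaut = "\u00fc"  # u-umlaut

# iso-8859-1 represents the character as the single byte 0xFC.
latin1_escaped = quote(u_umlaut, encoding="iso-8859-1")

# UTF-8 represents it as the two bytes 0xC3 0xBC.
utf8_escaped = quote(u_umlaut, encoding="utf-8")

print(latin1_escaped, utf8_escaped)
```

Note that quote() emits upper-case hex digits, so the first result is spelled "%FC" rather than "%fc". The two escaped forms differ in length as well as value, so a plain string comparison of URIs escaped under different encodings would not match.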

Or is there no such universal mapping?

Again, TIA for any enlightenment.


