
   Re: [xml-dev] Re: URIs, concrete (was Re: [xml-dev] Un-ask the question)


I'm going to be an irritating little git, Uche.  Sorry.

On Sat, 2002-08-03 at 15:30, Uche Ogbuji wrote:
> [Amy wrote:]
> > Sorry, do we have any escaping rules?  I don't recall seeing such a
> > thing in the Namespaces rec (I'm not considering the anyURI type in W3C
> > XML Schema; does that have escaping rules?  Or interesting rules for
> > comparison?  *sigh*  Guess I'll go look ...).
> 
> Yes we do.  For example:
> 
> http://bête.com
> 
> Is an invalid URI, and thus an invalid namespace name.  It must be escaped to
> 
> http://b%eate.com
> 
> One thing I don't know is how this URI restriction interacts with the recent 
> opening up of DNS to i18n.

I can't actually find a justification for this.  It isn't in the
Namespaces recommendation, which is fairly silent on what a URI is. 
Instead, the recommendation points at RFC 2396.  Section 2 of RFC 2396
discusses representations of URIs, and the generalized escape mechanism.
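For anyone who wants to see the generalized escape mechanism in action, here is a minimal sketch using Python's standard urllib.parse. It is only an illustration: the choice of Latin-1 versus UTF-8 is an assumption of the sketch, not something RFC 2396 settles, and the variable names are made up for the example.

```python
from urllib.parse import quote

# The non-ASCII label from the quoted example.
label = "bête"

# Escaping works on octets, not characters, so the result depends on
# which character encoding is assumed (an assumption of this sketch).
latin1_escaped = quote(label, encoding="latin-1")
utf8_escaped = quote(label, encoding="utf-8")

print(latin1_escaped)  # b%EAte
print(utf8_escaped)    # b%C3%AAte
```

The Latin-1 result matches the %ea in the quoted example, apart from the case of the hex digits.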

It is important to note, however, that the RFC delegates *all* authority
over which characters are reserved in which components to the component
itself ... that is, to the subsection of the URI scheme's registration
specification that deals with that particular part of that particular
scheme.

Or, in other words, you may well have a requirement that a URI be
legal and valid, per the scheme's constraints, before it is transformed
into a namespace name.  Once it has been so transformed, it is not
possible to unescape it.  Since the escaping happens before a
namespace name can be used, and there is no valid unescape mechanism,
it does not make sense to speak of an escape mechanism.  What you
have, instead, is just a string of characters.  This string should
follow the rules to create a valid URI in some scheme, encoded for
computer-based transmission, but it doesn't matter, because the
Namespaces recommendation says you can't modify it, or interpret it, in
any useful fashion.

Note that your example, above, is an invalid URI for computer
transmission, but would be allowed, pretty explicitly, by RFC 2396.  So
blame the mess on TimBL, maybe.  But it seems fairly clear that there is
no two-way activity happening.  If you get something that contains
%61%6d%79, you are *not* allowed to read it as 'amy'.  The namespaces
recommendation gives you no permission to unescape the encoded
characters.
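A quick way to see the one-way behaviour is to hand both spellings to a namespace-aware parser. The sketch below uses Python's xml.etree.ElementTree purely as an illustration; the example.org namespace names are hypothetical, and other parsers may report names differently, though I would expect the comparison to come out the same.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Two documents whose namespace names differ only by percent-escaping.
# The example.org names are hypothetical.
doc_plain = '<x xmlns="http://example.org/amy"/>'
doc_escaped = '<x xmlns="http://example.org/%61%6d%79"/>'

tag_plain = ET.fromstring(doc_plain).tag
tag_escaped = ET.fromstring(doc_escaped).tag

print(tag_plain)    # {http://example.org/amy}x
print(tag_escaped)  # {http://example.org/%61%6d%79}x

# The parser treats the namespace name as an opaque string, so the two
# element names are different.
print(tag_plain == tag_escaped)  # False

# A URI library will happily unescape the octets, but nothing in the
# namespace processing above does so.
print(unquote("%61%6d%79"))  # amy
```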

Amy!
-- 
Amelia A. Lewis       amyzing@talsever.com      alicorn@mindspring.com
You like the taste of danger, it shines like sugar on your lips,
and you like to stand in the line of fire
just to show you can shoot straight from your hip.
There must be a 1000 things you would die for; 
I can hardly think of two.
		-- Emily Saliers




 


Copyright 2001 XML.org. This site is hosted by OASIS