XML.org: OASIS Mailing List Archives
Re: [xml-dev] IRIs - Question

Michael Kay wrote:
> The XML Namespaces 1.0 specification says (section 2.1)
>
> "An XML namespace is identified by a URI reference [RFC3986]"
>
> Which would make your namespace name incorrect. However, there is no rule in
> the spec that says your document is not namespace-well-formed; and this is
> not an accidental omission, there has been intensive debate on the subject.
> I argued quite strongly that the spec should either make it mandatory for a
> namespace name to be a valid URI, or should explicitly make it legal to use
> any old character string that you fancy; but the WG in its wisdom, or more
> likely in its lack of consensus, failed to take either of those options.
>
> The vast majority of XML products do in fact allow you to use any old
> character string that you fancy. An exception is XOM, which takes a rather
> purist view (one which in my opinion is not justified by the
> specifications), and will probably reject your use of
> http://großerJob.german.com 
>
> XML Namespaces 1.1 allows the namespace name to be an IRI, which allows your
> choice, but again it pointedly refuses to say that the document is
> ill-formed if the name is not a valid IRI.
>
> Pragmatically, (a) the specs refuse to make a clear unambiguous statement on
> this issue, (b) your chosen namespace will work with nearly every popular
> XML product, the only exception I know of being XOM, but (c) you could be
> inviting unnecessary trouble due to character encoding issues.
>
> Outside the scope of namespaces, support for non-ASCII characters in URIs on
> the web (that is, support for IRIs) seems very patchy. I did some
> experiments for example creating HTML pages that link to the site
> http://www.münchen.de/ 

The non-ASCII character here is not part of the path but of the
internationalized domain name (IDN). Before the domain name goes onto
the wire, the client performs IDN processing, which yields an ASCII
representation of the domain name. See the article
http://www.w3.org/International/articles/idn-and-iri/
for more explanations and examples.
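
As a rough illustration of that client-side step, here is a minimal Python sketch. It assumes Python's built-in "idna" codec, which implements the RFC 3490 (IDNA 2003) ToASCII operation; the helper name to_ascii_host is made up for this example, and a real browser may do more (or apply a newer IDNA revision).

```python
def to_ascii_host(host: str) -> str:
    """Convert an internationalized host name to its ASCII (Punycode) form.

    Hypothetical helper for illustration; relies on Python's built-in
    "idna" codec (RFC 3490 ToASCII, applied label by label).
    """
    return host.encode("idna").decode("ascii")


# The host from the quoted message: only the non-ASCII label changes.
ascii_host = to_ascii_host("www.münchen.de")
print(ascii_host)  # www.xn--mnchen-3ya.de
```

The ASCII form is what a conforming client puts into the DNS lookup and the HTTP Host header; the "ü" never reaches the wire as such.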

> (which redirects to www.muenchen.de), with and
> without percent-encoding of the URI references, and the results were not
> encouraging.
>   

Such results are probably due to a missing IDNA implementation
rather than to problems in IRI processing. IDNA is defined in
http://www.rfc-editor.org/rfc/rfc3490.txt and others, IRI in
http://www.rfc-editor.org/rfc/rfc3987.txt
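
To make the distinction concrete, here is a hedged Python sketch that treats the two parts of an IRI differently: the host goes through IDNA, while non-ASCII characters in the path, query and fragment are percent-encoded as UTF-8, roughly in the spirit of the IRI-to-URI mapping in RFC 3987. The function name iri_to_uri and the example path are assumptions for illustration; the sketch ignores userinfo and other corner cases and is not a complete implementation of either RFC.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding; "%" is kept so that
# existing escapes are not encoded a second time.
_SAFE = "/%:@!$&'()*+,;=?"


def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (illustrative sketch only)."""
    parts = urlsplit(iri)

    # Host: IDNA (RFC 3490) via Python's built-in codec, not percent-encoding.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += f":{parts.port}"  # userinfo, if any, is dropped in this sketch

    # Everything else: UTF-8 bytes of non-ASCII characters, percent-encoded.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://www.münchen.de/"))
# http://www.xn--mnchen-3ya.de/
print(iri_to_uri("http://www.münchen.de/straße"))
# http://www.xn--mnchen-3ya.de/stra%C3%9Fe
```

A client that percent-encodes the host instead of applying IDNA, or that has no IDNA support at all, will fail on the first example even if its handling of the path is fine.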

Felix

Copyright 1993-2007 XML.org. This site is hosted by OASIS