OASIS Mailing List Archives


RE: [xml-dev] IRIs - Question

The XML Namespaces 1.0 specification says (section 2.1)

"An XML namespace is identified by a URI reference [RFC3986]"

That would make your namespace name incorrect. However, there is no rule in
the spec that says your document is not namespace-well-formed, and this is
not an accidental omission: there has been intensive debate on the subject.
I argued quite strongly that the spec should either make it mandatory for a
namespace name to be a valid URI, or should explicitly make it legal to use
any old character string that you fancy; but the WG in its wisdom, or more
likely in its lack of consensus, failed to take either of those options.

The vast majority of XML products do in fact allow you to use any old
character string that you fancy. An exception is XOM, which takes a rather
purist view (one which in my opinion is not justified by the
specifications), and will probably reject your namespace name.

XML Namespaces 1.1 allows the namespace name to be an IRI, which allows your
choice, but again it pointedly refuses to say that the document is
ill-formed if the name is not a valid IRI.
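
As an illustration of the point that most XML products accept any character
string as a namespace name, here is a minimal sketch using Python's standard
xml.etree.ElementTree module. The element names and the "job" prefix are
invented for the example; only the namespace name comes from the question.

```python
import xml.etree.ElementTree as ET

# Namespace name taken from the original question; it contains a non-ASCII
# character, so it is not a valid URI reference in the RFC 3986 sense.
NS = "http://großerJob.german.com"

# Hypothetical document: the element names and prefix are made up.
doc = f'<job:task xmlns:job="{NS}"><job:name>demo</job:name></job:task>'

# The parser does not check the namespace name against URI syntax; it simply
# carries the string through into the expanded element name.
root = ET.fromstring(doc)
print(root.tag)

# Lookups work as long as the same character string is supplied.
name = root.find(f"{{{NS}}}name")
print(name.text)
```

This says nothing about other products; it only shows one widely used parser
treating the namespace name as an opaque string.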

Pragmatically, (a) the specs refuse to make a clear unambiguous statement on
this issue, (b) your chosen namespace will work with nearly every popular
XML product, the only exception I know of being XOM, but (c) you could be
inviting unnecessary trouble due to character encoding issues.
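
To make point (c) concrete, here is a small sketch, again in Python, of how
the same name ends up as different strings depending on whether and how it is
percent-encoded. Namespace names are compared as character strings, so a tool
that silently percent-encodes the name produces a different namespace. The
element and prefix names are invented for the example.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET

literal = "http://großerJob.german.com"

# Percent-encode the non-ASCII character, leaving ":" and "/" untouched.
# The result depends on which character encoding is assumed.
utf8_form = quote(literal, safe=":/")                        # UTF-8 octets
latin1_form = quote(literal, safe=":/", encoding="latin-1")  # ISO 8859-1 octet

print(utf8_form)    # http://gro%C3%9FerJob.german.com
print(latin1_form)  # http://gro%DFerJob.german.com

# Two prefixes bound to the literal and the percent-encoded forms:
# the parser reports two different expanded names for "x".
doc = f'<r xmlns:a="{literal}" xmlns:b="{utf8_form}"><a:x/><b:x/></r>'
first, second = ET.fromstring(doc)
print(first.tag == second.tag)  # False
```

So if one tool stores the literal form and another stores an encoded form,
elements that were meant to be in the same namespace will not match.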

Outside the scope of namespaces, support for non-ASCII characters in URIs on
the web (that is, support for IRIs) seems very patchy. I did some
experiments for example creating HTML pages that link to the site
http://www.münchen.de/ (which redirects to www.muenchen.de), with and
without percent-encoding of the URI references, and the results were not

Michael Kay
> -----Original Message-----
> From: Ramkumar Menon [mailto:ramkumar.menon@gmail.com] 
> Sent: 23 April 2008 20:48
> To: xml-dev@lists.xml.org; xsl-list@lists.mulberrytech.com
> Subject: [xml-dev] IRIs - Question
> I have a WSDL/XSD file whose targetNamespace is 
> http://großerJob.german.com. The namespace URI contains 
> characters from the German language, as you can see.
> If I use a designer tool to view and validate this WSDL/XML, 
> what should be the behaviour?
> a) Give an error stating that the targetNamespace is not in 
> an anyURI format?
> b) Proceed to percent-encode it [as per UTF-8, maybe] and then 
> validate the URI?
> The confusion here is that viewing or interpreting the 
> WSDL/Schema in the tool would be difficult for a human if I 
> use percent encodings throughout the URIs. 
> Shouldn't the tools detect the character set and 
> encode it appropriately? For instance, if I take a printout 
> of the document, I would prefer to view the native language, 
> as opposed to the encoded URIs.
> Humans should be able to read the URI in the viewer in the 
> native language as-is, whereas any tools that intend to 
> process it should treat it as a set of octets, and handle 
> them accordingly.
> So the question is: should a designer tool emit errors when it 
> validates the document with the above behaviour?
> Please advise.
> Ram
> --
> Shift to the left, shift to the right!
> Pop up, push down, byte, byte, byte!
> -Ramkumar Menon
>  A typical Macroprocessor


Copyright 1993-2007 XML.org. This site is hosted by OASIS