Re: [xml-dev] Why Use anyURI

What confusion?

The story of anyURI as I recall it (and I named it, IIRC) is roughly this:

1) 1990s standard URI syntax was complicated (%-encoding), unclear
(defined over pre-Unicode bytes, not characters), and antagonistic to i18n
(ASCII only for literal characters).

2) Browser address bars provided a friendlier syntax with these issues
resolved. When most people thought about URIs, they thought of the syntax
in the browser address bar and were confused by the restrictions of the
standard syntax. No spaces?

3) XML took the stand that it was entirely appropriate for people to type
system identifiers in markup using the looser syntax of the address bars.
I think it is entirely the right decision for markup standards to support
people cutting and pasting from the address bar directly into their
document. If you look in the XML recommendation on SYSTEM identifiers you
will see some fine wordsmithing to allow the non-standard syntax.

4) In about 2005, the URI gods allowed the IRI spec which added proper
internationalization. Kudos Martin. (The URI specs have been clarified too
of course.)

5) In about 2008, the URI gods allowed the LEIRI spec (Legacy Extended
IRIs), which allowed more of what address bars do. (Kudos Henry, Richard
and Norm.)

So the brief for anyURI was to cope with 1, 2, 3 without the benefit of 4
and 5 (but coping with what they would have to contain.) Hence the name
"anyURI":  the value space is any kind of URI reference, and the lexical
space (address bar/IRI/LEIRI/URI reference) is so diffuse that the XML
Schema group found it too hard or too low value to provide any checks.

So what should your reaction be on discovering that anyURI is so slack? It
should be first to check "Do I actually want an LEIRI, an IRI, a URI
reference, or anything URL-ish that any system might make me write?" Which
often comes down to "Do I want to allow cut and paste from address bars?
Is there value in internationalization for this (so that Chinese
writers/readers of the markup can see Chinese characters, for example)?"
Once you make that decision, then you are in a position to restrict the
lexical space of anyURI to what you want.
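
To make that decision concrete, here is a rough sketch in Python of the
kind of character-level check it implies. It is not part of XML Schema or
of any of the specs mentioned above; the function name classify and the
three labels are made up for illustration, and it only looks at which
characters appear, not at the full URI or IRI grammar. It also treats any
non-ASCII character as acceptable in an IRI, which is looser than the IRI
spec itself.

```python
import re

# ASCII characters that RFC 3986 permits in a URI reference: unreserved,
# reserved (gen-delims and sub-delims), plus "%" for percent-encoding.
_URI_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")

# A "%" that is not followed by exactly two hex digits is malformed.
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def classify(value: str) -> str:
    """Rough character-level classification (hypothetical helper).

    Returns "uri-reference", "iri-reference", or "other". "other" covers
    address-bar style strings, e.g. ones containing literal spaces.
    """
    if _BAD_PERCENT.search(value):
        return "other"
    if _URI_CHARS.fullmatch(value):
        return "uri-reference"
    # Approximation: accept any non-ASCII character, but require the
    # remaining ASCII characters to be legal URI characters.
    ascii_part = "".join(ch for ch in value if ord(ch) < 128)
    if _URI_CHARS.fullmatch(ascii_part):
        return "iri-reference"
    return "other"


if __name__ == "__main__":
    for sample in ("http://example.com/a%20b",
                   "http://example.com/中文",
                   "http://example.com/a b"):
        print(sample, "->", classify(sample))
```

If you decide you only want the strictest of these, a similar character
class could probably be written as a pattern facet on a simple type derived
from anyURI, though the regular expression syntax there differs slightly
from Python's.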

If this is confusing, then the problem should be resolved by having the
XML Schema WG put in a new built-in simple type derived from anyURI for,
say, IRIReference, rather than by removing anyURI. (The idea that anything
could be removed from XML Schemas is lovely and charming.)


>> In XSD 1.1 the WG admitted defeat and
>> changed the spec so the lexical space is exactly the same as xs:string.
>> The XQuery and XSLT specifications also treat xs:anyURI more-or-less as
>> equivalent to xs:string.
> Interesting... why not deprecate it?  Leaving it in is just going to
> repeatedly cause this confusion isn't it?
> --
> Andrew Welch
> http://andrewjwelch.com


Copyright 1993-2007 XML.org. This site is hosted by OASIS