- From: Tim Bray <tbray@textuality.com>
- To: <xml-uri@w3.org>
- Date: Mon, 15 May 2000 23:00:09 -0700
At 09:45 AM 5/15/00 -0400, Tim Berners-Lee wrote:
>This is a list set up - possible for a short term - to hold the discussion
>of whether XML namespaces should be URIs.
I am very busy and may not be able to keep up with the xml-uri mailing list,
although I promise to check in as time allows.
I apologize in advance for the length of this message. We're all busy but this is important. But here's a summary:
1. URIs are sound in their design, just as TimBL claims
2. Namespaces set out to solve the problem of naming things, no more, and
they succeeded
3. It is reasonable to want more from namespace names, and for this
reason, the fact that they are syntactically URIs is good, as it leaves
open the door for the building of the Semantic Web
4. It is wrong to compromise the basic utility of namespaces by imposing
strict URI-ness on them
5. The use of relative URI references as namespace names is wrong and
dangerous and should be, at the least, deprecated
I think TimBL's framing of the question, as quoted above, is very apt and
cuts to the heart of things. The current live issue in the W3C is much
narrower - what to do about relative URI references - but probably can't be
solved without some deep thinking about the relationship between namespace
names and URIs. Note that the issue, while narrow, is important: first,
because there are W3C recommendations in the field which are inconsistent
on this point, and second, because there are other recommendations, notably
the DOM, which are hung up pending its resolution.
1. URIs Are Just Fine
To open, it should be said that nobody in this debate (as far as I can tell)
has so far challenged the basic soundness of the URL system of resource
addressing; for my money, it's one of the shining proofs of the virtues of
razor-edge simplicity in the history of technology. Further, and in
particular, nobody has challenged the virtue or utility of relative URI
references; anybody who does not use them is probably building fragile web
sites.
2. Namespaces Are Just Trying to be Names
If I may be pardoned for wordiness, let me quote the first three paragraphs
of the namespace spec:
We envision applications of Extensible Markup Language (XML) where a single
XML document may contain elements and attributes (here referred to as a
"markup vocabulary") that are defined for and used by multiple software
modules. One motivation for this is modularity; if such a markup vocabulary
exists which is well-understood and for which there is useful software
available, it is better to re-use this markup rather than re-invent it.
Such documents, containing multiple markup vocabularies, pose problems of
recognition and collision. Software modules need to be able to recognize
the tags and attributes which they are designed to process, even in the
face of "collisions" occurring when markup intended for some other software
package uses the same element type or attribute name.
These considerations require that document constructs should have universal
names, whose scope extends beyond their containing document. This
specification describes a mechanism, XML namespaces, which accomplishes
this.
The only problem the namespace spec set out to solve was that of naming.
My assertion is simply a statement of verifiable historical fact.
Here is a test case that really crystallizes the problem, for me: suppose I
have invented a handy new XML language, TML, for some purpose of my own that
is not material here. Suppose TML is to contain some structural elements
that are document-centric - for example bulleted lists. Suppose also that I
must also embed some mathematical formulae. Suppose finally that I want to
include a few graphs.
Today, thanks to the good work of the W3C and the simple use of namespaces,
this is pretty easy. The HTML, MathML, and SVG vocabularies respectively
have well-known namespace names, and there are good and free implementations
of software that does useful work with all three vocabularies. It is thus
very easy for me to write code that dispatches to the appropriate software.
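A minimal sketch of this kind of dispatch, assuming Python's standard xml.etree.ElementTree parser. The TML namespace name (http://example.org/tml), the sample document, and the handler functions are hypothetical stand-ins; the other three namespace names are the ones the W3C publishes for XHTML, MathML, and SVG. Dispatch keys on the namespace name as a literal string and retrieves nothing.

```python
import xml.etree.ElementTree as ET

# Namespace names published by the W3C, used here purely as literal strings.
XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical handlers standing in for real HTML, MathML, and SVG software.
HANDLERS = {
    XHTML_NS: lambda element: "html",
    MATHML_NS: lambda element: "mathml",
    SVG_NS: lambda element: "svg",
}


def split_tag(element):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    tag = element.tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def dispatch(root):
    """Return (local name, handler result) for each element in a known namespace."""
    results = []
    for element in root.iter():
        namespace, local = split_tag(element)
        handler = HANDLERS.get(namespace)  # plain dictionary lookup on the string
        if handler is not None:
            results.append((local, handler(element)))
    return results


# Hypothetical TML document mixing the three vocabularies.
DOC = """<tml xmlns="http://example.org/tml"
     xmlns:h="http://www.w3.org/1999/xhtml"
     xmlns:m="http://www.w3.org/1998/Math/MathML"
     xmlns:s="http://www.w3.org/2000/svg">
  <h:ul><h:li>item</h:li></h:ul>
  <m:math><m:mi>x</m:mi></m:math>
  <s:svg/>
</tml>"""

if __name__ == "__main__":
    for local, kind in dispatch(ET.fromstring(DOC)):
        print(local, "->", kind)
```

Elements in the TML namespace itself fall through untouched, since no handler is registered for that name.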
3. Should We Want More?
This is a huge step forward, and it works today. Without namespaces it
wouldn't work. Is that enough?
Maybe not; the published namespaces for most XML dialects do not support
direct retrieval of machine-usable semantics for these dialects. Assuming
such specifications exist, and we can all agree that their arrival is a
worthwhile goal, making it easy to retrieve them would be a wonderful thing.
For this reason, it is good, I think, that namespace names are URIs, rather
than, say, Java package pathnames, because it leaves open the possibility of
an automated, machine-readable and machine-usable Web; the Semantic Web.
I have occasionally griped that we should have used the Java package naming
syntax, and it certainly would have avoided some of the pain we're now in,
but I'm not really serious; I really do believe in a future Semantic Web,
and URIs are the right way to stitch it together, via, I believe, some sort
of packaging mechanism or other way to achieve the necessary and formalized
levels of indirection. [Claim: content-negotiation is not enough].
4. Keep Namespaces Working as Intended While Building the Semantic Web
But let us also not discard the great virtue of namespaces, the purpose they
were designed to fulfill, that of names for vocabularies.
If we decree, now, that namespace names really are URLs, then I argue that
the simple design goal of dispatching software to markup based on its
universal name is grievously compromised. Here's why:
One of the crucial (and I think good) aspects of the URL is its syntactic
opacity. Nothing very meaningful can be said about a resource, at any level,
based on its URL, until you retrieve it. This is not just a theological
point, but a deep one that has been learned at great cost by anyone who has
tried to implement a server, or a browser, or a spider, while ignoring it.
As we all know, the same URL can return different resources in successive
microseconds; at the same time, there are arbitrarily many different URLs
that can, when dereferenced, deliver the same resource.
Given this, if a namespace name is really a URL in all its important
respects, then the actual contents of the string aren't important at all;
if I want to use it to dispatch to software in the intended way, I'd really
have to dispatch based on the contents of the resource that is yielded by
dereferencing it.
So for the time being, I think we have to, for the purposes of software
dispatching, treat namespace names in the way the namespace spec specifies,
namely as literal strings. Any attempt to be smart about this leads down
the slippery slope of having to dereference it and dispatching based on the
contents.
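To make the literal-string rule concrete, here is a small sketch in Python. The namespace spec treats two namespace names as identical only when they match character for character; the loosely_normalized helper below is a hypothetical example of "being smart" (it lowercases the scheme and host, which URL processing treats as case-insensitive) and is shown only for contrast. The example.org names are made up.

```python
from urllib.parse import urlsplit


def same_namespace(a, b):
    """Namespace names match only if the strings are character-for-character equal."""
    return a == b


def loosely_normalized(uri):
    """Hypothetical 'smart' comparison key: lowercase the scheme and host.

    This is only one of many normalizations a URL processor might apply,
    which is exactly why it makes a poor basis for dispatch.
    """
    parts = urlsplit(uri)
    return parts._replace(
        scheme=parts.scheme.lower(),
        netloc=parts.netloc.lower(),
    ).geturl()


# Two made-up names that a URL processor would likely treat as equivalent.
NAME_A = "http://example.org/tml"
NAME_B = "HTTP://Example.ORG/tml"

if __name__ == "__main__":
    print(same_namespace(NAME_A, NAME_B))
    print(loosely_normalized(NAME_A) == loosely_normalized(NAME_B))
```

As namespace names the two strings differ, even though they would most likely retrieve the same resource. Once normalization is allowed, there is no obvious place to stop short of dereferencing.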
This doesn't bother me; I think that the basic URI design is flexible enough
that we can, for now, use URLs as names without closing off any significant
doors for the development of the Semantic Web.
5. Relative URI References are Lousy Namespace Names
And finally, the pointy end of the question now jabbing the XML community in
various tender and embarrassing places: what about relative URI references?
If I may quote tediously again from the namespace recommendation:
The namespace name, to serve its intended purpose, should have the
characteristics of uniqueness and persistence.
Relative URI references have many virtues; but they do not include either
uniqueness or persistence. Working with them underlines, if it were needed,
the point I made above: you really can't tell anything useful by examining a
URI as a string; you have to go get the resource.
Thus it is, in my view, a huge bug that the Namespace recommendation doesn't
forbid the use of relative URI references. There are only two consistent
ways to deal with this bug:
- try to kill it retroactively by deprecating the use of relative URIs
as namespace names. In this case "deprecating" covers a spectrum of
tactics ranging from warnings at the weak end, through a commitment to
avoid ever doing this in the W3C's work, to some attempt to rewrite
history and retroactively ban these things.
- say they're OK because namespace names really are URIs, and relative
references are well-proven and known to be good practice. The tactics
here also occupy a spectrum, ranging at the weak end from canonicalizing
away such usages as foo/././././bar through expanding them by applying
the BASE uri (if you happen to know it) to requiring that the resource be
retrieved and the dispatching based on it rather than its identifier.
For my money only the last of these is consistent.
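The loss of uniqueness can be illustrated with a short sketch, assuming Python's urllib.parse.urljoin as the resolver. The relative name "schemas/tml" and both base URIs are hypothetical. The same relative reference expands to different absolute URIs depending on where the document happens to live, and a dot-laden path such as foo/././././bar collapses to a different string once canonicalized.

```python
from urllib.parse import urljoin

# Hypothetical relative URI reference used as a namespace name.
RELATIVE_NAME = "schemas/tml"

# Two hypothetical locations holding copies of the same document.
BASE_ORIGINAL = "http://example.org/docs/a.xml"
BASE_MIRROR = "http://example.com/mirror/a.xml"


def absolutize(base, relative=RELATIVE_NAME):
    """Expand a relative namespace name against a document's base URI."""
    return urljoin(base, relative)


if __name__ == "__main__":
    # The same document, copied elsewhere, now claims a different namespace.
    print(absolutize(BASE_ORIGINAL))
    print(absolutize(BASE_MIRROR))
    # Canonicalizing dot segments also changes the literal string.
    print(absolutize("http://example.org/", "foo/././././bar"))
```

Under literal string comparison the two copies agree on "schemas/tml"; after expansion they disagree. Neither answer makes the name both unique and persistent.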
6. Conclusion
In re-examining TimBL's message to which this is a response, it seems that
I've spent little time addressing his points. That's because I disagree
with so few. Yes, URIs are a central component of the Web Architecture;
there is no other reasonable way to contemplate pulling together the Web of
tomorrow; and great caution is to be advised in their use.
TimBL and I are in substantial agreement that vocabularies need to be
connected to the web, and the value of so doing will increase as we learn
how to package up semantics in more and better declarative forms. There is
lots of room for disagreement over the relative value of content-negotiation
versus indirection via manifest, but that's just engineering tactics.
There's one key point of difference in play here; I think it's OK to, for
the moment, use URIs just as names, in parallel with figuring out how to
build the Semantic Web. TimBL sees this as deeply broken.
But in the here and now, those of us who build software for a living really
do need cheap, lightweight ways to name markup vocabularies. If we have to
dereference them to use them, we can't use them. Please don't take them
away from us. -Tim
***************************************************************************
This is xml-dev, the mailing list for XML developers.
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************