xml-dev - RE: [xml-dev] URIs harmful

To: "Tim Bray" <tbray@textuality.com>,<xml-dev@lists.xml.org>
Subject: RE: [xml-dev] URIs harmful
From: "Joshua Allen" <joshuaa@microsoft.com>
Date: Fri, 26 Jul 2002 02:09:50 -0700
Thread-index: AcI0L3vviQTgn7KJTv+8zM3LDTrNRgASbpKg
Thread-topic: [xml-dev] URIs harmful
> There is no way in the existing architecture of the Web to find out what
> the resource *is*.  There is no way to tell whether you're talking about
> a time-varying bag of HTML bits, or the organization that xml-dev exists

Wrong.  The web architecture has a thing called URIs, which are intended
to unambiguously identify things.  Axioms 1 and 2a of web architecture.

> Thus, claiming that your first assertion above is talking about the web
> page is simply without basis in the Web architecture.  The assertion is

It identifies *something*.  There is nothing in web architecture which
says that it *can't* identify a web page.  I was simply pointing out
that *if* I decide to make it identify the web page, I avoid some
serious problems.

> about the resource identified by the URI and (thank heavens) does not
> depend on what the resource is, for any given meaning of "is".

It had better be about the same resource as the resource identified when
somebody else uses the same URI.  The only time it doesn't matter is
when you have only HTTP, because HTTP never actually depends on the
resource -- HTTP only interacts with "representation proxies".

> If the working of RDF depends on an assumption that a resource *is* a
> bag of bits, then it's simply broken.

Nobody ever said that.  First, I don't believe that an http: URL
identifies a "bag of bits".  It identifies an active agent which returns
hypermedia representations in a synchronous fashion.  Second, the
semantic web doesn't depend on RDF.  The semantic web *does* depend on
axioms 1 and 2a of web architecture.

>    http://www.w3.org TimsProperties:Is TimsTaxonomy:VendorConsortium
> 
> and build a set of useful inferences from there.  Alternately, I could
> assert
> 
>    http://www.w3.org TimsProperties:Is TimsTaxonomy:HypertextDocument
> 
> and build on that.  Not only am I saying things about meaning, I'm doing
> so in a way that loads smoothly into databases and supports all sorts of
> useful automatic processing.

First, you are completely violating Axioms 1 and 2a of web architecture,
by making the identification function of a URI be dependent on
contextual information.

Second, you haven't escaped the problem at all.  You are still defining
your words in terms of other words.  You still need to achieve global
consensus on *some* set of globally recognized words.  How do you
achieve consensus on what TimsProperties:Is is?  For your URIs to be
useful globally, they would need to be hoisted or bootstrapped out of
your local context with at least a core set of globally unambiguous
terms.  But what would you call *those* terms?  Global Resource
Identifiers?  I am sorry to say it, but you can't have a conversation
unless you have words.  And words need to exist *before* dictionaries
can exist.

Third, your identifiers are only meaningful within your particular local
context, so sharing them among other contexts will require translations.
This is exactly the same as if the WWW were built from a bunch of
different hypertext systems (hypercard, etc.) with proprietary gateways
used to translate between them.  It is theoretically possible, but it
would have destroyed adoption, opened the door for vendors to profit
from lock-in, and would not scale.  The same with the semantic web.  If
you require something as basic as *words* to endure context switches
which have drastic impact on meaning, you might as well not have words
at all.

> > And if you start saying that http://www.w3.org IS the W3C, things that
> > are perfectly reasonable and logical before such as "the owner of
> > http://www.w3.org" become muddled and suspicious.
> 
> I think there is no evidence to support the paragraph above.  If it
> meets my needs to use that URI to denote an organization, and I have
> RDF properties whose domain is "organizations", why can't I go ahead and
> do this?  The domain of *your* "ownerIs" property may be web pages, and
I will repeat for the millionth time, you CAN go ahead and do that.
It's not a matter of law or coercion.  I CAN go ahead and call myself
king of San Francisco if I want.  Nobody can stop me.  I CAN go ahead
and say "bad" every time I mean "good".  Nobody can stop me.  In fact,
both things could be perfectly logical and useful in particular local
contexts, like getting drunk with friends.

However, IF I want people to be able to globally understand me, I will
try to use words that have more globally accepted meaning.  YOU can do
differently if you want.  Just don't expect many people to understand.
And for God's sake PLEASE don't run around giving people advice about
how to choose words for maximum audience comprehension. 

> thus your assertion is logically inconsistent with my statements which
> treat the URI as representing the organization.  What is the problem
> with this?  Surely nobody imagines that the universe of RDF properties
> are all mutually consistent?

Just because you can say things that are inconsistent with the way that
most other people use a word does not mean that consistency in word use
is not critical.  Assertions themselves are guaranteed to be messy and
contradictory.  But the actual *subjects* of those assertions (and
predicates and objects) need to be as unambiguous as possible.

> In fact, I suspect that with a little study, you could build some RDF
> properties that link from your assertions to mine, working around the
> inconsistency.  Paraphrasing into English "if w3.org has an owner (in

This is simple for the triples basis of the semantic web.  I expect that
there will be all sorts of techniques for resolving inconsistencies in
systems of assertions, identifying assertions which should be discarded
as untrustworthy, and so on.  I can think of many that are easily
implemented with today's technology.
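
As a rough illustration of one such technique (a sketch only, not something
proposed in this thread): store each assertion as a triple plus the source
that made it, flag subject/predicate pairs that carry more than one object,
and keep the assertion from the most trusted source.  The source names, the
trust scores, the "ex:" terms, and the assumption that each predicate should
have a single value are all hypothetical.

```python
from collections import defaultdict

# Each assertion is (subject, predicate, object, source).
# The first two triples echo the ones quoted earlier in this message;
# the sources, the "ex:" terms and the trust scores are invented.
ASSERTIONS = [
    ("http://www.w3.org", "TimsProperties:Is", "TimsTaxonomy:VendorConsortium", "source_a"),
    ("http://www.w3.org", "TimsProperties:Is", "TimsTaxonomy:HypertextDocument", "source_b"),
    ("http://www.w3.org", "ex:ownerIs", "ex:SomeOwner", "source_b"),
]
TRUST = {"source_a": 0.4, "source_b": 0.9}  # hypothetical trust scores


def find_conflicts(assertions):
    """Return {(subject, predicate): objects} where more than one object is asserted."""
    grouped = defaultdict(set)
    for subject, predicate, obj, _source in assertions:
        grouped[(subject, predicate)].add(obj)
    return {key: objs for key, objs in grouped.items() if len(objs) > 1}


def discard_untrusted(assertions, trust):
    """Keep, per (subject, predicate), only the assertion from the most trusted source.

    Assumes every predicate is single-valued, which real vocabularies
    often are not.
    """
    best = {}
    for assertion in assertions:
        subject, predicate, _obj, source = assertion
        key = (subject, predicate)
        if key not in best or trust[source] > trust[best[key][3]]:
            best[key] = assertion
    return list(best.values())


conflicts = find_conflicts(ASSERTIONS)
kept = discard_untrusted(ASSERTIONS, TRUST)
```

Note that both functions group assertions by comparing the subject URI as a
plain string, so they only make sense if the same URI denotes the same
thing in every assertion.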

But that is all completely out the window if there is no effort to make
*words* have unambiguous meaning.  When two different people use the
same word/URI, they need to have reasonable confidence that they are
talking about the same thing.  Such a basic token of meaning exists in
every human communication system, and it MUST exist.  If you want to
tell me that a URI is not really a "reasonable confidence" standalone
identifier, then you need to invent a "reasonable confidence" standalone
identifier.  I pity the person who tries to build a working
communication system in which there is no reasonable confidence that an
identifier will have the same meaning between references.

> But if you try to base anything on claims concerning what a resource
> *is*, you're off on the wrong foot.

I wish people would not misrepresent my point.  I am not saying that
anyone should try to *mandate* any of this.  You can still be completely
agnostic to the definition of what a resource *is* and recognize the
wisdom of Axioms 1 and 2a.  It DOESN'T MATTER what resource is
identified by a word/URI, so long as two people using the word/URI
independently can have reasonable confidence that they are talking about
the SAME resource.
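
A minimal sketch of why that is enough, assuming two hypothetical authors
who each publish triples without coordinating: merging their statements is
a set union, and the statements line up purely because the subject strings
are equal.  Nothing in the code asks what the resource *is*.  The "ex:"
predicates and the values are invented for illustration.

```python
# Two independently authored sets of (subject, predicate, object) triples.
# All predicates and values here are made up.
GRAPH_ONE = {
    ("http://www.w3.org", "ex:title", "example title"),
}
GRAPH_TWO = {
    ("http://www.w3.org", "ex:lastChecked", "2002-07-26"),
    ("http://example.org/other", "ex:title", "another title"),
}


def merge(*graphs):
    """Union the triples; no negotiation about what any URI denotes."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def describe(graph, uri):
    """Return every (predicate, object) pair asserted about `uri`, sorted."""
    return sorted((p, o) for s, p, o in graph if s == uri)


combined = merge(GRAPH_ONE, GRAPH_TWO)
about_w3 = describe(combined, "http://www.w3.org")
```

The merged description is only meaningful if both authors used the URI for
the same resource; the code has no way to detect it if they did not.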

Anything that works against this reasonable confidence or encourages
people to do things which shake this reasonable confidence is poor form.
But I am not even suggesting that we mandate against people having poor
form.  The people with poor form will most likely just get left behind,
and will not *hurt* the semantic web.  I am simply suggesting that W3C
lead by example, and refrain from doing things like saying that http range
is infinite when:

A) nobody cares about range of http, since HTTP in practice is only
capable of dealing with representations anyway, so the actual resource
is a vanity which requires no consensus
B) doing so guarantees that people will get confused about *what* is
actually being identified (we have ample proof of this with the namespaces
fiasco, which confuses the crap out of everyone).

IMO, it would be far more useful to publish some guidance which says:

"If two people independently use the same URI, absent any other
contextual information, they should be able to have a reasonable degree
of confidence that they are identifying the same resource.  URI naming
schemes should be selected carefully, and people are strongly
discouraged from naming resources in a manner that works against this
confidence."

That is completely adequate, IMO, and from that people can draw their
own conclusions about whether or not such reasonable confidence of
"sameness" is provided by using http: identifiers to identify a beach.
Follow-Ups:
- Re: [xml-dev] URIs harmful
  - From: "Jonathan Borden" <jborden@attbi.com>
- RE: [xml-dev] URIs harmful
  - From: Bill de hÓra <bill.dehora@propylon.com>