Re: Modularity (was: Linkbases,Topic Maps,and RDF Knowledge Bases -- help me understand, please)

> > > > For example, in XLink <URL:
> > > > http://www.foo.com/doc.xml#bar > and <URL:
> > > > http://www.foo.com/doc.xml#xpointer(id('bar'))
> > > > mean the same thing, but in RDF they're different.
> > >
> > > I really wish that, on the Web and everywhere else,
> > > addressing (e.g., URIs) were entirely functionally
> > > distinct from the many applications of addressing
> > > (e.g. XLink and RDF).

[John Cowan:]

> Well, but remember the price of making these the same
> in XLink.  XLink systems can't answer a simple
> question like "Do you and I link to the same resource?"
> We might, but if we specify it using distinct names,
> as above, we aren't going to know except by trying.
> In RDF, the answer is simple to find and does not
> require actually fetching the resource.

Actually, fetching the resources does not, in the general
case, permit us to determine that the resources have the
same identity.  Regardless of whether the returned data
match, we cannot be certain that the resources have the same
*identity*.  For example, if the resources are files, and if
their contents happen to match, they could simply be copies
of one another, or they could both simply be empty.  If the
data that are returned from the two requests to what is
really one and the same addressable resource do *not* match,
it could be in the nature of that resource that no two
fetches will ever be the same; one example is an atomic
clock at some observatory; another example is any Web site
that reports its own hit counter.  The clock is one and the
same information resource, even if you reference that clock
using ten different addressing expressions (such as ten
different URIs), and even though every time you resolve any
reference to the clock -- even using the same addressing
expression (URI) -- you get something different back from
it.
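
The point can be sketched in Python.  This is only an
illustration under stated assumptions: `FileResource` and
`ClockResource` are hypothetical in-memory stand-ins, not real
servers, and the example.org addresses are made up.  It shows
that matching content does not establish shared identity, and
that differing content does not rule it out.

```python
import itertools


class FileResource:
    """Hypothetical stand-in for a stored file; fetch() returns its bytes."""

    def __init__(self, content: bytes):
        self.content = content

    def fetch(self) -> bytes:
        return self.content


class ClockResource:
    """Hypothetical stand-in for a clock or hit counter: every fetch differs."""

    def __init__(self):
        self._ticks = itertools.count(1)

    def fetch(self) -> bytes:
        return str(next(self._ticks)).encode()


# Two distinct resources that happen to be exact copies (here, both empty).
original = FileResource(b"")
copy = FileResource(b"")
same_content = original.fetch() == copy.fetch()   # True
same_identity = original is copy                  # False

# One resource reached through two different (made-up) addresses.
clock = ClockResource()
addresses = {
    "http://example.org/clock": clock,
    "http://example.org/time": clock,
}
first = addresses["http://example.org/clock"].fetch()
second = addresses["http://example.org/time"].fetch()
clock_same_content = first == second              # False
clock_same_identity = (
    addresses["http://example.org/clock"] is addresses["http://example.org/time"]
)                                                 # True
```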

Even though it may seem strange to be worried about whether
two different addresses reference one and the same
addressable resource, this is not as trivial an issue as one
might think.  The determination of the *identity* of a
resource -- the fact that its host server knows it to be a
single unique data source -- is very important for the
long-term future.  In an increasingly electronic
civilization, we must be able to invoke all kinds of
meanings reliably and unambiguously when we, our systems,
our organizations, and our agents speak to each other
electronically.  Addressable resources are necessarily the
things on which we must hang such meanings; only addressable
resources combine all of these features: they can describe
themselves (return data), be described externally, be
addressed, and, most fundamentally, have identity.

The design of the topic maps paradigm recognizes this need,
and it provides a way forward.  In topic maps, addressable
resources serve as "binding points" for subjects (ideas,
meanings, etc.).  

  Note: In order to understand the rest of this discussion,
  please be informed that an addressable resource can either
  be a "subject indicator" -- it somehow describes the
  subject for which it serves as a binding point -- or it
  can be a "subject constituter" (aka "addressable
  subject"), in which case the resource itself *is* the
  subject.  (This distinction is basic topic map stuff.)

For example, if a single addressable resource serves as the
"subject constituter" of two topics, then those topics must
be merged into a single topic by the processing system,
because they have one and the same subject.  However, for
the same reason, if two topics have as their "subject
constituters" (aka "addressable subjects") two different
addressable resources, the two topics do *not* have the same
subject, and they must *not* be merged, even if the two
resources are exact copies of one another.
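
The merging rule just described might be sketched as follows.
This is a minimal illustration, not any topic map engine's
actual API: the `Topic` class, its field names, and the
example.org addresses are all hypothetical, and identical
address strings are assumed to stand in for "one and the same
addressable resource".

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Address of the resource that *is* the subject (hypothetical field name).
    subject_constituter: str
    names: set = field(default_factory=set)


def merge_topics(topics):
    """Merge topics whose subject constituter is the same addressable resource.

    Assumption: identical address strings mean "same resource".  Topics with
    differing addresses are never merged, even if the addressed resources
    happen to be exact copies of one another.
    """
    merged = {}
    for topic in topics:
        key = topic.subject_constituter
        if key in merged:
            merged[key].names |= topic.names
        else:
            merged[key] = Topic(key, set(topic.names))
    return list(merged.values())


topics = [
    Topic("http://example.org/doc.xml", {"name A"}),
    Topic("http://example.org/doc.xml", {"name B"}),
    Topic("http://example.org/copy-of-doc.xml", {"name C"}),
]
result = merge_topics(topics)  # two topics remain; the copy stays separate
```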

  Note: Similar concerns apply to "subject indicators", but
  it's simpler to have this discussion in terms of "subject
  constituters" because then we don't have to account for
  the additional complicating possibility of human
  interpretation of the content of the subject indicators.

Theoretically, at least, it doesn't matter how many
different ways a particular addressable resource is
addressed.  In practice, however, computers attempting to
process topic maps will want to know whether, in fact, all
these addressing expressions refer to one and the same
addressable resource.

Here's an easier case.  We have two non-relative URI
strings, and when we compare them, they match exactly.  It's
then safe to assume that the two URIs refer to one and the
same addressable resource.  Here, the only reason we
would need to interact with the server of that resource is
to find out whether any addressable resource exists at the
address indicated by the matching URIs.  We would not need
to fetch the content of the resource to establish its
identity.
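
The easy case can be sketched in Python using only the
standard library.  The function name `same_resource` is
hypothetical, and returning `None` for non-matching strings is
an assumption chosen to reflect that differing URIs may still
address one and the same resource.

```python
from typing import Optional
from urllib.parse import urlsplit


def same_resource(uri_a: str, uri_b: str) -> Optional[bool]:
    """Return True if two non-relative URI strings match exactly, else None.

    None means "cannot tell from the strings alone": differing strings may
    still address the same resource, and only its server could say so.
    """
    for uri in (uri_a, uri_b):
        # A URI with no scheme is relative; this sketch does not resolve it.
        if not urlsplit(uri).scheme:
            raise ValueError(f"relative URI not supported: {uri!r}")
    return True if uri_a == uri_b else None


# The two URIs quoted at the top of this message.
bare_name = "http://www.foo.com/doc.xml#bar"
xpointer = "http://www.foo.com/doc.xml#xpointer(id('bar'))"

same_resource(bare_name, bare_name)  # True
same_resource(bare_name, xpointer)   # None
```

No network access is involved; the comparison works on the
strings alone.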

It will require a more sophisticated Web paradigm (such as
The Semantic Web?) to allow a server to report that




address one and the same piece of stored data (addressable
resource).  In the general case, there is no substitute for
the ability to get such reports from servers, and only
servers can give such reports.  In the general case, the
retrieval of any other data from the server is irrelevant to
the question of identity.

Until we can make these kinds of identity determinations on
the (Semantic?) Web, we'll have to make do with simple kinds
of URIs for our subject binding points.  It's a situation we
can live with for a while, but we'll want more power
eventually, and the need for this higher power level will
impose new requirements on servers.


Steven R. Newcomb, Consultant

voice: +1 972 359 8160
fax:   +1 972 359 0270

405 Flagler Court
Allen, Texas 75013-2821 USA