OASIS Mailing List Archives

Re: URI resolver was Re: RDDL and XML Schemas Proposed Recommendation



On Sun, Mar 25, 2001 at 02:33:38PM +1000, Justin Couch wrote:
> Jonathan Borden wrote:
> > > This exists - the II URI resolver request. Is this URI equal/equivalent
> > > to this other URI I've given you. Allows you to map URLs to URNs to
> > > check equivalency.
> > 
> > When you say 'exists': Is this something I can easily use and people can
> > easily setup?
> 
> Potentially easy to setup, but to make this easy to explain, I think I
> need to go back over how URIs work.
> 
> A URI is an identifier. All it does is say "Here is something that I
> want to name". That name may or may not exist in the real world. For the
> most part it is just a text string in memory. To find out what that
> identifier actually points to we need to resolve it. This process
> involves interpreting what the string means, stripping it into some
> component parts and then using those to decide:
> 
> a) How to look something up
> b) What we want to look up for that identifier
> c) Once we have looked it up, what protocol is needed to acquire it
> d) Interpreting the raw content that was sent back to us
> 
> The first step is called resolving. We need to ask "the system" to find
> out what this identifier is so that we can do something with it. Inside
> the system, it must ask around to find out who is willing to resolve the
> string. This is termed the Resolver Discovery Service (RDS). [1] Once a
> resolver is found we then pass it the string and also tell it exactly
> how we want to resolve it. That is, we can ask it to supply us the
> concrete object, or a reference to another equivalent resource or to
> check if it is the same as some other resource.
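
As a rough illustration of the "strip it into component parts, then
find a resolver" steps quoted above, here is a minimal Python sketch.
The function names and the registry layout are hypothetical and exist
only for this example; they are not part of any RDS or DDDS
specification.

```python
from urllib.parse import urlsplit


def split_identifier(uri):
    """Split a URI into the parts a resolver could dispatch on."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme == "urn":
        # A URN has the shape urn:<namespace id>:<namespace specific string>.
        nid, sep, nss = parts.path.partition(":")
        if not sep or not nid or not nss:
            raise ValueError("malformed URN: " + uri)
        return {"scheme": "urn", "namespace": nid.lower(), "name": nss}
    return {"scheme": scheme, "authority": parts.netloc, "path": parts.path}


def find_resolver(registry, uri):
    """Pick a resolver from a hypothetical in-memory registry.

    The registry maps (scheme, namespace) pairs to resolver names;
    a namespace of None acts as the fallback for the whole scheme.
    """
    parts = split_identifier(uri)
    specific = (parts["scheme"], parts.get("namespace"))
    fallback = (parts["scheme"], None)
    return registry.get(specific) or registry.get(fallback)


# Hypothetical registry contents, for illustration only.
REGISTRY = {
    ("urn", "isbn"): "isbn-resolver",
    ("http", None): "plain-http-fetch",
}

print(find_resolver(REGISTRY, "urn:isbn:0071348131"))
print(find_resolver(REGISTRY, "http://www.ietf.org/internet-drafts/"))
```

A real discovery step would ask the network rather than a local
dictionary; the sketch only shows the parse-then-dispatch shape.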

Because the original Resolver Discovery Service has begun to be used
outside of simple URI resolution, we have since renamed it the Dynamic
Delegation Discovery System. (It's being used to turn telephone numbers
into URIs and for discovering transport protocols for transport-agnostic
object exchange.)
The documents describing the service are due to be published
as RFCs soon. You can read their draft form at:
http://www.ietf.org/internet-drafts/draft-ietf-urn-ddds-04.txt
http://www.ietf.org/internet-drafts/draft-ietf-urn-uri-res-ddds-03.txt
http://www.ietf.org/internet-drafts/draft-ietf-urn-dns-ddds-database-04.txt

> One of the requirements of the RDS RFC is that a resolver is not
> required to implement all request types. So you can set up a resolver
> that only turns URNs into URLs, or one that only resolves URLs into hard
> object instances. There are actually 9 different resolution
> possibilities provided by RFC 2168.

Yep. You can get as complex or as simple as your application needs.
You also have the situation where some URI schemes are simply not
resolvable in a globally authoritative sense. GUIDs are a good
example since it's such a flat space you can't really build an
authoritative and exhaustive database of them. 

> Inside your web browser, there is already this system happening. A URL
> is a type of URI. The browser effectively does the resolving step of
> turning the URL you have typed into the location area into an object
> instance. Although typically it isn't implemented in a full resolver
> mode (have a look at java.net.URL for a good example), the basic
> principles are followed. Decide what the string is, strip it to pieces,
> contact the server with a given protocol and then interpret the bytes
> coming back. This is a single service - I2R (Identifier to Resource).
> The other 8 services are not implemented in your typical web browser.

In some sense you could interpret the 30x series of HTTP responses
as an I2L service. It's a shame we never implemented I2LS (one URI to
a list of equivalent ones) in HTTP....
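
To make the "30x as I2L" reading concrete, here is a small Python
sketch. It treats any 3xx response that carries a Location header as the
answer to an identifier-to-location request. The function name is
hypothetical, and the mapping is only the informal interpretation
suggested above, not something HTTP itself defines.

```python
def i2l_from_redirect(status, headers):
    """Return the location a 30x response points at, or None."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    if 300 <= status < 400:
        # Some 3xx responses (304, for instance) carry no Location header,
        # in which case there is no location to report.
        return normalised.get("location")
    return None


# Hypothetical response data, for illustration only.
print(i2l_from_redirect(302, {"Location": "http://mirror.example.org/doc"}))
```

An I2LS equivalent would need to return a list of locations, which a
single Location header cannot express.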

> Now, getting back to the original point - is it hard to set one up? If
> you know what you are doing, and understand the URI resolution mechanism -
> no. It is as simple as adding two more records to your BIND resource
> files and restarting named. The hard part is making sure you have the
> infrastructure in place to deal with the requests. 
> 
> Firstly, your end user application has to know how to deal with generic
> URIs. As I mentioned above, a typical web browser does not allow
> plugging in of generic URI resolver DLLs. 

Actually IE does. But it means you have to insert yourself in between
IE and its URI plugin interface which means you have to duplicate
everything Microsoft is doing. It isn't trivial. ;-)

> It requires some trickery to
> get it to work properly. Think of how you might, from the GUI
> perspective, ask it whether this URI is equivalent to that URI. For
> custom application code you can use a third-party library. Some work has
> been done on a generic URI resolver library for libc but AFAIK it hasn't
> progressed much, if at all in the last few years. For the Java users,
> you can play with my library [2].

We are releasing a URN resolver plugin for IE sometime in May that
does this step for asking for locations when given the 'urn:' URI
scheme. It can be generalized to generic URI resolution but it's still
a question of where you put it into the browser. Putting it into a parser
is rather easy since it's probably something like a fancy entity resolver.

> Secondly you have to have some form of database set up to enable the
> underlying requests to be resolved. Depending on the size of your
> application, this can be a big task. Usually this is mapped to lots of
> SQL requests. For example, resolving the URN  urn:isbn:0071348131 
> requires someone to have an ISBN database that is publicly available to
> check for information in. If you want equivalence checking then you will
> probably need a good history of old stuff.

Yep. The complexity depends on the set of URIs you are trying to 
provide information for. For 'http:' URIs the 'database' can be
your document store and your metadata can come from any number
of vocabularies such as RDDL or some of the stuff from RDF...
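
As a toy version of the database step, the following Python sketch
resolves an ISBN URN against an in-memory dictionary after checking the
ISBN-10 check digit. The catalogue contents and the example.org URL are
made up for illustration; a real resolver would sit in front of an
actual ISBN database.

```python
def isbn10_is_valid(isbn):
    """Check the ISBN-10 check digit (weights 10 down to 1, modulo 11)."""
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char in "0123456789":
            value = int(char)
        elif char == "X" and position == 9:
            # 'X' stands for the value 10 and is only legal as the last digit.
            value = 10
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def resolve_isbn_urn(urn, catalogue):
    """Return the locations recorded for an ISBN URN (possibly empty)."""
    scheme, nid, nss = (urn.split(":", 2) + ["", ""])[:3]
    if scheme.lower() != "urn" or nid.lower() != "isbn":
        raise ValueError("not an ISBN URN: " + urn)
    if not isbn10_is_valid(nss):
        raise ValueError("bad ISBN-10 check digit: " + nss)
    return catalogue.get(nss, [])


# Hypothetical catalogue standing in for a real ISBN database.
CATALOGUE = {"0071348131": ["http://books.example.org/0071348131"]}

print(resolve_isbn_urn("urn:isbn:0071348131", CATALOGUE))
```

Equivalence checking would need more than this lookup table, since it
depends on keeping a history of older identifiers.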

> Next you need to put in the resolver front end. This could be DNS
> (requires addition of SRV and NAPTR records) or if you have a much
> simpler application then something like THTTP that does resolving over
> HTTP requests would work fine (my library has a THTTP RDS in it and
> there are a couple of libs floating around for it. IIRC Apache has one in
> its contrib area).
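
For the THTTP option, a request is just an HTTP GET against a
well-known path on the resolver. The Python sketch below builds such a
request URL, assuming the /uri-res/<service>?<uri> convention as I
understand it from RFC 2169. The resolver host is hypothetical, the
function name is made up for this example, and the URI is passed through
without any extra escaping, which is a simplification.

```python
def thttp_request_url(resolver_base, service, uri):
    """Build the URL for a THTTP-style resolution request.

    resolver_base -- base URL of the resolver (hypothetical host here)
    service       -- resolution service name, such as "I2L" or "N2L"
    uri           -- the identifier to resolve, passed through unescaped
    """
    return "{}/uri-res/{}?{}".format(resolver_base.rstrip("/"), service, uri)


print(thttp_request_url("http://resolver.example.org/", "I2L",
                        "urn:isbn:0071348131"))
```

A location-style request would then typically be answered with a
redirect, which ties back to the 30x discussion earlier in the thread.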
> 
> > Things like RDF allow you to make statements "about" resources without
> > needing to resolve the URI. 
> 
> Well that is an invalid statement. It is easy to prove it to be false
> because you may say something and then have the resolver tell you it is
> completely different. That is, these can only strictly be limited to a
> collection of hints until an authoritative answer is given by actually
> resolving a resource. For example, that URN above, your RDF says that it
> is a comic book. If you resolved it you would find it is a technical
> book about Java network programming.

And this is an important point. The URI Resolution application finds
_authoritative_ information about the URI and/or its Resource. If
you want to find information that is simply someone else's opinion about
the URI then that is what a few of us are calling Contextualization.
It's the resolution and/or use of metadata about a URI within some
non-authoritative context. But that is a different story....

> > The resource that a URI identifies is distinct
> > from the "entity" that the URI may resolve to from time to time (the entity
> > is not guaranteed to be constant e.g. a stock price --very unfortunately
> > these days).
> 
> A URI can specify a live resource. There is nothing saying that it
> cannot be returning a continuous stream. Just set the MIME type returned
> to the user to indicate the correct information. URIs are not required
> to only resolve to text/plain, they can quite easily be video/mpeg or
> some other x- scheme. If the URI resolves to an instantaneous stock
> price, then find a better resolver. Find one that returns you a
> reference to a stock ticker content type. The resolver I2Xs schemes
> allow you to resolve one identifier to multiple sources. To not do so is
> a failure on the part of the application writer and service writer to
> provide a reasonable service. What if your video server only returned
> you one frame of video each time you queried it? That's not particularly
> useful, is it? The identifier is resolved to a streamed, live, resource.
> No reason why your stock price should be treated any differently.

Yep. It's the old story of does the URI identify the thing in the box
or the box itself. The answer is that you should have three URIs: one
for the box, one for whatever is currently in the box, and possibly
several for the thing that is the contents. If your URI identifies
an abstract thing that can change its representation then you have
to deal with that in your application or pick a different URI.

As a point of information: this entire space is something that the 
W3C URI Interest Group is looking at as an area of joint work with
the IETF. The goal is to look at standardizing some vocabularies and
services more concretely so that it's easier to describe the entire
space as a single infrastructure service. I think our deadline is
the end of next month so you should be seeing something shortly....

-MM

-- 
--------------------------------------------------------------------------------
Michael Mealling	|      Vote Libertarian!       | www.rwhois.net/michael
Sr. Research Engineer   |   www.ga.lp.org/gwinnett     | ICQ#:         14198821
Network Solutions	|          www.lp.org          |  michaelm@netsol.com