Re: URI resolver was Re: RDDL and XML Schemas Proposed Recommendation
- From: Justin Couch <firstname.lastname@example.org>
- To: Jonathan Borden <email@example.com>
- Date: Sun, 25 Mar 2001 14:33:38 +1000
Jonathan Borden wrote:
> > This exists - the II URI resolver request. Is this URI equal/equivalent
> > to this other URI I've given you. Allows you to map URLs to URNs to
> > check equivalency.
> When you say 'exists': Is this something I can easily use and people can
> easily setup?
Potentially easy to set up, but to make this easy to explain, I think I
need to go back over how URIs work.
A URI is an identifier. All it does is say "Here is something that I
want to name". That name may or may not exist in the real world. For the
most part it is just a text string in memory. To find out what that
identifier actually points to we need to resolve it. This process
involves interpreting what the string means, stripping it into some
component parts and then using those to decide
a) How to look something up
b) What we want to look up for that identifier
c) Once we have looked it up, what protocol is needed to acquire it
d) Interpreting the raw content that was sent back to us
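As a rough illustration of the "strip it into component parts" step, here is a minimal Java sketch using the standard java.net.URI class. The class name UriParts and the split helper are made up for this example; splitting a URN into namespace and name at the first colon is a simplification, not a full URN parser.

```java
import java.net.URI;

class UriParts {
    /** Returns {scheme, namespace, name} for a URN, or {scheme, rest} for anything else. */
    static String[] split(String text) {
        URI uri = URI.create(text);
        String scheme = uri.getScheme();
        String rest = uri.getSchemeSpecificPart();
        if ("urn".equalsIgnoreCase(scheme)) {
            // For a URN, the part after "urn:" is "<namespace>:<name>".
            int colon = rest.indexOf(':');
            if (colon > 0) {
                return new String[] { scheme, rest.substring(0, colon), rest.substring(colon + 1) };
            }
        }
        return new String[] { scheme, rest };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", split("urn:isbn:0071348131")));
        System.out.println(String.join(" | ", split("http://www.j3d.org/")));
    }
}
```

The scheme answers "how to look something up"; the remainder is what gets handed to whichever resolver claims that scheme or namespace.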
The first step is called resolving. We need to ask "the system" to find
out what this identifier is so that we can do something with it. Inside
the system, it must ask around to find out who is willing to resolve the
string. This is termed the Resolver Discovery Service (RDS).  Once a
resolver is found we then pass it the string and also tell it exactly
how we want to resolve it. That is, we can ask it to supply us the
concrete object, or a reference to another equivalent resource or to
check if it is the same as some other resource.
One of the requirements of the RDS RFC is that a resolver is not
required to implement all request types. So you can set up a resolver
that only turns URNs into URLs, or one that only resolves URLs into hard
object instances. There are actually 9 different resolution
possibilities provided by RFC 2168.
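To make the "not required to implement all request types" point concrete, here is a hedged Java sketch of a resolver that advertises only a subset of services. The class PartialResolver is hypothetical, and the nine enum names are an assumption based on the usual I2x naming (with I=I written as I_EQ_I because Java identifiers cannot contain "="); check them against the RFC before relying on them.

```java
import java.util.EnumSet;

class PartialResolver {
    // Assumed names for the nine resolution services; not taken from any real library.
    enum Service { I2L, I2LS, I2R, I2RS, I2C, I2CS, I2N, I2NS, I_EQ_I }

    private final EnumSet<Service> supported;

    PartialResolver(EnumSet<Service> supported) {
        this.supported = EnumSet.copyOf(supported);
    }

    boolean supports(Service service) {
        return supported.contains(service);
    }

    /** Placeholder resolution: a real resolver would consult its database here. */
    String resolve(Service service, String uri) {
        if (!supports(service)) {
            throw new UnsupportedOperationException(service + " is not offered by this resolver");
        }
        return "(" + service + " result for " + uri + ")";
    }

    public static void main(String[] args) {
        // A resolver that only turns identifiers into locations.
        PartialResolver r = new PartialResolver(EnumSet.of(Service.I2L, Service.I2LS));
        System.out.println(r.supports(Service.I2L));
        System.out.println(r.supports(Service.I_EQ_I));
    }
}
```

A client would ask supports() first and move on to another resolver when the service it needs is missing.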
This system is already at work inside your web browser. A URL
is a type of URI. The browser effectively does the resolving step of
turning the URL you have typed into the location area into an object
instance. Although typically it isn't implemented in a full resolver
mode (have a look at java.net.URL for a good example), the basic
principles are followed. Decide what the string is, strip it to pieces,
contact the server with a given protocol and then interpret the bytes
coming back. This is a single service - I2R (Identifier to Resource).
The other 8 services are not implemented in your typical web browser.
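The single I2R step can be sketched with java.net.URL itself. To keep the example runnable without a network, it resolves a file: URL pointing at a temporary file; the class name SimpleI2R and the assumption that the bytes are UTF-8 text are mine, not part of any resolver spec.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SimpleI2R {
    /** Identifier to resource: let the protocol handler fetch the bytes, then interpret them. */
    static String fetchText(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            // Interpretation step: this sketch assumes the resource is UTF-8 text.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("i2r-demo", ".txt");
        Files.write(tmp, "hello from a resource".getBytes(StandardCharsets.UTF_8));
        URL url = tmp.toUri().toURL();
        // The protocol ("file" here) selects the handler used to acquire the bytes.
        System.out.println(url.getProtocol() + " -> " + fetchText(url));
        Files.delete(tmp);
    }
}
```

Swapping the file: URL for an http: one exercises the same path with a different protocol handler, which is about all a browser does for you.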
Now, getting back to the original point - is it hard to set one up? If
you know what you are doing, and understand the URI resolution mechanism -
no. It is as simple as adding two more records to your BIND resource
files and restarting named. The hard part is making sure you have the
infrastructure in place to deal with the requests.
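For a feel of what those two records might look like, here is a hedged zone-file fragment. The domain example.com, the host resolver.example.com, the port, and the choice of services are all placeholders, and the exact service string should be checked against the NAPTR RFC your resolver follows.

```
; NAPTR: says which resolution services are offered and where to look next
;                                 order pref flags service          regexp replacement
example.com.             IN NAPTR 100   10   "s"   "thttp+I2L+I2Ls" ""     _thttp._tcp.example.com.
; SRV: says which host and port actually run that service
;                                 priority weight port target
_thttp._tcp.example.com. IN SRV   0        0      80   resolver.example.com.
```

The records only advertise the resolver; the machine named in the SRV target still has to answer the requests.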
Firstly, your end user application has to know how to deal with generic
URIs. As I mentioned above, a typical web browser does not allow
plugging in of generic URI resolver DLLs. It requires some trickery to
get it to work properly. Think of how you might, from the GUI
perspective, ask it whether this URI is equivalent to that URI. For
custom application code you can use a third-party library. Some work has
been done on a generic URI resolver library for libc but AFAIK it hasn't
progressed much, if at all in the last few years. For the Java users,
you can play with my library.
Secondly you have to have some form of database set up to enable the
underlying requests to be resolved. Depending on the size of your
application, this can be a big task. Usually this is mapped to lots of
SQL requests. For example, resolving the URN urn:isbn:0071348131
requires someone to have an ISBN database that is publicly available to
check for information in. If you want equivalence checking then you will
probably need a good history of old stuff.
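A toy version of that back end, with an in-memory map standing in for the SQL database, might look like the Java sketch below. Everything here is illustrative: the class IdentifierTable, the books.example.com URL, and the idea of recording the hyphenated ISBN form as an older alias are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdentifierTable {
    // Canonical identifier -> known locations (what a "give me all URLs" request would return).
    private final Map<String, List<String>> locations = new HashMap<>();
    // Any identifier, including retired or alternate forms -> canonical identifier.
    private final Map<String, String> canonical = new HashMap<>();

    void register(String id, List<String> urls) {
        locations.put(id, List.copyOf(urls));
        canonical.put(id, id);
    }

    /** Records that oldId names the same resource as currentId (the "history of old stuff"). */
    void alias(String oldId, String currentId) {
        canonical.put(oldId, canonical.getOrDefault(currentId, currentId));
    }

    List<String> locationsOf(String id) {
        String key = canonical.get(id);
        return key == null ? List.of() : locations.getOrDefault(key, List.of());
    }

    /** Equivalence check: both identifiers must be known and share a canonical form. */
    boolean equivalent(String a, String b) {
        String ca = canonical.get(a);
        return ca != null && ca.equals(canonical.get(b));
    }

    public static void main(String[] args) {
        IdentifierTable table = new IdentifierTable();
        table.register("urn:isbn:0071348131", List.of("http://books.example.com/0071348131"));
        table.alias("urn:isbn:0-07-134813-1", "urn:isbn:0071348131");
        System.out.println(table.equivalent("urn:isbn:0-07-134813-1", "urn:isbn:0071348131"));
        System.out.println(table.locationsOf("urn:isbn:0-07-134813-1"));
    }
}
```

In a real deployment each map lookup becomes a query, which is where the "lots of SQL requests" come from.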
Next you need to put in the resolver front end. This could be DNS
(requires addition of SRV and NAPTR records) or if you have a much
simpler application then something like THTTP that does resolving over
HTTP requests would work fine (my library has a THTTP RDS in it and
there are a couple of libs floating around for it. IIRC Apache has one in
its contrib area).
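For the THTTP route, the request is an ordinary HTTP GET whose path names the service and whose query string carries the identifier. The Java sketch below only builds that URL; the host resolver.example.com is a placeholder, the helper name requestUrl is made up, and the /uri-res/ path layout and the N2L service name follow my recollection of the THTTP RFC, so verify both before use.

```java
class ThttpRequest {
    /** Builds a THTTP-style request URL: http://<host>/uri-res/<service>?<uri> */
    static String requestUrl(String resolverHost, String service, String uri) {
        return "http://" + resolverHost + "/uri-res/" + service + "?" + uri;
    }

    public static void main(String[] args) {
        // Ask a (hypothetical) resolver for one location of the ISBN URN.
        System.out.println(requestUrl("resolver.example.com", "N2L", "urn:isbn:0071348131"));
    }
}
```

Any HTTP client can then issue the GET; the resolver would normally answer with a redirect or a list of URIs, depending on the service requested.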
> Things like RDF allow you to make statements "about" resources without
> needing to resolve the URI.
Well that is an invalid statement. It is easy to prove it to be false
because you may say something and then have the resolver tell you it is
completely different. That is, these can only strictly be limited to a
collection of hints until an authoritative answer is given by actually
resolving a resource. For example, take that URN above: your RDF says that it
is a comic book. If you resolved it you would find it is a technical
book about Java network programming.
> The resource that a URI identifies is distinct
> from the "entity" that the URI may resolve to from time to time (the entity
> is not guarenteed to be constant e.g. a stock price --very unfortunately
> these days).
A URI can specify a live resource. There is nothing saying that it
cannot be returning a continuous stream. Just set the returned MIME
type to tell the user what kind of content it is. URIs are not required
to only resolve to text/plain; they can quite easily be video/mpeg or
some other x- type. If the URI resolves to an instantaneous stock
price, then find a better resolver. Find one that returns you a
reference to a stock ticker content type. The resolver's I2Xs services
allow you to resolve one identifier to multiple sources. To not do so is
a failure on the part of the application writer and service writer to
provide a reasonable service. What if your video server only returned
you one frame of video each time you queried it? That's not particularly
useful, is it? The identifier is resolved to a streamed, live resource.
No reason why your stock price should be treated any differently.
Typical RDS implementations used by URI users tend to use only
one RDS supplier - for example DNS. I've been doing work on and off over
the last couple of years specifying how to use multiple RDS suppliers in a
system. This acts in the same way that name lookups might use DNS, NIS or
/etc/hosts or any combination of these. My URI libraries are the only
ones capable of dealing with multiple RDS suppliers AFAIK.
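The multiple-supplier idea can be sketched as an ordered chain, much like a name-service switch. The Java below is an illustration only and does not reflect the API of the library mentioned here: ChainedRds is a hypothetical class, and each supplier is modelled as a function from a URI string to an optional resolver name.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class ChainedRds {
    private final List<Function<String, Optional<String>>> suppliers;

    ChainedRds(List<Function<String, Optional<String>>> suppliers) {
        this.suppliers = List.copyOf(suppliers);
    }

    /** Asks each supplier in order and returns the first resolver anyone offers. */
    Optional<String> findResolver(String uri) {
        for (Function<String, Optional<String>> supplier : suppliers) {
            Optional<String> hit = supplier.apply(uri);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // First a local table (think /etc/hosts), then a stand-in for a DNS-backed lookup.
        Function<String, Optional<String>> localTable =
            uri -> uri.startsWith("urn:isbn:") ? Optional.of("local-isbn-resolver") : Optional.empty();
        Function<String, Optional<String>> dnsLike =
            uri -> uri.startsWith("http:") ? Optional.of("dns-discovered-resolver") : Optional.empty();

        ChainedRds rds = new ChainedRds(List.of(localTable, dnsLike));
        System.out.println(rds.findResolver("urn:isbn:0071348131"));
        System.out.println(rds.findResolver("mailto:nobody"));
    }
}
```

Ordering the list is the policy decision: a local table placed first can override whatever the wider system would otherwise say.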
 OSS (LGPL) RFC compliant resolver library for all URI types -
Justin Couch http://www.vlc.com.au/~justin/
Freelance Java Consultant http://www.yumetech.com/
Author, Java 3D FAQ Maintainer http://www.j3d.org/
"Humanism is dead. Animals think, feel; so do machines now.
Neither man nor woman is the measure of all things. Every organism
processes data according to its domain, its environment; you, with
all your brains, would be useless in a mouse's universe..."
- Greg Bear, Slant