On Tuesday 29 October 2002 16:33, Paul Prescod wrote:
> Alaric Snell wrote:
> >...
> >
> > But you can never escape the old problem of the two armies; if you send a
> > single request packet, never hear anything back, and retransmit for days
> > without ever hearing back, you still don't know if the remote server got
> > the packet or not.
>
> That's a brutal problem and it is a perfect example why it would be
> incredibly wasteful to write code designed for the local case as if it
> were designed for use over a network...which is what it sounds to me
> like you were proposing when you started talking about trappable SEGV
> signals. It simply doesn't make sense to write local code as if it were
> remote code, nor vice versa.
The same problem occurs with local code. You can execute a local procedure
call that never returns, and not know if it's completed or not. Power fails
sometimes :-)
This is why we have transactions and rollback and all that.
You don't need to always explicitly deal with this because often the default
behaviour - roll over and die with an error message or a puff of smoke - is
OK. But when you want to do things where that becomes a problem you need to
use transactions, requiring bidirectional communication to agree on something
getting done. There are still cases where two-phase commit or even committing
to a local disk can fail, but the protocols reduce the chances to one in
trillions, and if that's a problem you can go up to N-phase commit...
But it's all the same to the application code. All of this is an issue for
the systems software you're working on.
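To make that concrete, here's a minimal sketch of the usual trick (the Account
interface and all the names are invented for illustration, not from any real
library): give each logical operation its own ID so the server can discard
replays, then retransmit until acknowledged:

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UID;

// Hypothetical remote interface: the server remembers which request IDs
// it has already applied, so a retransmitted request becomes a no-op.
interface Account extends Remote {
    void credit (UID requestId, int amount) throws RemoteException;
}

class RetryingClient {
    static void creditWithRetry (Account acct, int amount) throws InterruptedException {
        UID requestId = new UID ();  // one ID per logical operation, not per attempt
        for (int attempt = 0; attempt < 5; attempt++) {
            try {
                acct.credit (requestId, amount);
                return;  // acknowledged: both ends now agree it happened exactly once
            } catch (RemoteException e) {
                Thread.sleep (1000L << attempt);  // back off, then retransmit
            }
        }
        // still ambiguous: fall back to rolling over and dying with an error message
        throw new RuntimeException ("no acknowledgement; outcome unknown");
    }
}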
> > Programming languages that have exceptions don't need to hide the network
> > when doing RPC. The RPC calls can just throw an extra RemoteException if
> > there's a networking problem, and bob's your uncle; we're exposing the
> potential unreliability of the network, and nobody ever said that any
> > method call has to be particularly fast anyway so we've nothing to hide
> > in the latency department!
>
> The question isn't whether the reliability is exposed through the API.
> The question is whether the application and API are *architected* around
> the potential for failure and latency. If they are, then they will be
> too much of a hassle for local use. This is all documented carefully in
> "A Note On Distributed Computing".
>
> You can certainly build decent protocols on top of RPC -- but only by
> sacrificing the very thing that made RPC so nice: the fact that it makes
> remote procedure calls look like local ones! An HTTP defined on top of
> RPC would still be HTTP. But HTTP API calls do not look much like
> procedure calls.
Depends on your API! int result = uri.doPost ({name: "Alaric Snell"})...
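Something like this would do; the Resource class is made up for the sake of
argument, it's just plain java.net underneath, with IOException playing the
RemoteException role:

import java.io.*;
import java.net.*;

// Hypothetical wrapper that makes an HTTP POST read like a procedure
// call while still exposing network failure as an exception.
class Resource {
    private final URL url;
    Resource (String uri) throws MalformedURLException { url = new URL (uri); }

    String doPost (String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection ();
        conn.setRequestMethod ("POST");
        conn.setDoOutput (true);
        OutputStream out = conn.getOutputStream ();
        out.write (body.getBytes ());
        out.close ();
        BufferedReader in = new BufferedReader (new InputStreamReader (conn.getInputStream ()));
        StringBuffer response = new StringBuffer ();
        String line;
        while ((line = in.readLine ()) != null) response.append (line);
        in.close ();
        return response.toString ();
    }
}

Calling it is then just new Resource ("http://example.com/people").doPost
("name=Alaric+Snell") - URL invented, obviously - and the network's
unreliability is right there in the throws clause.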
Java programs are architected around the potential for failure: they have
exceptions. As for latency... well, local disks and awkwardly long
computations or awkwardly large files or processors slower than the ones the
software was written for are as bad a source of latency as networks, and the
fact that it's not handled so well in non-distributed apps is something that
I'd moan about anyway. Windows machines really do grind when I throw
multi-gigabyte CSV files into Excel; I don't mind that it takes forever, but
it'd be nice if I could still use the machine for other things while it's at
it :-)
> But more to the point, HTTP is optimized for handling networking issues
> like lost messages and latency. It defines idempotent methods carefully
> (these help reliability). It defines cachable methods carefully (these
> help latency).
Yep. But that's not something orthogonal to RPC. There aren't enough RPC
systems that deal with these issues well, certainly, but there's nothing
stopping one being written; it's not a problem with RPC itself, just
implementations that were designed in a world with less globe-spanning
networking going on.
> > And as for the REST argument that 'there are only a few methods, GET and
> > POST and others'... I think it's wrong to say that GET and POST are
> > methods in the same sense that getName () and getAge () are; in HTTP you
> > would do GETs on separate URLs for the name and the age, or GET a single
> > URL that returns both name and age. In no way has GET replaced getName
> > and getAge. HTTP's GET and POST and so on correspond more to an RPC
> > protocol's 'INVOKE' operation than to the application-level getName ()
> > and getAge ().
>
> You can surround the issue with logical sophistry but it doesn't change
> the core, central fact: any "application" in the world can invoke GET on
> any URI without previous negotiation.
Likewise with RPC applications!
Even in scummy Java RMI, you can write a tool like:
...regexps or whatever to split out a URI like "rmi:<registry name>.<method>"
taken from the command line...
import java.rmi.*;                // Naming, Remote
import java.lang.reflect.Method;
Remote o = Naming.lookup (registryName);                        // the <registry name> part
Method m = o.getClass ().getMethod (methodName, new Class[0]);  // the <method> part: zero-arg, so an empty Class[]
Object result = m.invoke (o, new Object[0]);                    // the actual remote call
System.out.println (result.toString ());
That tool can be compiled up and then used to call any zero argument getter
method anywhere. If you want to get into ones with arguments then there's a
UI issue of setting up arbitrary Java objects for parameters, but it's still
doable. You should really call result.toString () in a sandbox, too, since
it'd be arbitrary code, but I'm leaving that out for the sake of ease.
Et voila - an RMI browser!
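Assuming it got compiled up as, say, RMIBrowser, and given some invented
service name:

java RMIBrowser "rmi://bookserver.example.org/catalogue.getName"

...and it prints whatever the remote getName () returns.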
> There are dozens of standards and
> tools with support for that built in, starting with HTML, through XLink,
> through the semantic web technologies, through XSLT and XPointer,
> through the stylesheet PI, through Microsoft Office (including, I'd
> wager XDocs), through caches and browsers.
Pretty much every Unix system in the world has ONC RPC in it, and every
"Java-enabled platform" can do RMI, et cetera.
> That will never be true for getName and getAge. That makes GET
> inherently superior to getName and getAge.
But GET is not on the same level as getName () and getAge ().
Under HTTP you get GET and PUT and POST and OPTIONS and HEAD and all that.
Under RMI protocols you have INVOKE and AUTHENTICATE and PING and all that.
One does not say "I will GET amazon" or "I will GET a book from amazon"; one
GETs a URL that returns some aspect of information about a book.
With RMI, one INVOKEs a method that returns some aspect of information about
a book.
That's all there is to it! RMI protocols, like HTTP, have a few operations to
perform meta-actions like OPTIONS and HEAD. HTTP has about three different
forms of invocation, GET / POST / PUT (although PUT is potentially redundant
with POST around), while most RMI protocols just do one; but as I've said
before, I've found it trivial to add metainformation to RPC interfaces
specifying that methods are idempotent or cacheable, which the runtime uses
to do what HTTP does with GET and POST.
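For instance (the RpcRuntime hooks below are entirely made up, just to show
the shape of the idea), the metainformation can be declared alongside the
interface and consulted by the stub layer:

import java.rmi.Remote;
import java.rmi.RemoteException;

interface Book extends Remote {
    String getName () throws RemoteException;       // idempotent and cacheable, like GET
    void setName (String n) throws RemoteException; // neither, like POST
}

// Hypothetical runtime hooks: the stub layer consults these flags to
// decide, per method, whether it may retry on timeout and whether it
// may serve a cached result.
interface RpcRuntime {
    void markIdempotent (Class iface, String method);
    void markCacheable (Class iface, String method, int maxAgeSeconds);
}

class BookMeta {
    static void declare (RpcRuntime runtime) {
        runtime.markIdempotent (Book.class, "getName");
        runtime.markCacheable (Book.class, "getName", 60);
    }
}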
So GET is not comparable with getName (). Under HTTP, the application-level
action you're performing and the object you're performing it on are all
stuffed into the URL. With RPC, you have two explicit fields in the request
packet for this, that's all. Your application does not have to have getName
() hard coded into it any more than your HTTP client needs to have a URL
encoded into it. Many Web browsers these days will convert "amazon" into
"http://www.amazon.com/"; just so, an RPC system browser used for debugging,
when presented with an object identifier but not told which method to invoke,
could invoke a default method such as "getInterface ()" to get a description
of the object's interface (just like WSDL). A user interface system I
designed over RPC, when presented with an object identifier, calls
"getUserInterface ()" first to try to find an explicitly-defined UI for the
object; if that fails because the method is not implemented, it calls
"getInterface ()" to get a raw list of methods and presents the user with a
list (it's a bit like an HTTP server giving you a directory listing in the
absence of index.html).
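In code the fallback is tiny; every type here is hypothetical, it's the
design that matters:

import java.rmi.RemoteException;

// Invented types for this sketch only:
interface ObjectRef {
    Object invoke (String method) throws RemoteException;  // call by name, no arguments
}
class MethodNotImplementedException extends RemoteException {}

class UiBuilder {
    // Try the explicitly-defined UI first; fall back to the raw method
    // list, like an HTTP server falling back to a directory listing
    // when there's no index.html.
    Object describe (ObjectRef obj) throws RemoteException {
        try {
            return obj.invoke ("getUserInterface");
        } catch (MethodNotImplementedException e) {
            return obj.invoke ("getInterface");
        }
    }
}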
> Even if one does not buy the
> entire REST argument, it amazes me that there are still people out there
> arguing that it is better to segment the namespace rather than have a
> single unified namespace for retrieving all information.
HTTP: "http://<server>/<path>/<file>?<args>"
RMI: "//<server>/<object>.<methodname>(<args>)"
Still four components in each name; about equally segmented if you ask me :-)
> You talk about
> reinventing wheels and learning from the past. Surely the ONE THING we
> learned from the massive success of the Web is that there should be a
> single namespace.
Not quite true... there's confusion with URIs being URNs or URLs; perhaps
there are two namespaces that shouldn't have been merged.
One namespace for locating resources was invented long before the Web, and
the Web doesn't even do too great a job of this universal namespace thing;
they just took a lot of namespaces and federated them into URLs with the
scheme part. You can't rely on much more than "http:" being supported in any
given situation, so it's a fragmented namespace. This was presumably done to
enable HTTP to lever itself over FTP and Gopher by interoperating with them
on a level playing field, but now it's less applicable.
In the ISO OSI world you have X.500 for URLs and OIDs for URNs.
X.500 is cursed with an ugly name syntax, sadly, for it's otherwise a great
resource-locating namespace: a single arrangement with a single protocol to
implement, so none of this "which schemes does this implementation support"
stuff.
OIDs are, somewhat controversially, purely numeric - this decision was taken
to prevent the various problems with names that can be trademarked to nastily
restrict interoperability, and with names that have to be changed because the
company involved got bought or whatever. Although this makes them less
friendly for programmers, it avoids a lot of nasty pitfalls.
> Paul Prescod
ABS
--
A city is like a large, complex, rabbit
- ARP