   Re: Ontologies

  • From: Ronald Bourret <rpbourret@rpbourret.com>
  • To: "W. E. Perry" <wperry@fiduciary.com>
  • Date: Wed, 20 Dec 2000 11:15:26 -0800

Could you explain at a technical level what is happening? It sounds like
a node contains code to extract whatever it needs from the input, but
that this code must be written explicitly. That is:

   "The point is that in all these cases there is processing
    code--unique to each node--to be written, and nothing will
    magically obviate that chore."

Is this correct?

If so, why doesn't this contradict your earlier statement that:

   "It is just as futile to try solving this problem by creating
    a separate API for each class of node-to-node pair interactions."

since the node would need to contain different processing code for each
node from which it could possibly extract useful data to use as input?

-- Ron

"W. E. Perry" wrote:
> Martin Bryan wrote:
> > Walter Perry wrote:
> >
> > > In fact, aren't we ready to go the whole way and acknowledge that the
> > > ontologies addressable at the nodes of a semantic web must in fact *be*
> > > executable against the various inputs we put to them?
> >
> > Not unless we agree a fixed API for returning node data in a given language.
>
> No. As I argue regularly at length (and will spare you the full version here),
> the scalability and adaptability of a semantic web (or whatever we call this
> network of autonomous nodes) depends in the first instance on each node's
> ability to handle a significant variation of input presentation. Considered from
> its own point of view, each node implements one or more processes. Within its
> autonomous black box, that node knows what input it requires at the threshold of
> executing a process and it knows the output which successful completion of that
> process produces. From the node's viewpoint, the forms of both its input and
> output are semantically fixed. That is, both input and output necessarily
> exhibit some structure, and such structure necessarily implies some particular
> understanding of the internal relationships of its constituent data and some
> epistemological perspective on that body of data as a whole. That is, the form
> of the data, together with such incidentals as what is included or not, conveys
> significant semantics. A 'fixed API', in exhibiting such a structure,
> necessarily conveys such semantics of its own. Those API semantics, however, are
> unlikely to be the 'native' semantic understanding which both sender and
> receiver bring to the data that is the substance of any specific exchange
> between them. In a sufficiently complex network, an agreed fixed API is unlikely
> to represent fully and accurately the semantic understanding of either party.
> This has immediate and devastating consequences for the scalability and
> adaptability of the system as a whole.
>
> If there is a single API across the entire semantic web, then as each node grows
> more specialized and the interactions among them more complex, an increasing
> percentage of each node's work will be wasted on conversions in and out of the
> API with each process it executes. The maintainer or designer of function
> implementation at each node will face a similarly increasing percentage of
> effort squandered in figuring out how to get from the data presented to the data
> required in building the ever more complex functionality required at each
> specialized node. This problem should look very familiar:  those of us who have
> been harnessing processes distributed across different enterprises, national
> practices and regulations, time zones, hardware and operating system platforms,
> and interchange protocols have already lived through three generations of
> proposed solutions just since 1980. Need I point out that it is just this
> problem which the semantic web proposes to solve with nodes that understand the
> substance of their interaction at a semantic, rather than a purely syntactic,
> level (all of it based on the underlying ability of markup to assure
> self-describing data)? Fine; then get on with it, but don't introduce the
> eventually fatal bottleneck of conversion in every case through a static
> intermediate level.
>
> It is just as futile to try solving this problem by creating a separate API for
> each class of node-to-node pair interactions. This is the flaw in the agreed
> vertical market data vocabularies (ESteel, FpML, etc.--more than 2000 of them
> when I last gave up trying to do an accurate count, as well as to discover
> whether even one of them operated from a different premise). To the extent that
> a particular node is truly expert--that is, that within the semantic web the
> introduction or application of its unique ontology facilitates a semantic
> outcome more elaborate, more nuanced, or in some other way more than the common
> denominator of the vertical market vocabulary--that node requires a richer
> semantic vocabulary to express the output of its process. To my mind, this is
> precisely why we use *extensible* markup as our basic syntax. For any node to
> make use of the particular expertise of another surely means using primarily
> what is unique to the output of that node. This means using as an input what is
> outside the standard API in the output of that node. So, in order both to have
> access to the particular expertise of other nodes, and also to avoid in the
> general case the proliferating waste of constant conversions into and out of
> standard APIs, why don't we try a different premise entirely:  that it is the
> responsibility of each node to instantiate for its own purposes (and therefore
> in its unique native semantics) whatever it might take as input from the unique
> output of another node.
>
> Not only does this put the solution to the problem in the right place
> philosophically, but as a practical matter it correctly factors the process
> design, implementation and maintenance tasks incumbent on the expertise of each
> node. The node's expectations for input data necessarily reflect its unique
> understanding of the process it is to perform, the unique ontological structure
> it instantiates. Part of what is unique about that node is the process by which
> it gets data from a form in currency outside of it into the unique internal form
> which directly expresses its epistemology. To share that form with an upstream
> data provider might well mean sharing, for example, the algorithmic processes by
> which that form is achieved, which may be the very raison d'etre of this node.
> Why undo the useful factoring of a problem which incorporates this node into its
> solution? It may also be that the data provider's understanding of its output is
> fundamentally at odds with this node's understanding of that same data as input.
> This touches directly upon the very nature of reuse. Surely one crucial premise
> of a semantic web is that different nodes may take the output product of any one
> and do vastly different things with it, predicated on entirely different notions
> of how, within their individual ontologies, the product of that node should be
> understood. This is why it is a semantic web and not a semantic pipeline, though
> the thread of any sequence of processes within it might be most easily
> understood as a unidimensional pipe. And, finally, the practical business of
> instantiating input data afresh to fit the ontology of each node just isn't that
> difficult in most cases. The node already knows what it needs, so the 'output'
> side of the 'transform' (both contentious terms here, I know) is already fixed.
> To the extent it is looking for some component of the input which is unique to
> the data source node, behavior which utilizes that unique element has already
> been built into this receiver and with it, presumably, some hints to identify
> what it is looking for. To the extent that it has some previous experience with
> this data source node, this receiver has previous familiarity with the forms of
> data which it has received and with how it has previously instantiated that data
> for its own purposes. Finally, even where input data may appear at first
> incomprehensible, it is often possible for the receiver to instantiate it by
> brute force into something usable. The receiver knows, after all, the data
> semantics which its own processes require. If by working the permutations of
> instantiating what it receives into possible versions of what it needs, and then
> by performing its unique processes upon one of those versions, the receiver is
> able to achieve a useful--for it--outcome, no one else has the standing to say
> that this was not a correct instantiation of the input data. The point is that
> in all these cases there is processing code--unique to each node--to be written,
> and nothing will magically obviate that chore. This is why I agreed so
> wholeheartedly with Jonathan Borden's analogy of the specification of an ontology
> to source code. If the 'ontological node' is going to *do* anything useful then
> yes, of course, executable code is what is required.
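
The brute-force instantiation described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the receiver's two required fields, the validity rules inside `process`, and the function names. The receiver knows what its own process needs, tries each assignment of incoming fields to those needs, and accepts the first candidate for which its process yields a usable outcome.

```python
from itertools import permutations

# Hypothetical receiver: its process needs exactly these two fields.
REQUIRED = ("quantity", "unit_price")

def process(record):
    """The node's own process. Returns a result, or None if the
    candidate input does not lead to a useful outcome for this node."""
    qty, price = record["quantity"], record["unit_price"]
    # Assumed rules: a positive whole-number quantity and a positive float price.
    if not (isinstance(qty, int) and qty > 0):
        return None
    if not (isinstance(price, float) and price > 0):
        return None
    return qty * price

def instantiate(incoming):
    """Try every assignment of incoming fields to the required fields and
    return the first (candidate, result) pair that the process accepts."""
    for keys in permutations(incoming, len(REQUIRED)):
        candidate = {need: incoming[key] for need, key in zip(REQUIRED, keys)}
        result = process(candidate)
        if result is not None:
            return candidate, result
    return None  # nothing usable could be made of this input

# The sender's field names are unknown to the receiver in advance.
print(instantiate({"px": 2.5, "n": 3, "memo": "rush"}))
```

Note that the only judge of a "correct" instantiation here is the receiver's own `process`, and that the code is specific to this one node, which is the chore the paragraph says cannot be avoided.
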
> >
> >
> > > Let us submit the same body of input simultaneously to various different
> > > diagnostic methodologies--each expressed as an ontology to which we can
> > > form a nexus at an addressable node--and, provided that we can retrieve or
> > > address the output of each, we can ignore the particulars of what happens
> > > in those opaque boxes.
> >
> > Works OK for short term data, but try looking at medical records over the 90
> > year life of a patient on this basis and you will run into problems. Even
> > Jonathon will admit that drugs get reclassified in their life-time. You need
> > to know the classification at the time they were administered, not the
> > classification today. Opium was de rigour in our grandparents time. Do you
> > want it adminstered to your grandchildren?
> As I hope I have very nearly exhaustively covered above (and I said that this
> would not be the version in full!--sorry), this is simply a task fulfilled by
> the proper factoring of the problem at each node, based upon the unique
> expertise of that node. In the example case, if a constellation of symptoms is
> submitted to multiple diagnostic nodes (good practice), there must then be a
> node which effectively proxies for this particular patient, or perhaps for this
> particular patient as treated under the philosophy of his trusted primary care
> physician, which then evaluates the various diagnoses and prescriptives from the
> unique perspective which none of the diagnostic nodes can have--that of the
> unique individual who will implement the next step of the process.
>
> Respectfully,
> Walter Perry
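
The factoring in the closing paragraph above (several opaque diagnostic nodes, then a node that proxies for the patient) might look roughly like the following. This is a sketch under stated assumptions only: the node functions, the `Diagnosis` record, the placeholder condition and drug names, and the exclusion rule are all hypothetical and are not taken from any system discussed in this thread.

```python
from typing import NamedTuple

class Diagnosis(NamedTuple):
    source: str      # which diagnostic node produced this
    condition: str
    treatment: str

# Hypothetical diagnostic nodes: opaque boxes from symptoms to a Diagnosis.
def node_a(symptoms):
    return Diagnosis("node_a", "condition-x", "drug-1")

def node_b(symptoms):
    return Diagnosis("node_b", "condition-y", "drug-2")

def patient_proxy(symptoms, nodes, excluded_treatments):
    """Submit the same symptoms to every diagnostic node, then evaluate the
    results from the patient's own perspective. Here that perspective is
    reduced to a set of treatments this patient will not accept."""
    diagnoses = [node(symptoms) for node in nodes]
    return [d for d in diagnoses if d.treatment not in excluded_treatments]

# Placeholder data: this patient excludes "drug-1".
print(patient_proxy({"fever"}, [node_a, node_b], {"drug-1"}))
```

The diagnostic nodes know nothing about the patient's constraints; only the proxy does, which is where the evaluation is placed in the paragraph above.
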

