- From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
- To: "Steven R. Newcomb" <srn@coolheads.com>
- Date: Wed, 25 Oct 2000 16:09:14 -0500
I deleted the schema comments address as this is probably
off topic for that list. I deleted elharo's address as
he probably gets XML-DEV. I leave Steve's address in since,
like me, he probably likes lead time to formulate replies.
From: Steven R. Newcomb [mailto:srn@coolheads.com]
>Consider the interchange form (an XML message) of a purchase order.
>It would be a bad idea to include a total amount to be paid, since an
>explicit total would be redundant.
Right. One of the first rules of database schema design is to
avoid creating fields for information that can be derived by
calculation, so the same principle applies here. Given that, there
is at least one point of similarity, and perhaps more, to contribute
to the category name, schema.
>If somebody tweaked the
>interchange form, the total would be inconsistent with the rest of the
>message, and there would probably be no easy way for the recipient to
>determine which information is invalid. In general, then, it's a bad
>idea to include redundant information in interchange messages, not
>just because it uses bandwidth, but more importantly because it is
>very likely to cause ambiguity.
Yes. However, let's change that to a price list. For that I must
send the pricing information. Do I store that? Perhaps, but more
likely, I compute it Just In Time and send it. However, once
sent, the price is now part of the bid cycle and must persist until
that cycle and any dependent cycle closes. Because this may have
temporal issues (e.g., the process of evaluating the bid is suspended
while someone is on sick leave), the cycle is suspended,
the state "dehydrated" and persisted, then on restart "rehydrated"
with the same information. This is the principle of long transactions.
So "ready to run" is dependent on the goal of the process, and that
goal is supported by a means to create a data object with state.
Whether I need a schema to express that is arguable, but
given the need for some process to know on rehydration that the
price is an integer, a real, or a currency (perhaps indicating a currency
conversion rule must be called), the schema is useful.
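The dehydrate/rehydrate pattern above can be sketched roughly as follows. This is an illustrative sketch only; the schema, field names, and functions are invented for the example. The point is that the schema's type information is what tells the restarting process how to interpret a persisted value (and whether, say, a currency-conversion rule must be called):

```python
# Toy sketch of the "long transaction" pattern: a suspended bid cycle's
# state is dehydrated (persisted as characters), then rehydrated later
# using the schema's declared types. All names here are hypothetical.

import json
from decimal import Decimal

# A toy "schema": field name -> declared type
BID_SCHEMA = {"item": "string", "price": "currency", "quantity": "integer"}

def dehydrate(state: dict) -> str:
    """Persist the suspended cycle's state as a character sequence."""
    return json.dumps({k: str(v) for k, v in state.items()})

def rehydrate(blob: str) -> dict:
    """Restore state, using the schema to recover each field's type."""
    raw = json.loads(blob)
    restored = {}
    for field, text in raw.items():
        kind = BID_SCHEMA.get(field, "string")
        if kind == "integer":
            restored[field] = int(text)
        elif kind == "currency":
            # a currency type might also signal that a conversion
            # rule must be invoked on restart
            restored[field] = Decimal(text)
        else:
            restored[field] = text
    return restored

suspended = dehydrate({"item": "widget", "price": Decimal("19.95"), "quantity": 3})
resumed = rehydrate(suspended)
# resumed carries the same information, with types recovered from the schema
```

Without the schema, the rehydrating process would see only strings and could not know whether "19.95" is a real, a currency, or free text.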
>Consider the form of the purchase order when it is "ready to run" --
>when an API is provided to the information it contains. It's very
>reasonable to provide such an API with a "total()" method. Redundancy
>in APIs is good; APIs are supposed to be convenient to use. total()
>gives access to an "emergent property" (as opposed to an explicit
>syntactic property) of the information set found in purchase orders.
>Of course, while total() makes sense for purchase orders, it doesn't
>apply to many other kinds of XML messages.
Yes. But let's stop and look at the term "emergent property."
Using the definition from http://www.goertzel.org/papers/catpap.html:
"One may say that a process is emergent between X and Y to the extent
that it is a pattern in the "union" of X and Y but not in either X or Y
individually. Consider for instance the part of this sentence preceding
the semicolon, and the part of this sentence following the semicolon;
the meaning of the sentence does not exist in either part alone, but
only emergently in the juxtaposition of the two parts."
A total emerges because there are at least two prices. That we
apply total() at some point in a process depends on the goal of that
process.
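As a minimal sketch of that point (class and field names are mine, not from any standard): the total is never stored in the interchanged purchase order; it emerges from the union of the line-item prices, computed just in time when the API is asked for it.

```python
# total() as an "emergent property": derived from at least two prices,
# redundant in the interchange message but convenient in the API.

from decimal import Decimal

class PurchaseOrder:
    def __init__(self, lines):
        # each line: (description, unit_price, quantity)
        self.lines = lines

    def total(self) -> Decimal:
        """Derived, not stored: nothing redundant was interchanged."""
        return sum((price * qty for _, price, qty in self.lines), Decimal("0"))

po = PurchaseOrder([("bolt", Decimal("0.10"), 100), ("nut", Decimal("0.05"), 100)])
amount = po.total()  # emerges from the juxtaposition of the two line items
```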
I'm snipping the grove example because I think it is understood
herein. Let me ask: if I can take a schema and by transformation
create the API (standard practice given virtual methods),
do these have the same grove? If so, that grove can exist in the
abstract, but other than creating a mapping from the API syntactical
form to the XML data object form, and adding the method names (e.g.,
get and set), I have only mapped invariant names. For the grove
to be more than that, it must also express some semantic about
the handling, and that is stored in the transform script. So are
get and set part of the separate grove you mention, the semantic grove
that expresses the intent of the transformation?
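The transformation in question can be sketched in miniature: generate an API from a schema by mapping each declared field to get/set accessors. Everything here is hypothetical naming; the sketch shows that the generator carries only the invariant names, while any handling semantics live in the generator (the "transform script"), not in the schema itself.

```python
# A toy schema-to-API transformation: the generated class exposes
# get_<field> and set_<field> for each schema field. Only invariant
# names are mapped; the semantics of handling live in this script.

def api_from_schema(fields):
    """Generate a class whose accessors are derived from the schema's field names."""
    def __init__(self):
        self._data = {}

    namespace = {"__init__": __init__}
    for name in fields:
        # default-argument binding captures each field name correctly
        namespace["get_" + name] = lambda self, n=name: self._data.get(n)
        namespace["set_" + name] = lambda self, value, n=name: self._data.__setitem__(n, value)
    return type("GeneratedAPI", (), namespace)

PriceAPI = api_from_schema(["price", "quantity"])
obj = PriceAPI()
obj.set_price(42)
value = obj.get_price()  # the same invariant name, mapped syntactically
```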
>I find the example of topic map processing much more compelling. The
>syntactic components of a topic map document are not, and they can
>never be, fully indicative of their own significance. They can only
>be fully understood in terms of their connections to many other things
>whose syntactic whereabouts are necessarily arbitrary.
Let me take Piaget as an example. Given a learning system, there are
multiple layers of meaning (semantics) whose interconnections, once
understood, enable higher-level operations. One might build a
set of network layers (semantic layers)
in which acquiring an understanding of each lower layer
enables understanding of the connections above it (putting
aside reliable addressing for the moment as an implementation
consideration).
From the Thompson paper
http://www.ph.surrey.ac.uk/~phs1it/papers/layer7.htm

Network Relations of:
5 - meta-theories, paradigms
4 - plans, models, formalisms
3 - classes, series, numbers
2 - events, single relations, sentences
1 - objects
0 - images, motor movements
>The *whole* topic map document must be understood -- processed -- before the
>significance of any of it can be fully and reliably understood.
I don't quite understand that. It is as if I have to
read the encyclopedia before I can say "Look!". It seems
to me that groves or semantic nets would be created in
layers such as shown above (a Piaget child learning model)
enabling different component/APIs to service each layer.
>The reason you create topic map documents is to allow these
>semantic net-like things to be interchanged and merged with one
>another by their end users and by people who wish to add
>more value to them in various ways.
Yes and a standard for creating these is needed to take advantage
of the network effect.
>The nets don't and can't resemble the interchange documents, because of their
>own very highly interconnected and interdependent nature, and because
>of the fact that the nature of an interchangeable document is quite
>different from that of a semantic net. An interchangeable document is
>nothing more or less than a sequence of characters.
Yes, and this is noted in the literature on connectionist vs. rules-based
inferencing. In effect, rules are needed to make connections
among the layers so that a change can transit a layer (well-performed).
This is an observation model, not just declarative. That is, I
instrument the layer beneath, watch for changes, and on recognizing a
change based on a known pattern, I change my own layer. That is
why control layers are said to be emergent. Your example of the
price total is a good one. I may, in fact, never watch for changes
but only schedule a JIT process named total(). The interesting
case is one in which I get a change out of range for some
value. This usually indicates a hidden coupler and the need
for a remediation transaction (try on error).
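That observation model can be sketched in a few lines. This is an illustrative sketch with invented names: a watcher instruments the layer beneath, passes in-range changes through to its own layer, and hands out-of-range changes (the hidden-coupler case) to a remediation step.

```python
# Sketch of the observation model: accept changes matching the known
# pattern (a value range), route out-of-range changes to remediation.

def watch(changes, low, high, remediate):
    """Propagate in-range changes; send out-of-range ones to remediation."""
    accepted = []
    for value in changes:
        if low <= value <= high:
            accepted.append(value)   # known pattern: update our own layer
        else:
            remediate(value)         # "try on error": remediation transaction
    return accepted

flagged = []
ok = watch([10, 12, 999, 11], low=0, high=100, remediate=flagged.append)
# ok holds the well-performed changes; flagged holds the out-of-range one
```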
However, I am concerned that topic maps seem to require such
completeness that they cannot be considered modular. How does one
reuse "highly interconnected and interdependent" information
unless rules can be applied to create different contexts?
Topic maps seem to be very Platonic, and rules very Aristotelian.
Don't these need to be combined to use topic maps for inferencing?
>So, I repeat what I said in my earlier note:
>There is this common wisdom out there that the structure of
>interchanged information should also be, in effect, the API to that
>same information. But, in fact, it's only true for a simple subset
>of the kinds of information that need to be interchanged, and to
>which APIs must be provided.
> Len, does this speak to what you were saying about layered systems?
Yes I believe it does. Thank you, Doctor! Does the Piagetian
model resonate with your understanding of the application of
semantic nets to learning systems in which patterns can emerge?
len