- From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
- To: Martin Bryan <mtbryan@sgml.u-net.com>, xml-dev@xml.org
- Date: Fri, 27 Oct 2000 09:41:14 -0500
Hi Martin:
I read your document yesterday and what I can discern
from one reading is that it is excellent, but like
so many meta-level models, hard to apply on the
first pass. More examples, please.
Topic maps, to me, have always appeared to be very fancy
XLink aggregates that are easily implemented as
treeviews. In a sense, they do the job that a
relational programmer might implement with functions on a
treeview object that enable them to query master
and child tables JIT. Nodes is nodes. That is not a
pejorative comment, but one to point out that how
something is represented and how it is implemented can diverge
as long as the authoritative properties remain
invariant. I find I want to compare this to
Dr. Newcomb's grove approach. Both approaches
lend themselves to the layered architecture and
while we may describe layers as above and below,
they are actually heterarchies where there are
relationships within (domain) and between (ecotonal).
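
To go back to the treeview point, here is roughly the kind of XLink
aggregate I have in mind; the element names and hrefs are made up, and only
the xlink: attributes come from the spec. The topic plays the master row,
the occurrences play the child rows, and the arc is the join a treeview
would walk JIT:

<hydrants xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <topic xlink:type="resource" xlink:label="master">Fire Hydrants</topic>
  <occurrence xlink:type="locator" xlink:label="child"
              xlink:href="maintenance.xml#hydrants"/>
  <occurrence xlink:type="locator" xlink:label="child"
              xlink:href="inventory.xml#hydrants"/>
  <hasOccurrence xlink:type="arc" xlink:from="master" xlink:to="child"/>
</hydrants>

A treeview just has to resolve the arc to populate the child nodes on
demand; nothing in that depends on how the aggregate is stored.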
>I prefer Thompson's paradigm <-> model <-> classes <-> relations <->
>objects <-> images view.
Same here. Clearly, we could model these all declaratively
so the connectionist model holds in the abstract. In practice,
we need methods to make changes and the API is required. I
am still mystified why APIs seem to make the grove designer
unhappy. I would suggest (could be dead wrong) that the
API is yetAnotherGrove providing the control properties.
>From the XML Schema point of view this would seem to equate to
>Schema Metamodel <-> schema <-> elements <-> substitution group <->
>complex type <-> simple type
>I tried to do the same for ebXML, where I came up with
>Business Process Metamodel <-> Business Process <-> Business Document <->
>Functional Unit <-> Aggregate Core Component <-> Basic Core Component
>Not absolutely convinced this is the best representation for the ebXML
>material, but it looks like a starting point.
Ok so far. In the Hatley-Pirbhai CASE model I adapted for the
DTD in Beyond the Book Metaphor, a Process Specification (pspec) metamodel
is used to describe a nested hierarchy of processes. However, a Control
Specification (cspec) metamodel is also created. If we follow
the concept that controls emerge from process relationships, then
the model you show for ebXML would be followed by the control
specification. When I look at the XLang design, that appears to
be a simplified form of the Control Spec. One would route to
discoverable interfaces and that routing might be described
as the human-designed inter-layer relationships.
>> Note the emphasis or recognition of patterns to properly
>> name new instances, and the concern of the author that
>> most approaches have emphasized relations within a level
>> over relations between levels.
>What I found particularly interesting is that the relations between levels
>are best expressed as parallel operations, whereas the internal ones were
>more traditional arcs, which I consider to be serial in nature.
Certainly, and that matches the nested hierarchy with controls well. In
Beyond the Book, I described the view dimensions to enable the levels of
operations or data flow to be linear (binding order dictates each step
must be followed in sequence to be well-performed) or non-linear (process
can be reordered opportunistically). A view dimension might better
be described now in terms of process encapsulation, but the idea was
to convey that at each level or view, the details of a lower level's
performance are hidden. This is how a Work Breakdown Structure sees
tasks and it enforces the protocol of hierarchy of command. It
was also important to describe which events could be viewed between
dimensions to enable a control protocol to emerge which remained
stable in the face of processes that may be adaptive in process
time within a layer. IOW, object encapsulation: I don't care
how you did it; I care if what you send is what I asked for,
and in some cases where temporal sensitivity is an issue, when
I ask for it.
>Look at the XML Schema for ISO 13250 Topic Maps that I published last night
>at http://www.diffuse.org/TopicMaps/schema.html. This illustrates nicely the
>application of the XML Schema version of the model that I gave above.
Good reading and I thank you.
>David Megginson is convinced that he can do the same with RDF, but I still
>question the efficiency with which you can use RDF to describe parallel
>relationships (which is what topics actually describe).
I leave that to David to defend. :-) I find topic maps easier to
understand but that may just be that I've spent more time looking
at the independent links of HyTime that became the extended links
of XLink. I used the nascent ilink in BTBM but didn't understand
it then as well as I might have. It's been ten years, three employers
and a lot of committees since then.
>>My intuition
>> is that just as these authors describe, RDF semantic models
>> must be able to be layered and relations between levels
>> are critical for compressibility to work well.
>Agreed. The key thing to look at, as far as I am concerned, is the purpose
>of describing the relationships.
Precisely. One needs to know the mission of the next layer and, in fact,
that may mean different sets of controls that can access the same resource
(which XLink enables perfectly).
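
A rough sketch of what I mean, made-up element names again around the
XLink attributes: one resource, two different controls, and a separate arc
for each, so different layers can bring different control sets to the same
resource:

<controlSet xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <target xlink:type="locator" xlink:label="doc" xlink:href="permit.xml"/>
  <control xlink:type="resource" xlink:label="review">review</control>
  <control xlink:type="resource" xlink:label="approve">approve</control>
  <applies xlink:type="arc" xlink:from="review" xlink:to="doc"/>
  <applies xlink:type="arc" xlink:from="approve" xlink:to="doc"/>
</controlSet>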
>All the RDF examples I've seen are
>concerned with assigning metadata to a single resource, not describing the
>relationships between sets of resources. In library terms RDF is like
>writing out a single catalogue card describing a particular resource. Topic
>maps are more like building a subject catalogue from a set of catalogue
>cards. Topics identify a set of resources that share common
>characteristics.
Again, we seem to come back to library metaphors. These are ok but perhaps
limiting in exploring relationships. When I think of a relationship, I may
name it, but I think in terms of an interface. As I told Charles, every place
I see that link, I find myself wanting to put a function name there.
A topic might be the index on the front of the box holding the cards,
but I also need to describe the librarian's functions.
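
For instance (the names and the URI are invented), XLink's arcrole is about
as close as I can get to writing the librarian's function on the link
itself:

<catalogue xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <card xlink:type="locator" xlink:label="card" xlink:href="cards/card042.xml"/>
  <patron xlink:type="resource" xlink:label="patron">Reader</patron>
  <checkOut xlink:type="arc" xlink:from="patron" xlink:to="card"
            xlink:arcrole="http://example.org/library#checkOut"/>
</catalogue>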
>> By compressibility,
>> note the use of automated classification techniques to
>> create higher level representations that are more efficient
>> and the enabling of higher level operations based on the
>> outputs of the lower level processes.
>How this can be generalized in XML terms I am at present unclear. XPath
>would not seem to help, but it might be possible to do this by means of
>XSLT processes associated with XML Schema substitution groups.
>Unfortunately XSLT is not really built for parallel processing.
Yes, XSLT is what I had in mind too. The literature is clear in citing
transformations. However, compressibility refers to making a simpler
pattern work for the more complex pattern (and in the simple case
of downtranslation, it works just fine).
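
A trivial downtranslation of the kind I mean, with invented element names;
the richer patterns collapse onto the simpler one and everything else
copies through untouched:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- complex patterns collapse to the simple one; their detail is discarded -->
  <xsl:template match="warning | caution | note">
    <para><xsl:apply-templates/></para>
  </xsl:template>
  <!-- identity copy for everything else -->
  <xsl:template match="@* | node()">
    <xsl:copy><xsl:apply-templates select="@* | node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>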
The NeoCore guys are claiming XML is perfect for their engine, so
maybe more study of their techniques can illuminate this.
They scan the data, find the patterns, then iconize
them. Is the substitution group a kind of icon? Your description below
suggests to me that it is. It becomes a symbol, sort of a hieroglyph, that
a trained agent can read. Then perhaps the iconization is a
process by which the pattern is discovered, the icon is created, and
an agent is created to handle it. This seems to parallel the process
by which one creates data objects and schema, then transforms these
to virtual interfaces for which an object is created with methods to
handle the data object. Therefore, automated data mining detects a
pattern, creates a schema, generates an interface and class for
binding data matching the pattern, and iconizes that as a namespace URI.
For more complex interlayer relationships, it iconizes as a topic map.
That seems to be what SQLServer2000 could provide with OLAP techniques.
I am way out on a limb here, so comments from the MS designers would
be appreciated. I wonder if OLAP can generate topic maps.
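
Going back to the substitution-group-as-icon question, in schema terms I
picture something like this (all names invented): one abstract head element
stands for the discovered pattern, and the concrete members are what the
trained agent actually reads wherever the head is allowed:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- the "icon": an abstract head standing for the discovered pattern -->
  <xs:element name="CoreComponent" abstract="true"/>
  <!-- members an agent can substitute wherever the head is allowed -->
  <xs:element name="Address" type="xs:string" substitutionGroup="CoreComponent"/>
  <xs:element name="PartyName" type="xs:string" substitutionGroup="CoreComponent"/>
  <!-- a document model that only ever names the icon -->
  <xs:element name="BusinessDocument">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="CoreComponent" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The document model only names the icon; the member list can grow without
the agent having to see the document model change.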
>> If we are to use semantic networks well, QOS issues become
>> germane quickly. A user of a high-level representation
>> should not have to rely on deep knowledge of the network
>> to wire it into an advanced workflow. On the other hand,
>> unless the QOS numbers indicate a highly reliable component
>> operating on an authoritative, credible model, this is a
>> dicey thing to try without lots of testing. Exhaustive
>> testing seems unreasonable to require, but are there
>> alternatives other than "ship it and wait for customer
>> feedback".
>This is where reusable core components fit into my view of the overall
>picture. If we have a reusable abstract component then users can apply
>this, or restrictions or extensions thereof, to particular objects (XML
>elements), which can then be treated as members of a substitution group
>that will be understood by a knowledge agent that is keyed to the relevant
>type model.
I agree. The only difficulty is getting folks to adopt pre-existing
models and use them. I see these problems every day:
o Managers who refuse to believe that extra-company/industry agreements can
be made or will cohere for long enough to be useful. It is the
most common objection made to the use of registered definitions.
Weirdly, these same people see no problem with the notion that
they are wall-to-wall Microsoft shops and are usually doing business
with other companies who made the same choice.
o Programmers who create their own parsers for XML without understanding
that, while they have the skill, we end up having to test the heck out of
their code at considerable cost for every version, whereas, if they
would use a common service, we have not only our experience but the
aggregate experience of potentially millions of users to verify that the
component is well-behaved.
I understand competitive relationships and tweaking for performance, but
unfortunately, the reason is usually ego or unwillingness to understand
the component. Human-In-The-Loop. It will be interesting to see if
the companies that continue to espouse complexity as a barrier to
competition and performance differentiation will be able to stay
in business. As someone noted, the web is not about competition first;
it is about sustaining alliances, then competition.
>This "functional unit" has known properties, but different representations,
>and local "tweaks". The user only needs to know the function within a
>particular interchange mechanism (business document) for a specific
process,
>not its internal representation.
Which, if we go back to the CALS Wars, is how what was 38784 (the prime TM
specification) became 28001 (1,000 pages of DTD). It works, but it became
very hard to apply.
The problem is upward aggregation into named monoliths which are then
cited without concern for local performance. We certainly need the testing
at each level and must be able to decompose and take out a well-encapsulated
piece. Both ends of the architecture are problematic.
Then we only have to cope with DLLHell. :-)
Len Bullard
Intergraph Public Safety
clbullar@ingr.com
http://www.mp3.com/LenBullard
Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h