I think we are running down a rabbit trail here, but one
reply, then back to work on this code that someone
else wrote and that I have to understand before I adapt
it, so thanks to the developer for commented code.
From: W. E. Perry [mailto:wperry@fiduciary.com]
"Bullard, Claude L (Len)" wrote:
>> Discovery by output has the aggravation that the output may not be regular from instance
>> to instance.
>Which, if it is the case, is already a feature which the consuming application has to deal
>with anyway.
Yes, but if it is regular within the range of what the schema describes, it can be handled.
That is why we have transforms. On the other hand, having to discover an irregularity at
runtime leaves me dead in the water or flopping on the surface waiting to asphyxiate.
>And, yes, this does happen in the real world (it is true in some months of as
>much as 20% of the data which my applications must consume) and it is a problem utterly
>incapable of solution by schema, or in the general case by any fixed a priori agreements.
Ok, what kinds of errors are those?
> Or one could say that one should be looking at the output of another process that defines
> the output of the others.
>Why add this layer of indirection when we already understand that we are not relying on the
>authority of a priori agreements, and a third party which merely makes empirical
>observation of the behavior of processes is not doing anything which an XML-consuming
>application could not, and should not, do for itself.
Hmm? We may be relying on a priori agreements. I have yet to see an XML-consuming
application that can make sense of a printed form without human analysis. Somewhere
in the pipeline, someone sat down and made a data model. It may be that its expression
is in the code, but keeping it there and not documenting it is the bad old days of
stovepiped proprietary apps come back to life.
That is quite different from having a data dictionary for the message type information,
even if on receipt and recognition, it gets transformed by some means into the types
required by the local processor. That is just basic loose coupling.
>> This is something like the process of analyzing print documents to create schemata for
>> them. One has to be sure the set is exhaustive or one comes back to rework.
>Perhaps so, if we were analyzing them in order to derive generalized schemata, but I
>propose that this is emphatically not what we are doing.
Umm... what is UBL, then?
>Instead, the XML-consuming process
>is examining an instance document to determine whether in that particular instance, or in
>some other document which could be located and fetched based on the information in that
>instance, is there what the XML-consuming application requires to instantiate some
>particular bit of data which it requires for one instance execution of its processing.
Ok. Sure. What is in that other document that is located and fetched? I agree that
for WFed transactions, it can (if a DOCTYPE or schemaLocation or, hey, what is that
namespace URI for) go ask Schrödinger's Cat and avoid a lot of problems. But analysis
by inspection to find errors past WFness is hairy stuff. I don't want the network
buggered by lots of little applications corresponding to debug a file. Seems wasteful.
My claim is that most transactions go in one direction and that things are more
efficient if the contents can be configured a priori. Saves time; saves bandwidth.
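As a sketch of what I mean by "go ask": if the instance carries an xsi:schemaLocation
hint (or a namespace URI that actually resolves to something), the consumer can fetch
the schema and validate before attempting anything cleverer. This assumes Python with
lxml, and the file names are made up:

    from lxml import etree

    XSI = "http://www.w3.org/2001/XMLSchema-instance"

    doc = etree.parse("invoice.xml")                      # hypothetical instance
    hint = doc.getroot().get("{%s}schemaLocation" % XSI)
    if hint:
        # xsi:schemaLocation pairs a namespace URI with a schema location
        namespace_uri, schema_url = hint.split()[:2]
        schema = etree.XMLSchema(etree.parse(schema_url))
        print("valid" if schema.validate(doc) else "invalid")
    else:
        print("no schema hint; inspection past WFness it is")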
> As said in the past, even when schemata aren't used in the communication pipeline, in
> the design pipeline, they can be invaluable. If as an output of that process, one gets a
> definition which is exhaustive, that is, any transaction will be valid under it, then it
> is better to read the schema.
>The design of an application is in large part the design of its appropriate internal data
>structure.
True, but is the internal structure what we send down the pipes? It may be heavily optimized
for local efficiency, not general representations. I don't send ints down the pipe. I
don't send my name down the pipe; I might send <int>1</int> or <myChristianName>Claude</myChristianName>.
UBL won't like that last one, but I do, so a UBL processor needs a translator and it may
not want to invoke AI or topic mapping to discover that. Different strokes.
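For the sake of illustration only (the element names here are mine, not UBL's), that
translator can be as small as an identity transform plus one rename rule; this assumes
Python with lxml:

    from lxml import etree

    xslt = etree.XML("""
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- rename my local element to whatever the wire vocabulary wants -->
      <xsl:template match="myChristianName">
        <GivenName><xsl:value-of select="."/></GivenName>
      </xsl:template>
      <!-- copy everything else through untouched -->
      <xsl:template match="@*|node()">
        <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
    """)

    transform = etree.XSLT(xslt)
    doc = etree.XML("<party><myChristianName>Claude</myChristianName></party>")
    print(transform(doc))   # roughly: <party><GivenName>Claude</GivenName></party>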
>If we are talking about the schema for that structure, then that is emphatically
>not the schema which might be considered to govern the form of document interchange between
>processes and thereby to enforce a 'contract' at the input or invocation interface side of
>an application.
It absolutely doesn't govern your internal structure. It governs what is sent so we
don't all have to use your structure before we can send a file. We do govern the output,
true.
>Indeed, my primary point is that we cannot rely on any ostensible
>input-side schema, even when all observed input instances conform to it, if our design goal
>is to build an application predicated on the best implementation of its own expertise,
>including a necessary expertise in input data collection and instantiation.
An input-side schema is a boon to the author. Even there, the system may be transforming
it into an internal structure (in fact, it always is). If the input schema is the governing
document, it is transforming it coming out (in fact, it always is). XML is a convenient
means to serialize, then work out amongst partners how to use the rest of the framework
to optimize the business/enterprise/whatever processes. It is useful to point to a
schema in an RFP, and quite useless to do as some RFPs do and simply say, "XML Required".
> I say where namespaces are used, there should be a document somewhere but I prefer such
> documents for any XML output.
> If the http: URI is used, it should, in good conscience, point to a document should
> inspection ever be required.
>Why? What can it possibly give you except an accurate description of an instance which you
>could get directly?
It can have human readable descriptions. It can tell me all the variations to expect.
It can enable me to quickly work up a debugging test rather than pick up the telephone
and try to track down the author, who also may be out-of-date, but oh well.
>And you run the considerable risk that the schematic is out-of-date or
>simply doesn't describe an exceptional instance which is the thing of interest to you.
Ah, but that is exactly why I want it. It can tell me where conformance fails and yes,
it may be that I haven't received the last memo about version changes. It's a start.
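That "start" can be mechanical: run the suspect instance through a validator built from
whatever schema I do have, stale or not, and read the error log. A minimal sketch,
assuming Python with lxml and hypothetical file names:

    from lxml import etree

    schema = etree.XMLSchema(etree.parse("message.xsd"))
    doc = etree.parse("suspect-instance.xml")

    if not schema.validate(doc):
        for err in schema.error_log:
            # line number plus the validator's reason for each failure
            print(err.line, err.message)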
>> Sure, there are XML instances without very long lifecycles or when opened, are glaringly
>> obvious as to what is intended. Every situation has exceptions. To me, the spec is fine
>> as is by not requiring such, but we are better off by the practice. Humans still have
>> some decisions to make.
>Indeed they do, but in my opinion they should be basing those decisions on the data
>instances which they have actually got rather than upon some authority which purports to
>govern such instances. (This gulf between revelatio and auctoritas is certainly nothing
>new).
I accept authority if it is doing something useful for me; I accept
revelation if it teaches me something new. The governing document
is quite useful when what the instance reveals to me
is not new or useful.
len