- To: 'Mike Champion' <mc@xegesis.org>, xml-dev <xml-dev@lists.xml.org>
- Subject: RE: [xml-dev] XML/XQuery Usage Assumptions (was Re: [xml-dev] XSLT 2.0 / XPath 2.0 - Assumptions)
- From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
- Date: Mon, 13 May 2002 15:57:07 -0500
That's a good description of what you can do until you
get a contract. Once you have that, some overhead
can go away.
If one ever did Mil-Std-38784 type processing, one
remembers that the first six months of the project
consisted of long phone calls, faxes, memos, and
plane rides to whittle the explosion of document
information types down to the set to be delivered.
DTDs shortened the negotiation
considerably, sometimes to a single phone call.
That was SGML. It cut the costs of sorting out
the paper-based specification options and gave
us a way to check, on send and receive, whether those
agreements were being followed. Once they were, DTD
validation faded into the background, adding just a bit
of time to the process. However, if it woke up,
we paid attention quick. It wasn't much help
if the logisticians got their figures wrong, because
a DTD is clueless about the values between the brackets.
Discovery-based systems are flexible and expensive.
Fixed systems that have few options are sometimes not
applicable, but easy to discover. I don't want
to play discovery with an order to give a patient
shock therapy when the system can't figure out whether
the amperage to be used is expressed as an integer or as floating point.
Executing at the terminal and terminal execution
ain't the same process.
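To make that concrete, here's a minimal sketch (Python with
lxml; the amperage element and its values are invented stand-ins,
not from any real order format). The DTD blesses a value no
device could use, while a datatype-aware schema catches it:

    from io import StringIO
    from lxml import etree

    # Sketch only: element name and value invented for illustration.
    doc = etree.fromstring("<amperage>twelve point five</amperage>")

    # DTD: structure only -- clueless about the values between the brackets.
    dtd = etree.DTD(StringIO("<!ELEMENT amperage (#PCDATA)>"))
    print(dtd.validate(doc))    # True: the DTD sees nothing wrong

    # W3C XML Schema: the same value fails a datatype check.
    xsd = etree.XMLSchema(etree.fromstring(
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
        '<xs:element name="amperage" type="xs:decimal"/>'
        '</xs:schema>'))
    print(xsd.validate(doc))    # False: "twelve point five" is not a decimal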
That said, the GAO article is a shot across the
bow of the "frictionless" pundits in support of
grease. The message should be: commit to XML, but stop
and check the citations. Anyone who doesn't have
the good sense to read the DTDs or Schemas and
check their own sources probably shouldn't be in
the procurement business. I like the article if
it adds caution to the newbie's enthusiasm. I spend
too much time having to explain these things to
people anxious to market.
len
-----Original Message-----
From: Mike Champion [mailto:mc@xegesis.org]
This illustrates a rather profound difference in our basic conceptions of what
XML is all about, I'll guess.
<plug>For a change, I *am* more or less singing out of my
employer's hymnbook, having spent much of last Tuesday helping write the
press release for our XML Mediator product at
http://www.softwareagusa.com/news/releases/usa/2002/information.htm </plug>
SGML is all about imposing a "correct" document definition that both producers
and consumers of documents must agree to, or nothing useful happens. XML
supports this conception, of course, with DTDs and various flavors of schemas,
but does not insist on it. XML is also a very generic data meta-format, allowing
essentially any information to be encoded by a producer, and an interpretation
of the data to be discovered by a recipient. The markup can be thought of as
hints from the producer as to what the information "means", not just a
contract defining a shared understanding of the meaning.
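To make the two conceptions concrete, here's a minimal sketch (Python with
lxml; the order document and its DTD are invented for illustration). The same
well-formed document can be checked against a contract when one exists, or
simply walked as a set of producer hints when none does:

    from io import StringIO
    from lxml import etree

    # Sketch only: document and DTD invented for illustration.
    doc = etree.fromstring(
        "<order><ship-to>12 Elm St</ship-to><qty>3</qty></order>")

    # Contract conception: both sides agreed on a DTD, so validate.
    dtd = etree.DTD(StringIO("<!ELEMENT order (ship-to, qty)>"
                             "<!ELEMENT ship-to (#PCDATA)>"
                             "<!ELEMENT qty (#PCDATA)>"))
    print(dtd.validate(doc))    # True -- the contract is honored

    # Hint conception: no agreement needed; the markup itself
    # guides the recipient's interpretation.
    for elem in doc.iter():
        print(elem.tag, (elem.text or "").strip())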
Sure, this is controversial and we debate it on xml-dev every couple
of months or so, but consider the implications of the
"no contract, no information exchange"
position: The GAO report implying that the government should go slow on
XML deployment until standard schemas are defined. Or the various analyst/pundit
complaints that XML is a "Tower of Babel" that companies shouldn't take
too seriously until their industries define a standard XML format, or an
elaborate schema repository / discovery system where the "correct"
interpretation of the data can be determined dynamically. By this reasoning,
government and industry shouldn't have adopted fax machines until standards
for paper forms were produced!
This never happened in the world of paper and facsimiles of paper, and it will
never happen in the XML world either. [my prediction, I owe any current readers
a beer if it ever happens!] In the memorable words of one of our marketing
people, "diversity is pervasive and persistent." XML, XPath, XSLT, and XQuery
allow relatively simple, non-AI software to
approach the problem of making sense out of diverse business documents
in much the same way as human clerks have for centuries -- DISCOVER relevant data
in the input (assisted by "common sense" and pattern recognition in the
case of humans, assisted by markup in the case of XML), then TRANSCRIBE it
into a form more amenable to further processing.
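A discover-and-transcribe pass can be as short as this sketch (Python with
XPath via lxml; both producer formats and the canonical target element are
invented). One XPath expression finds a customer id wherever each producer
happened to put it, and the result is transcribed into a single canonical
shape:

    from lxml import etree

    # Sketch only: two invented producers, two shapes for the same datum.
    docs = [
        etree.fromstring("<po><custNumber>387669876</custNumber></po>"),
        etree.fromstring("<order><party role='customer'>"
                         "<id>387-66-9876</id></party></order>"),
    ]

    for doc in docs:
        # DISCOVER: any element whose name or role hints at a customer id.
        hit = doc.xpath("(//*[contains(local-name(), 'cust')"
                        " or @role='customer'])[1]")[0]
        raw = "".join(hit.itertext()).strip()
        # TRANSCRIBE: one canonical element, punctuation normalized away.
        out = etree.Element("customer-id")
        out.text = raw.replace("-", "")
        print(etree.tostring(out))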
In this conception, the idea is not to validate that the data is being sent
"correctly", it's to recognize the data that is necessary to carry on the
business process. Sure, some xsi:type attributes, or -- &deity; willing --
some agreed-upon schema can help the cause, and XPath/XQuery should support that
when available. But people have been figuring out whether "05/10"
meant "May 10" or "5 October", or whether "387-66-9876" and 387669876 are
referring to the same person -- or at least "throwing an exception"
and seeking clarification from a human -- for some time now.
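That clerk logic is easy to sketch (Python; the two-part date format and the
raise-and-ask-a-human policy are assumptions for illustration). Accept a
"05/10"-style value only when one reading is possible, and throw otherwise:

    from datetime import date

    def parse_slash_date(s, year=2002):
        """Read 'a/b' as a date; raise when both readings are plausible."""
        a, b = (int(part) for part in s.split("/"))
        month_day = a <= 12    # "05/10" read as May 10
        day_month = b <= 12    # "05/10" read as 5 October
        if month_day and day_month and a != b:
            # Both readings plausible: throw an exception and
            # seek clarification from a human.
            raise ValueError("ambiguous date: " + s)
        return date(year, a, b) if month_day else date(year, b, a)

    print(parse_slash_date("5/25"))    # unambiguous: 2002-05-25
    parse_slash_date("05/10")          # raises ValueError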
The gist of what I hear from the field, from this list, and from
various consultants is that the potential customers for XML solutions
are already overwhelmed by complexity and want to manage it and
minimize it ... integration strategies based on strongly typed procedural
code have proved to be expensive and brittle. People do NOT want to build
new infrastructures to discover XML schemas that will also lead to
brittle systems that will require massive coordination to keep running
in the face of schema diversity and evolution. In short, the assumption
that producers and consumers of XML data will agree on the type of the
data being exchanged -- at the level of detail defined in the XSD types
specification -- seems to describe an idealized corner case, not
the everyday reality of XQuery applications. If I'm right, the amount of
effort that XQuery spends optimizing this corner case, and insisting that
implementers support it to be conformant, will turn out to be misspent.