OASIS Mailing List Archives





   XML/XQuery Usage Assumptions (was Re: [xml-dev] XSLT 2.0 / XPath 2.0 -As


5/13/2002 2:37:17 PM, Jonathan Robie <jonathan.robie@datadirect-technologies.com> wrote:

>That's fine as long as all the programs that touch the data know how it was 
>originally intended to be used and treat it correctly. Instead of using a 
>data type, you can now look at your DTD or Schema to see if someone left a 
>comment to tell you whether a date is in MMDDYYYY format or DDMMYYYY 
>format, and then look at all the programs that touch that data to make sure 
>that they interpreted the bytes correctly.

This illustrates a rather profound difference in our basic conceptions of what
XML is all about, I'd guess.

<plug>For a change, I *am* more or less singing out of my 
employer's hymnbook, having spent much of last Tuesday helping write the
press release for our XML Mediator product at 
http://www.softwareagusa.com/news/releases/usa/2002/information.htm </plug>

SGML is all about imposing a "correct" document definition that both producers
and consumers of documents must agree to, or nothing useful happens.  XML
supports this conception, of course, with DTDs and various flavors of schemas, 
but does not insist on it. XML is also a very generic data meta-format, allowing
essentially any information to be encoded by a producer, and an interpretation
of the data to be discovered by a recipient.  The markup can be thought of as
hints from the producer as to what the information "means", not just a 
contract defining a shared understanding of the meaning.  

Sure, this is controversial and we debate it on xml-dev every couple 
of months or so, but consider the implications of the 
"no contract, no information exchange" position: the GAO report implying
that the government should go slow on XML deployment until standard schemas
are defined, or the various analyst/pundit complaints that XML is a "Tower
of Babel" that companies shouldn't take too seriously until their industries
define either a standard XML format or an elaborate schema repository/discovery
system where the "correct" interpretation of the data can be determined
dynamically.  By this reasoning, government and industry shouldn't have adopted
fax machines until standards for paper forms were produced!

This never happened in the world of paper and facsimiles of paper, and it will
never happen in the XML world either. [my prediction, I owe any current readers 
a beer if it ever happens!] In the memorable words of one of our marketing
people, "diversity is pervasive and persistent." XML, XPath, XSLT, and XQuery
allow relatively simple, non-AI software to 
approach the problem of making sense of diverse business documents
in much the same way as human clerks have for centuries -- DISCOVER relevant data
in the input (assisted by "common sense" and pattern recognition in the
case of humans, assisted by markup in the case of XML), then TRANSCRIBE it
into a form more amenable to further processing.
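
To make the DISCOVER / TRANSCRIBE idea a bit more concrete, here is a minimal
sketch in Python using the standard library's ElementTree and its XPath subset.
Everything in it is an assumption made up for illustration: the two sample
documents, the element names, and the helper names discover() and transcribe()
do not come from any real product or schema. It only shows the shape of the
approach: try a few candidate paths per field, take the first hit, and write
the result into a flat record.

```python
import xml.etree.ElementTree as ET

# Two hypothetical purchase orders from different producers; no shared schema.
DOC_A = ("<order><customer><name>Acme</name></customer>"
         "<total currency='USD'>120.50</total></order>")
DOC_B = ("<PurchaseOrder><Buyer>Acme</Buyer>"
         "<Lines><Amount>120.50</Amount></Lines></PurchaseOrder>")

# Candidate paths for each field we need, tried in order.
# These use only the XPath subset that ElementTree supports.
CANDIDATES = {
    "buyer": [".//customer/name", ".//Buyer"],
    "amount": [".//total", ".//Amount"],
}

def discover(root, paths):
    """Return the text of the first element matching any candidate path, or None."""
    for path in paths:
        node = root.find(path)
        if node is not None and node.text:
            return node.text.strip()
    return None

def transcribe(xml_text):
    """DISCOVER the fields we care about, then TRANSCRIBE them into a flat dict."""
    root = ET.fromstring(xml_text)
    return {field: discover(root, paths) for field, paths in CANDIDATES.items()}

if __name__ == "__main__":
    for doc in (DOC_A, DOC_B):
        print(transcribe(doc))
```

Both documents come out as the same flat record even though neither producer
agreed to a schema with the consumer; a field that cannot be found is simply
None, which the consumer can treat as a reason to ask for clarification.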

In this conception, the idea is not to validate that the data is being sent
"correctly"; it's to recognize the data that is necessary to carry on the
business process.  Sure, some xsi:type attributes, or -- &deity; willing -- 
some agreed-upon schema can help the cause, and XPath/XQuery should support that
when available.  But people have been figuring out whether "05/10" 
meant "May 10" or "5 October", or whether "387-66-9876" and 387669876 are
referring to the same person -- or at least "throwing an exception" 
and seeking clarification from a human -- for some time now.
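
As a rough illustration of that clerk-like behavior, the following Python
sketch normalizes the two identifier spellings to one value and refuses to
guess when a day/month string has two valid readings. The exception name
NeedsHumanReview, the function names, and the nine-digit rule are all
assumptions invented for this example, not part of XPath, XQuery, or any
schema language.

```python
import re
from datetime import date

class NeedsHumanReview(Exception):
    """Raised when the data cannot be interpreted without asking someone."""

def normalize_id(value):
    """Reduce '387-66-9876' and 387669876 to the same nine-digit string."""
    digits = re.sub(r"\D", "", str(value))
    if len(digits) != 9:
        raise NeedsHumanReview(f"not a nine-digit identifier: {value!r}")
    return digits

def interpret_day_month(text, year):
    """Interpret 'NN/NN' as a date; raise unless exactly one reading is valid."""
    a, b = (int(part) for part in text.split("/"))
    readings = set()
    # Try month/day first, then day/month; invalid calendar dates are skipped.
    for month, day in ((a, b), (b, a)):
        try:
            readings.add(date(year, month, day))
        except ValueError:
            pass
    if len(readings) != 1:
        raise NeedsHumanReview(f"cannot interpret {text!r} unambiguously")
    return readings.pop()

if __name__ == "__main__":
    print(normalize_id("387-66-9876") == normalize_id(387669876))
    print(interpret_day_month("13/10", 2002))
    try:
        interpret_day_month("05/10", 2002)
    except NeedsHumanReview as exc:
        print("ask a human:", exc)
```

The point of the design is that ambiguity is reported rather than silently
resolved: "13/10" has only one valid reading, while "05/10" is passed to a
person, much as a clerk would handle it.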

The gist of what I hear from the field, from this list, and from
various consultants is that the potential customers for XML solutions
are already overwhelmed by complexity and want to manage it and 
minimize it ... integration strategies based on strongly typed procedural
code have proved to be expensive and brittle. People do NOT want to build 
new infrastructures to discover XML schemas that will also lead to 
brittle systems that will require massive coordination to keep running
in the face of schema diversity and evolution.  In short, the assumption
that producers and consumers of XML data will agree on the type of data
being exchanged -- at the level of detail defined in the XSD types
specification -- seems to describe an idealized corner case, not
the everyday reality of XQuery applications. If I'm right, the
effort that XQuery spends optimizing this corner case, and insisting that
implementers support it to be conformant, will turn out to be misspent.

End of rant ... of course, time will tell whose assumptions are closer
to eventual reality, but I think it's important to clarify the assumptions
on which my whining about the current state of XQuery is based. 


