OASIS Mailing List Archives

   Re: RE: [xml-dev] XML=WAP? And DOA?


1/14/2002 2:08:03 PM, Nicolas LEHUEN 
<nicolas.lehuen@ubicco.com> wrote:

I wouldn't disagree on the "self describing" bit; tags are 
just labels that have to refer to something else that defines 
their semantics.  My point vis a vis CSV was simply that a 
tag is a lot better than nothing (or a header somewhere far 
away) when you're debugging.

> Any given XML document requires a schema, and not only for
> validation....an XML application has to rely on an implicit
> or explicit schema to process XML documents meaningfully,
> i.e. at the semantic level, because it is the schema that
> creates the document semantics.
> Well-formedness alone is a lure ... If you don't write the
> schema explicitely, its ghost will appear in your programs
> anyway, created by the assumption the program has to make
> to run properly.

THIS is the kind of thing I had in mind when I referred to us 
talking past each other on xml-dev <grin>

I guess I disagree about the *general* applicability of the 
situation that Nicolas Lehuen describes.  Simon put it quite 
nicely (emphasis and parenthetical notes added):
"...accept that information may not *always* come in 
precisely the same structure.  [when it doesn't] Write code 
which supports flexibility rather than demanding conformity. 
[you can] Throw away notions of strict conformance to 
semantical notions - rely only on syntactical conformance."  

In a loosely coupled application you may know very little 
about the data other than it is well-formed XML, and the job 
of an application component is to extract whatever 
information APPEARS to match the patterns it is looking for, 
put the information in a more useable form, and pass it down 
the pipeline for further processing.  A network of these 
simple components can do some quite interesting things, and 
tools such as Sean McGrath's XPipe stuff and Software AG's 
EntireX Orchestrator are becoming available to develop them. 
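
To make the idea concrete, here is a minimal sketch of one such 
component in Python, using only the standard library.  The field 
names in WANTED and the function names are hypothetical, made up for 
illustration; nothing here is taken from XPipe or EntireX.  The only 
thing it demands of the input is well-formedness; anything it does not 
recognize is simply ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names this component cares about (an assumption
# for illustration, not something defined in this thread).
WANTED = {"customer", "item", "total"}


def extract_order(xml_text):
    """Collect whatever APPEARS to match the wanted fields.

    Raises ET.ParseError if the input is not well-formed XML; that is
    the only conformance requirement.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for elem in root.iter():
        # Drop any namespace prefix so "{urn:x}total" matches "total".
        name = elem.tag.rsplit("}", 1)[-1].lower()
        if name in WANTED and name not in found:
            text = "".join(elem.itertext()).strip()
            if text:
                found[name] = text  # keep the first non-empty match
    return found


def missing_fields(found):
    """Next pipeline stage: list what a human may need to look at."""
    return sorted(WANTED - set(found))


if __name__ == "__main__":
    doc = (
        "<invoice><p>stray markup</p><Customer>ACME</Customer>"
        "<line><item>widget</item></line><total> 42.50 </total></invoice>"
    )
    order = extract_order(doc)
    print(order)
    print(missing_fields(order))
```

The stray <p> element and the unexpected nesting under <line> do not 
cause a rejection; the component just passes along what it found, and 
the list of missing fields gives a later stage (or a person) something 
to act on.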

This is a very different way of looking at XML (and data 
processing for that matter) than the object-centric or 
schema-centric approach.  It solves one problem -- the lack 
of authoritative schema for many application domains -- by 
accepting a lot more chaos and error than many might find 
tolerable.  In any real system, there would have to be humans 
involved to make sure that that purchase order that looks 
like the deal of a lifetime is indeed what the pattern 
matcher thought it was and not a joke, a fraud, or something 
else entirely.  But at least they won't reject the purchase 
order of a lifetime because it had an extra <p> tag 
somewhere. <grin, yeah I know this is a contrived example!> 
More seriously, this is a way to exploit what order there is 
in the system, i.e., an <invoice> tag probably refers to 
something resembling an "invoice", without insisting on total 
order.

This is not to say that this "loose" approach is the best; 
it's certainly not when you CAN authoritatively specify fixed 
schemas and reject messages/documents that don't match them.  
But it's better than handwringing about how XML can only be 
used once everyone agrees on a schema for some particular 
industry, as we see so often in the trade press.  


