xml-dev - RE: [xml-dev] The triples datamodel -- was Re: [xml-dev] Semantic Web pe


To: 'Elliotte Rusty Harold' <elharo@metalab.unc.edu>, XML Developer List <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] The triples datamodel -- was Re: [xml-dev] Semantic Web permathread, iteration n+1
From: "Bullard, Claude L (Len)" <len.bullard@intergraph.com>
Date: Tue, 8 Jun 2004 09:15:30 -0500

That's good.  Except the bit about costs going up.  Why 
would they?

A schema can be a guardian or a classification verifier. 
One might assume, rightly or wrongly, that the MIME type, 
the extension, a magic number, or a DOCTYPE tells 
one the class.  One might have to verify that.

Also, this is only one of several possible applications of a schema.

I agree that forcing widespread use of a schema is a tough political 
problem, but it is a trivial technical issue.  One says, "this is 
the Internet, after all," but one means "these are humans, 
after all."  Humans often fail Turing tests.

len

The trick to passing a Turing test is selecting the topic 
of conversation wisely.  Eagerness is everything.

From: Elliotte Rusty Harold [mailto:elharo@metalab.unc.edu]

>On Jun 8, 2004, at 12:31 AM, Rick Marshall wrote:
>
>>  and if the schema changes, but not the xslt, and someone suffers 
>>financial loss - tax returns fail, orders lost, etc - who pays?

Perhaps there's a technical step in the proposed system you're 
missing here. When receiving a document you first have to classify 
it. That is, you must figure out if this is a kind of document you've 
seen before, and if you have tools in place to process it 
automatically. If you do, then dispatch it to one of those tools. If 
not, dispatch it to a human for further analysis.
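
The classify-then-dispatch step described above can be sketched roughly 
as follows. This is only an illustration under stated assumptions: the 
root element names, the handler functions, and the list standing in for 
a human review queue are all hypothetical and do not come from the 
thread, and real classification would likely look at more than the root 
element.

```python
import xml.etree.ElementTree as ET

# Hypothetical handlers; the document kinds are illustrative assumptions.
def process_order(root):
    return "order handled automatically"

def process_tax_return(root):
    return "tax return handled automatically"

# Known document kinds, keyed by root element name (an assumption:
# the root element alone is treated as enough to classify).
HANDLERS = {
    "order": process_order,
    "taxReturn": process_tax_return,
}

def classify(xml_text):
    """Return (kind, root); kind is None if the document is not recognized."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed, so it cannot be classified automatically.
        return None, None
    # Drop a '{namespace}' prefix if ElementTree added one.
    tag = root.tag.rsplit("}", 1)[-1]
    return (tag if tag in HANDLERS else None), root

def dispatch(xml_text, human_queue):
    """Send known documents to a tool and everything else to a human."""
    kind, root = classify(xml_text)
    if kind is None:
        human_queue.append(xml_text)
        return "queued for human review"
    return HANDLERS[kind](root)
```

Note that an unrecognized document is not rejected here; it simply lands 
in a different queue.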

We can adjust how tight we make the recognition software. Personally, 
I like loose, XPath-based solutions like Schematron that ask whether 
the document contains the information I want rather than asking 
whether it tightly fits some W3C XML Schema Language schema. However, 
if you want to use a conservative schema (everything not permitted is 
forbidden) as your diagnosis, go ahead. You won't be able to process 
quite as much automatically, and costs will go up; but maybe in your 
environment and for your processes safety concerns do mandate that. 
We can also have a middle ground, where XPath extracts the relevant 
fragments of a document, and then each fragment we actually use is 
validated closely without worrying about the outer envelope. And 
there are lots of other points along the continuum as well.
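
To make the loose check and the middle ground concrete, here is a small 
sketch. It uses the limited XPath subset in Python's ElementTree rather 
than Schematron or a W3C XML Schema validator, and the element names, 
attribute names, and rules are invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Loose check: does the document contain the information we want?
# The paths are hypothetical examples.
REQUIRED_PATHS = [".//customer/name", ".//item[@sku]"]

def loose_match(root):
    """True if every required path matches somewhere below the root."""
    return all(root.find(path) is not None for path in REQUIRED_PATHS)

# Closed-world rule for one fragment: anything not permitted is forbidden.
ALLOWED_ITEM_ATTRS = {"sku", "qty"}

def strict_item(item):
    """True if an <item> has no children, only allowed attributes,
    a sku, and a digits-only qty (qty defaults to "1" if absent)."""
    return (
        len(item) == 0
        and set(item.attrib) <= ALLOWED_ITEM_ATTRS
        and "sku" in item.attrib
        and item.get("qty", "1").isdigit()
    )

def middle_ground(root):
    """Extract the fragments we use and validate each one closely,
    ignoring whatever else the outer envelope contains."""
    items = root.findall(".//item")
    return bool(items) and all(strict_item(item) for item in items)
```

With these rules, a document with unexpected wrapper elements or extra 
attributes on the root still passes both checks as long as its items are 
clean, while an item carrying an unknown attribute passes the loose 
check but fails the middle-ground one.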

However, the really key idea is to use the schema, in whatever 
language, as a classification tool, not a guardian. The schema's job 
is to sort documents into the right queue, not to accept some 
documents unconditionally and reject all others.
