xml-dev - Re: [xml-dev] Semantic equivalence of xml documents


To: Alexander Johannesen <alexander.johannesen@gmail.com>,xml-dev@lists.xml.org
Subject: Re: [xml-dev] Semantic equivalence of xml documents
From: "Steven J. DeRose" <sderose@acm.org>
Date: Tue, 27 Sep 2005 13:21:30 -0400
In-reply-to: <f950954e050926155243502caa@mail.gmail.com>
References: <1127724593.6925.184.camel@kesch.itserve.ch> <20050926152047.94389.qmail@web32911.mail.mud.yahoo.com><f950954e050926155243502caa@mail.gmail.com>

At 08:52 +1000 2005-09-27, Alexander Johannesen wrote:
>You say you've got a schema defined for your data, and want to check
>semantic sameness for it with your customer (if I got you right). As
>with everything, it depends. One way *I* would do this is not to
>define my own schema, but instead use a general schema with semantics
>capabilities, create an ontology that speaks of my semantics, like with
>XTM Topic Maps, and then it's easy to create test-queries against
>those two maps to determine semantic equality. Heck, even a simple map
>merger would come up with enough stuff to determine most semantics.
>The same could also be done with RDF and an inferencing engine as
>well.
>
>As others have said, there's not much semantics defined in XML alone;
>you need the Semantic Web! *grin*

This is a very interesting idea. Can you help me understand how I 
might make this work in practice?

Take for example the case Yves originally raised: a <company> with 
multiple <person> elements. He wants to test whether two specific 
<company> elements have all the same children, but without worrying 
about their order. That seems like a pretty well-defined question 
(except for some details I mention below).

So let's say you convert or supplement Yves' data into XTM or RDF or 
XLink or something similar. What I don't see is how (using the 
semantics defined for any of the "semantic Web" devices) you would 
test the case he's got.

I see how to do this with XSLT, or a little JavaScript DOM, or such: 
for example, you can iterate over the children of company A, and look 
for each among the children (or bugle boys) of company B. Then do the 
same the other way. If all the searches succeed, the elements match.
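
A minimal sketch of that two-way search, in Python rather than XSLT or 
JavaScript. The function names, the sample data, and the choice to 
compare only whitespace-normalized text content are my assumptions, 
not anything from Yves' actual schema:

```python
import xml.etree.ElementTree as ET

def child_texts(company):
    """Return the whitespace-normalized text of each <person> child."""
    # Assumption: a person is identified by its text content alone.
    return [" ".join((p.text or "").split()) for p in company.findall("person")]

def same_children_ignoring_order(company_a, company_b):
    """True if every person in A is found in B, and every person in B is found in A."""
    a = child_texts(company_a)
    b = child_texts(company_b)
    a_in_b = all(name in b for name in a)  # search B for each child of A
    b_in_a = all(name in a for name in b)  # then the same the other way
    return a_in_b and b_in_a

# Hypothetical sample data
company_a = ET.fromstring("<company><person>Ann</person><person>Bob</person></company>")
company_b = ET.fromstring("<company><person>Bob</person><person>Ann</person></company>")
company_c = ET.fromstring("<company><person>Ann</person></company>")
```

With this data, A and B match and A and C do not. The nested search is 
quadratic in the number of children, which hardly matters at this scale.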

Now, how do I do that in first-order predicate calculus (which is 
approximately the best that XTM or RDF can express, AFAIK)?

Lessee...

    For all P | parent(P)=Company1,
        Exists Q | parent(Q)=Company2 & string-matches(P,Q)

    (and the same again with the companies swapped)
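
Spelled out in more conventional notation, with the swapped half made 
explicit, that is roughly the following. Here C_1 and C_2 stand for 
Company1 and Company2, "matches" stands for string-matches, and reading 
the "|" as an implication under the universal quantifier is my 
interpretation:

```latex
\forall P\,\bigl(\mathrm{parent}(P)=C_1 \rightarrow
    \exists Q\,(\mathrm{parent}(Q)=C_2 \wedge \mathrm{matches}(P,Q))\bigr)
\;\wedge\;
\forall Q\,\bigl(\mathrm{parent}(Q)=C_2 \rightarrow
    \exists P\,(\mathrm{parent}(P)=C_1 \wedge \mathrm{matches}(Q,P))\bigr)
```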

So it should be doable in principle. But how might you actually 
express this in XTM or RDF?

Steve

------------

To keep things simple, I'm ignoring some real-world issues:

* Anything else hanging around within <company> (say, revision 
markup, footnotes, etc.)

* Variant forms and internal markup on person names

* Duplicate person names referring to different individuals

* Duplicate entries for the same individual (say, for different job roles)

* How you found which two <company> elements correspond in the first 
place. If reordering is allowed at more than one level, picture how 
messy this gets....

* How XTM or any other predication language refers to locations without IDs
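
For what it's worth, a couple of these (duplicate entries, internal 
markup on names) can at least be approximated procedurally by comparing 
multisets instead of searching. This is only a sketch: the function 
names are hypothetical, and treating case and whitespace differences as 
insignificant is an assumption. It counts identical names but cannot 
tell whether two of them refer to different individuals.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def person_counts(company):
    """Count each normalized person name among the <person> children."""
    names = []
    for person in company.findall("person"):
        text = "".join(person.itertext())   # includes text inside internal markup
        names.append(" ".join(text.split()).casefold())  # assumption: case-insensitive
    return Counter(names)

def same_people_with_counts(company_a, company_b):
    """True if both companies list the same names the same number of times."""
    return person_counts(company_a) == person_counts(company_b)

# Hypothetical sample data
twice = ET.fromstring(
    "<company><person>Ann Lee</person><person>Ann Lee</person></company>")
once = ET.fromstring("<company><person>ann <b>lee</b></person></company>")
```

Here `twice` and `once` contain the same name but differ in count, so 
they do not match under this stricter test.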

-- 
Luthien Consulting: Real solutions to hard information management problems
    Specializing in information design, XML, schemas, XSLT, and 
project design/review/repair
Steven J. DeRose, Ph.D., sderose@acm.org

References:
- Semantic equivalence of xml documents
  - From: Yves Langisch <lists@langisch.ch>
- Re: [xml-dev] Semantic equivalence of xml documents
  - From: Mukul Gandhi <mukul_gandhi@yahoo.com>
- Re: [xml-dev] Semantic equivalence of xml documents
  - From: Alexander Johannesen <alexander.johannesen@gmail.com>
