At 08:52 +1000 2005-09-27, Alexander Johannesen wrote:
>You say you've got a schema defined for your data, and want to check
>semantic sameness for it with your customer (if I got you right). As
>with everything, it depends. One way *I* would do this is not to
>define my own schema, but instead use a general schema with semantics
>capabilities, create an ontology that speaks of my semantics, like with
>XTM Topic Maps, and then it's easy to create test-queries against
>those two maps to determine semantic equality. Heck, even a simple map
>merger would come up with enough stuff to determine most semantics.
>The same could also be done with RDF and an inferencing engine as
>well.
>
>As others have said, there's not much semantics defined in XML alone;
>you need the Semantic Web! *grin*
This is a very interesting idea. Can you help me understand how I
might make this work in practice?
Take for example the case Yves originally raised: a <company> with
multiple <person> elements. He wants to test whether two specific
<company> elements have all the same children, but without worrying
about their order. That seems like a pretty well-defined question
(except for some details I mention below).
So let's say you convert or supplement Yves' data into XTM or RDF or
XLink or something similar. What I don't see is how (using the
semantics defined for any of the "semantic Web" devices) you would
test the case he's got.
I see how to do this with XSLT, or a little JavaScript DOM, or such:
for example, you can iterate over the children of company A, and look
for each among the children (or bugle boys) of company B. Then do the
same the other way. If all the searches succeed, the elements match.
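A minimal sketch of that iterate-and-search approach, in Python with the standard ElementTree module rather than XSLT or JavaScript. The sample data, the id values, and the function names are invented for illustration, and names are compared as whitespace-normalized strings only:

```python
import xml.etree.ElementTree as ET

def child_names(company):
    """Return the whitespace-normalized text of each <person> child."""
    return [" ".join((p.text or "").split()) for p in company.findall("person")]

def same_children(a, b):
    """True if every person in a is found in b, and every person in b in a.

    Order is ignored. Duplicates are ignored too: this is a plain
    two-way search, not a multiset comparison.
    """
    names_a, names_b = child_names(a), child_names(b)
    return (all(n in names_b for n in names_a)
            and all(n in names_a for n in names_b))

# Hypothetical sample data, not taken from Yves' documents.
doc = ET.fromstring("""
<companies>
  <company id="c1"><person>Ann Lee</person><person>Bob Ray</person></company>
  <company id="c2"><person>Bob Ray</person><person>Ann Lee</person></company>
  <company id="c3"><person>Ann Lee</person></company>
</companies>
""")
c1, c2, c3 = doc.findall("company")

print(same_children(c1, c2))
print(same_children(c1, c3))
```

Here c1 and c2 hold the same people in a different order, while c3 is missing one.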
Now, how do I do that in first-order predicate calculus (which is
approximately the best that XTM or RDF can express, AFAIK)?
Lessee...
For all P | parent(P)=Company1,
Exists Q | parent(Q)=Company2 & string-matches(P,Q)
(and the same again with the companies swapped)
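Spelled out over a flat set of parent facts, the formula above might be sketched like this in Python. This is only an illustration of the quantifiers, not XTM or RDF: the fact tuples, the identifiers, and the use of exact string equality for string-matches are all assumptions.

```python
# Hypothetical fact base: (node_id, parent_id, name_string) tuples,
# standing in for whatever assertions a topic map or RDF graph would hold.
FACTS = [
    ("p1", "Company1", "Ann Lee"),
    ("p2", "Company1", "Bob Ray"),
    ("q1", "Company2", "Bob Ray"),
    ("q2", "Company2", "Ann Lee"),
    ("r1", "Company3", "Ann Lee"),
]

def names_under(parent):
    """All name strings whose node has the given parent."""
    return [name for (_node, par, name) in FACTS if par == parent]

def covered_by(x, y):
    """For all P with parent(P)=x, there exists Q with parent(Q)=y
    such that the two strings match (assumed: exact equality)."""
    return all(any(p == q for q in names_under(y)) for p in names_under(x))

def companies_match(x, y):
    """The formula applied in both directions."""
    return covered_by(x, y) and covered_by(y, x)

print(companies_match("Company1", "Company2"))
print(companies_match("Company1", "Company3"))
```

The universal quantifier maps onto all() and the existential onto any(); swapping the arguments gives the second half of the test.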
So it should be doable in principle. But how might you actually
express this in XTM or RDF?
Steve
------------
To keep things simple, I'm ignoring some real-world issues:
* Anything else hanging around within <company> (say, revision
markup, footnotes, etc.)
* Variant forms and internal markup on person names
* Duplicate person names referring to different individuals
* Duplicate entries for the same individual (say, for different job roles)
* How you found which two <company> elements correspond in the first
place. If re-ordering is allowed at more than one level, picture how
messy this gets....
* How XTM or any other predication language refers to locations without IDs
--
Luthien Consulting: Real solutions to hard information management problems
Specializing in information design, XML, schemas, XSLT, and
project design/review/repair
Steven J. DeRose, Ph.D., sderose@acm.org