xml-dev - Re: [xml-dev] Semantic Web permathread, iteration n+1

To: XML Developers List <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] Semantic Web permathread, iteration n+1
From: Alexander Jerusalem <ajerulists@vknn.org>
Date: Fri, 04 Jun 2004 20:45:48 +0100
In-reply-to: <0E36FD96D96FCA4AA8E8F2D199320E520204D230@RED-MSG-43.redmond.corp.microsoft.com>
References: <0E36FD96D96FCA4AA8E8F2D199320E520204D230@RED-MSG-43.redmond.corp.microsoft.com>
User-agent: Mozilla Thunderbird 0.5 (Windows/20040207)

Although I'm becoming more and more skeptical of the overall aims of the
Semantic Web, there are a few things that I can't seem to solve in a
satisfactory way using XML alone. Most of them have to do with XML's
'mix-up' of physical document structure (or the infoset, if you like) and
the assertion of facts on a conceptual level. For example, it may be
convenient for some use cases to represent the relationship of a person
to his publications like this:

<person id='123' lastName='...'>
    <pub title='abc'/>
    <pub .../>
    ...
</person>

Under different circumstances, it may be more useful to say:

<pub title='abc'>
    <author ref='123'/>
    ...
</pub>

Whether I use one or the other may depend on technical processing
requirements like the granularity of a web services interface or
efficient storage or whatever. Now, I hope this doesn't lead to a debate
about normalisation again, because that would be missing the point. What
I'm saying is that the same facts may be represented differently for
processing purposes (maybe only temporarily) and that within XML and XML
Schema I have no way of expressing that they actually mean the same
thing. If I wanted to query for all the publications of the person
identified by 123, I'd have to either transform all physical
representations to a single one or formulate my query to account for all
possible variants, and then maintain this mapping further down the road.
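
To make the cost concrete, here is a minimal sketch in Python of the
hand-written mapping described above. It uses only the standard library's
xml.etree.ElementTree. The <data> wrapper element, the extra titles and
the function names are assumptions made up for illustration; only the two
layouts themselves come from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; wrapper element and extra titles are made up.
PERSON_CENTRIC = """
<data>
  <person id='123'>
    <pub title='abc'/>
    <pub title='def'/>
  </person>
</data>
"""

PUB_CENTRIC = """
<data>
  <pub title='abc'>
    <author ref='123'/>
  </pub>
  <pub title='xyz'>
    <author ref='123'/>
  </pub>
</data>
"""


def authorship_pairs(xml_text):
    """Return the (person id, publication title) facts found in either layout."""
    root = ET.fromstring(xml_text)
    pairs = set()
    # Variant 1: publications nested inside the person element.
    for person in root.iter("person"):
        for pub in person.findall("pub"):
            pairs.add((person.get("id"), pub.get("title")))
    # Variant 2: publications that point back to the person by id.
    for pub in root.iter("pub"):
        for author in pub.findall("author"):
            pairs.add((author.get("ref"), pub.get("title")))
    return pairs


def publications_of(person_id, *documents):
    """Collect the titles attributed to person_id across all documents."""
    titles = set()
    for doc in documents:
        titles.update(t for pid, t in authorship_pairs(doc) if pid == person_id)
    return sorted(titles)


print(publications_of("123", PERSON_CENTRIC, PUB_CENTRIC))
```

Every further physical variant would need another branch in
authorship_pairs, which is exactly the mapping that has to be written and
maintained by hand.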

Of course, one could argue that in RDF all the physical representations
are transformed into some normalised graph form before processing as
well. But the big difference is that in most cases I don't have to
write this transformation myself, because it is either part of the
RDF/XML-to-triples mapping rules anyway or it can be stated declaratively
in OWL.

Interestingly, the XML Schema specification itself does make use of a
more abstract notion of 'property' and a separate mapping of these
properties to infoset contributions in its definition of schema
components, but it doesn't afford the same power to its users.

Nevertheless, I am convinced that it is a good thing that order and
hierarchy are significant in XML, because they enable a very concise
expression of containment and sequence. I think, however, that it should
be possible for schema authors to define where order is insignificant,
where reference by id is semantically equivalent to containment, or which
attributes are just semantically equivalent substitutes for child
elements. Maybe the schema language could even offer a few easily
understandable OWL constructs like inverseOf.
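
As a rough illustration of what a declarative inverseOf statement buys,
here is a small Python sketch over plain tuples. It is not an OWL reasoner
and uses no RDF library; the property names hasPublication and hasAuthor
and the identifiers are hypothetical.

```python
# Hypothetical property names; only the inverseOf idea itself comes from OWL.
INVERSE_OF = {
    "hasPublication": "hasAuthor",
    "hasAuthor": "hasPublication",
}


def close_under_inverse(triples, inverse_of=INVERSE_OF):
    """Add (o, q, s) for every (s, p, o) where q is declared the inverse of p."""
    closed = set(triples)
    for subject, predicate, obj in triples:
        inverse = inverse_of.get(predicate)
        if inverse is not None:
            closed.add((obj, inverse, subject))
    return closed


def publications_of(person, triples):
    """Query in one direction only; the declared inverse covers the other."""
    return sorted(o for s, p, o in triples if s == person and p == "hasPublication")


# One fact stated from the person's side, one from the publication's side.
facts = {
    ("person:123", "hasPublication", "pub:abc"),
    ("pub:xyz", "hasAuthor", "person:123"),
}

print(publications_of("person:123", close_under_inverse(facts)))
```

The query only ever mentions hasPublication; the single declaration in
INVERSE_OF takes the place of a hand-written transformation between the
two ways of stating the fact.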

As a result, a schema-aware XPath processor could allow me to say
//person[firstname='...'] no matter whether the fragment happens to be
written as

<person firstname="..."/>
or
<person><firstname>...</firstname></person>
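
A minimal sketch of how such a lookup might behave, again in Python with
xml.etree.ElementTree. The EQUIVALENT set is a hypothetical stand-in for
what a schema would declare, the sample names are made up, and no existing
XPath processor is claimed to work this way.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema declaration: for these (element, property)
# pairs, an attribute and a same-named child element mean the same thing.
EQUIVALENT = {("person", "firstname")}


def property_value(element, name):
    """Read a property from a child element, or from an attribute if equivalent."""
    child = element.find(name)
    if child is not None:
        return (child.text or "").strip()
    if (element.tag, name) in EQUIVALENT and name in element.attrib:
        return element.get(name)
    return None


def select(root, tag, name, value):
    """Rough analogue of //tag[name='value'] that honours EQUIVALENT."""
    return [e for e in root.iter(tag) if property_value(e, name) == value]


# Made-up sample data covering both spellings of the same fact.
doc = ET.fromstring(
    "<people>"
    "<person firstname='Ada'/>"
    "<person><firstname>Ada</firstname></person>"
    "<person><firstname>Bob</firstname></person>"
    "</people>"
)

print(len(select(doc, "person", "firstname", "Ada")))
```

Both spellings of the first name match the same query, and removing the
entry from EQUIVALENT restores the plain child-element semantics.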

Summing things up, I would say XML's weak spot is its schema layer;
with a stronger one, even the lack of a uniform way of identifying and
referencing things could be greatly alleviated. Has anybody heard of an
OWL-like schema language for XML?

-Alexander

References:
- RE: [xml-dev] The triples datamodel -- was Re: [xml-dev] Semantic Web permathread, iteration n+1
  - From: "Joshua Allen" <joshuaa@microsoft.com>
