Although I'm becoming more and more skeptical of the overall aims of the
Semantic Web, there are a few things that I can't seem to solve in a
satisfactory way using XML alone. Most of them have to do with XML's
'mixup' of physical document structure (or the infoset if you like) and
the assertion of facts on a conceptual level. For example, it may be
convenient for some use cases to represent the relationship of a person
to his publications like this:
<person id='123' lastName='...'>
<pub title='abc'/>
<pub .../>
...
</person>
Under different circumstances, it may be more useful to say:
<pub title='abc'>
<author ref='123'/>
...
</pub>
Whether I use one or the other may depend on technical processing
requirements like the granularity of a web services interface or
efficient storage or whatever. Now, I hope this doesn't lead to a debate
about normalisation again because that would be missing the point. What
I'm saying is that the same facts may be represented differently for
processing purposes (maybe only temporarily) and that within XML and XML
Schema I have no way of expressing that they actually mean the same thing. If
I wanted to query for all the publications of the person identified by
123, I'd have to either transform all physical representations to a
single one or formulate my query to account for all possible variants
and maintain this mapping further down the road.
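
To make the maintenance burden concrete, here is a minimal sketch of the
"account for all possible variants" approach in Python. The sample
documents, the wrapping root element and the function names are my own
assumptions for illustration; only the two element shapes come from the
fragments above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same kind of fact in the two shapes shown above.
PERSON_CENTRIC = """
<root>
  <person id='123' lastName='Doe'>
    <pub title='abc'/>
  </person>
</root>
"""

PUB_CENTRIC = """
<root>
  <pub title='xyz'>
    <author ref='123'/>
  </pub>
</root>
"""

def authorship_pairs(xml_text):
    """Return (person id, publication title) pairs, whichever shape is used."""
    root = ET.fromstring(xml_text)
    pairs = set()
    # Variant 1: publications nested inside the person element.
    for person in root.iter("person"):
        for pub in person.findall("pub"):
            pairs.add((person.get("id"), pub.get("title")))
    # Variant 2: publications that point back to the person by id.
    for pub in root.iter("pub"):
        for author in pub.findall("author"):
            pairs.add((author.get("ref"), pub.get("title")))
    return pairs

def publications_of(person_id, *documents):
    """Collect the titles for one person across any number of documents."""
    titles = set()
    for doc in documents:
        titles |= {t for pid, t in authorship_pairs(doc) if pid == person_id}
    return sorted(titles)

print(publications_of("123", PERSON_CENTRIC, PUB_CENTRIC))  # ['abc', 'xyz']
```

Every additional physical representation means another hand-written
loop in authorship_pairs, which is exactly the mapping I would rather
state once, declaratively, in the schema.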
Of course one could argue that in RDF all the physical representations
are transformed into some normalised graph form before processing as
well. But the big difference is that in most cases I don't have to
write this transformation, because it is either part of the RDF/XML to
triples mapping rules anyway or it can be stated declaratively in OWL.
Interestingly, the XML Schema specification itself does make use of a
more abstract notion of 'property' and a separate mapping of these
properties to infoset contributions in its definition of schema
components, but it doesn't afford the same power to its users.
Nevertheless, I am convinced that it is a good thing that order and
hierarchy are significant in XML, because they enable a very concise
expression of containment and sequence. I think, however, that it should
be possible for schema authors to define where order is insignificant,
where reference by id is semantically equivalent to containment, or which
attributes are just semantically equivalent substitutes for child
elements. Maybe even a few easily understandable OWL constructs like
inverseOf could be supported.
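
As a rough illustration of what an inverseOf-style declaration buys, the
sketch below works on plain (subject, property, object) tuples in Python
rather than on a real OWL reasoner. The property names hasPublication
and hasAuthor, the identifiers and the function name are hypothetical.

```python
# Hypothetical property names; this mapping plays the role of owl:inverseOf.
INVERSE_OF = {"hasPublication": "hasAuthor"}

def with_inverses(triples, inverse_of=INVERSE_OF):
    """Return the triples plus every triple implied by an inverse declaration."""
    # Make the declaration symmetric so either direction may be stated.
    both = dict(inverse_of)
    both.update({v: k for k, v in inverse_of.items()})
    closed = set(triples)
    for s, p, o in triples:
        if p in both:
            closed.add((o, both[p], s))
    return closed

# The same kind of fact, stated once from each side.
facts = {
    ("person:123", "hasPublication", "pub:abc"),
    ("pub:xyz", "hasAuthor", "person:123"),
}

closed = with_inverses(facts)
pubs = sorted(o for s, p, o in closed
              if s == "person:123" and p == "hasPublication")
print(pubs)  # ['pub:abc', 'pub:xyz']
```

The query is written once against a single property, and the one-line
declaration takes care of the other direction.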
As a result, a schema-aware XPath processor could allow me to say
//person[firstname='...'] no matter whether the fragment happens to be
written as
<person firstname="..."/>
or
<person><firstname>...</firstname></person>
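
A toy version of that behaviour, sketched in Python with
xml.etree.ElementTree instead of a real XPath processor. The EQUIVALENT
table stands in for the schema annotation I am wishing for; it, the
sample data and the function names are assumptions, not an existing
facility.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema annotation: for these (element, property) pairs an
# attribute is declared equivalent to a child element of the same name.
EQUIVALENT = {("person", "firstname")}

def prop_values(elem, name):
    """Values of a property, read from child elements and, where the
    schema says so, from the equivalent attribute as well."""
    values = [child.text or "" for child in elem.findall(name)]
    if (elem.tag, name) in EQUIVALENT and name in elem.attrib:
        values.append(elem.attrib[name])
    return values

def select(root, tag, name, value):
    """Rough analogue of //tag[name='value'] under the equivalence above."""
    return [e for e in root.iter(tag) if value in prop_values(e, name)]

doc = ET.fromstring(
    "<people>"
    "<person firstname='Ann'/>"
    "<person><firstname>Ann</firstname></person>"
    "<person firstname='Bob'/>"
    "</people>"
)

print(len(select(doc, "person", "firstname", "Ann")))  # 2
```

Both spellings of the first fragment are found by the same query, and
removing the entry from EQUIVALENT restores the plain XPath behaviour of
matching child elements only.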
Summing things up, I would say XML's weak spot is its schema layer;
with a stronger one, even the lack of a uniform way of identifying and
referencing things could be greatly alleviated. Has anybody heard of an
OWL-like schema language for XML?
-Alexander