Assume
four datasets: an XML document, a JSON document, a CSV file and an HTML
document (authored near the North Pole, in the rain forest, in Athens
and in the Antarctic, respectively).
Imagine a standard which enables you to define the mapping of a document node to a set of RDF triples.
Remember that all of these documents (XML, JSON, CSV, HTML) can be parsed into document nodes (see, for example, [1]).
Assume that the RDF graphs obtained from our documents contain the following triples:
foo:oxygen foo:symbol "O"
foo:oxygen foo:numberOfElectrons "8"
foo:oxygen foo:atomicMass "16"
foo:oxygen foo:electronegativity "3.5"
each one found in a different one of the four RDF graphs.
Then
we have integrated information, as we now know four things about
oxygen, contributed by different data sources, each using a different
data format. Of course it would be easy to serialize the integrated
information into XML, or JSON, or CSV, or HTML or any other format
(employing Inuit or any other natural language).
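To make the idea a little more tangible, here is a minimal sketch in Python, using only the standard library. It is not the imagined standard, merely an illustration under assumptions of my own: the four toy documents, their layout, the function names and the hard-coded mapping rules are all hypothetical, and triples are represented as plain tuples of strings rather than real RDF terms with IRIs.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Four toy documents, one per format. Their contents and layout are
# invented for this illustration; they are not taken from any real source.
XML_DOC = '<element name="oxygen"><symbol>O</symbol></element>'
JSON_DOC = '{"name": "oxygen", "numberOfElectrons": "8"}'
CSV_DOC = "name,atomicMass\noxygen,16\n"
HTML_DOC = '<p data-name="oxygen" data-electronegativity="3.5">Oxygen</p>'


def triples_from_xml(text):
    """Map each child element of the root to one (subject, property, value) triple."""
    root = ET.fromstring(text)
    subject = "foo:" + root.get("name")
    return {(subject, "foo:" + child.tag, child.text) for child in root}


def triples_from_json(text):
    """Map each member of the JSON object (except 'name') to one triple."""
    record = json.loads(text)
    subject = "foo:" + record.pop("name")
    return {(subject, "foo:" + key, value) for key, value in record.items()}


def triples_from_csv(text):
    """Map each cell of each row (except the 'name' column) to one triple."""
    triples = set()
    for row in csv.DictReader(io.StringIO(text)):
        subject = "foo:" + row.pop("name")
        triples |= {(subject, "foo:" + key, value) for key, value in row.items()}
    return triples


class _DataAttributeParser(HTMLParser):
    """Collect triples from data-* attributes (a hypothetical mapping rule)."""

    def __init__(self):
        super().__init__()
        self.triples = set()

    def handle_starttag(self, tag, attrs):
        data = {k[len("data-"):]: v for k, v in attrs if k.startswith("data-")}
        if "name" in data:
            subject = "foo:" + data.pop("name")
            self.triples |= {(subject, "foo:" + k, v) for k, v in data.items()}


def triples_from_html(text):
    parser = _DataAttributeParser()
    parser.feed(text)
    parser.close()
    return parser.triples


def integrate():
    """Merge the four small graphs into one; integration is just set union."""
    graph = set()
    graph |= triples_from_xml(XML_DOC)
    graph |= triples_from_json(JSON_DOC)
    graph |= triples_from_csv(CSV_DOC)
    graph |= triples_from_html(HTML_DOC)
    return graph


def to_json(graph):
    """Serialize the merged graph back into one JSON document."""
    merged = {}
    for subject, prop, value in sorted(graph):
        merged.setdefault(subject, {})[prop] = value
    return json.dumps(merged, indent=2)


if __name__ == "__main__":
    print(to_json(integrate()))
```

The point of the sketch is only that, once every source has been mapped to triples about the same subject, integration reduces to a set union, and serializing the result into any one format is straightforward.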
+ + + - - -
But
I suppose you think this is an idle dream. Perhaps you think that the
imagined standard would not be feasible to create or to use, or you
question the practicality of leveraging RDF IRIs for identifying
resources and properties in more than a few specific cases.
Unfortunately
I agree that it is an idle dream. Only, the reason I see is a different
one: I am convinced that the imagined standard is not too difficult
to create and to use, and I do not question the practicality of using RDF
IRIs in many fields, including natural science, pharmacology, health
care, finance, many verticals and economic interaction. The reason I
see is that it seems impossible to find minds with a deep interest in
both XML technology and semantic technology. if - then - but.
With kind regards,
Hans-Jürgen Rennau
[1]