Re: [xml-dev] Approaches to Expanding the Semantics of a Community's Self-Interested XML Vocabulary
- From: Paul Tyson <phtyson@sbcglobal.net>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Wed, 21 Nov 2007 21:30:03 -0600
As Stephen Green mentioned in another reply, a promising approach goes
through RDF. I recently stumbled onto this approach while working on a
completely different problem, and I haven't worked out many details. I
am interested to hear what other people are doing in this area.
I focus on translating instances into RDF graphs. So far I've looked at
the Infoset RDF schema (http://www.w3.org/TR/xml-infoset-rdfs), a simple
late-bound vocabulary for describing the structure of an XML instance. A
trivial XSLT stylesheet will convert any XML instance to an RDF/XML
representation of its infoset RDF graph. Then you can query or analyze
it using RDF tools--for instance, a SPARQL query. You could even use a
SPARQL CONSTRUCT query to do simple transformations of subgraph patterns.
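As a rough illustration of the instance-to-graph step, here is a minimal Python sketch (standing in for the XSLT stylesheet) that walks an XML instance and emits late-bound triples describing its structure. The predicate names (info:localName, info:parent, info:text) and the blank-node labels are placeholders invented for this example; they are not the terms defined by the Infoset RDF schema, and attributes, namespaces, and mixed content are ignored.

```python
import xml.etree.ElementTree as ET
from itertools import count

SAMPLE = """<Point-of-Contact>
  <Name>John Smith</Name>
  <Telephone>617-123-4567</Telephone>
</Point-of-Contact>"""


def xml_to_triples(xml_text):
    """Return (subject, predicate, object) triples describing the element tree.

    The predicates are illustrative placeholders, not Infoset RDF terms.
    """
    root = ET.fromstring(xml_text)
    ids = count(1)
    triples = []

    def visit(elem, parent_id):
        # One blank-node label per element, assigned in document order.
        node_id = f"_:e{next(ids)}"
        triples.append((node_id, "rdf:type", "info:Element"))
        triples.append((node_id, "info:localName", elem.tag))
        if parent_id is not None:
            triples.append((node_id, "info:parent", parent_id))
        text = (elem.text or "").strip()
        if text:
            triples.append((node_id, "info:text", text))
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return triples


if __name__ == "__main__":
    for triple in xml_to_triples(SAMPLE):
        print(triple)
```

The vocabulary is "late-bound" in the sense that element names such as Name show up only as data (the object of info:localName), so the same handful of predicates describes any instance.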
But, more to your use case. Suppose you translated instances from two
different communities into infoset RDF/XML. Then you add assertions to
state the conditions under which subgraphs from the different instances
will be considered "the same". Simple ontology reasoners could produce
merged output, or answer any questions you want to ask of the system.
This is really just a sophisticated translation system, but it occurs
more in the ontological layer than in the semantic and structural layers
of XML. And it seems like it would be more powerful because it can be
augmented with assertions from any source.
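To make the idea of "sameness" assertions concrete, here is a toy Python sketch. It assumes each community's instance has already been reduced to simple (subject, property, value) triples about a shared subject, and that the equivalences are given as pairs of property names. A real system would more likely state these as OWL assertions (for example owl:equivalentProperty) and let a reasoner do the work; the SAME_AS table, the person:1 identifier, and the function names are all hypothetical.

```python
# Hypothetical equivalence assertions between two communities' terms.
SAME_AS = [
    ("Name", "fn"),
    ("Telephone", "tel"),
    ("City", "locality"),
]

COMMUNITY_1 = [
    ("person:1", "Name", "John Smith"),
    ("person:1", "Telephone", "617-123-4567"),
]

COMMUNITY_2 = [
    ("person:1", "fn", "John Smith"),
    ("person:1", "locality", "Boston"),
]


def canonical_names(pairs):
    """Map each name to one representative of its equivalence class."""
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return {name: find(name) for name in parent}


def merge(triples_a, triples_b, pairs):
    """Union two triple sets after rewriting equivalent properties."""
    canon = canonical_names(pairs)
    return {(s, canon.get(p, p), o) for s, p, o in list(triples_a) + list(triples_b)}


if __name__ == "__main__":
    for triple in sorted(merge(COMMUNITY_1, COMMUNITY_2, SAME_AS)):
        print(triple)
```

Because the equivalences live in a separate table, assertions from any other source can be appended to SAME_AS without touching either community's documents.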
The reverse translation, from RDF to XML instance, would be only
slightly more challenging. A rough-and-ready approach would be to
process SPARQL query results in XML form using XSLT. But a dedicated RDF
application would be better.
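The rough-and-ready route could look something like the following Python sketch, which plays the role of the XSLT step: it reads a document in the SPARQL Query Results XML Format and rebuilds a small XML instance from the bindings. The sample results, the variable names, the VAR_TO_ELEMENT mapping, and the Contacts wrapper element are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical query output with two variables per result row.
RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="name"/><variable name="phone"/></head>
  <results>
    <result>
      <binding name="name"><literal>John Smith</literal></binding>
      <binding name="phone"><literal>617-123-4567</literal></binding>
    </result>
  </results>
</sparql>"""

# Assumed mapping from query variables to the target community's element names.
VAR_TO_ELEMENT = {"name": "Name", "phone": "Telephone"}


def results_to_xml(results_text):
    """Build one Point-of-Contact element per result row."""
    doc = ET.fromstring(results_text)
    out = ET.Element("Contacts")
    for result in doc.findall("sr:results/sr:result", NS):
        poc = ET.SubElement(out, "Point-of-Contact")
        for binding in result.findall("sr:binding", NS):
            tag = VAR_TO_ELEMENT.get(binding.get("name"))
            literal = binding.find("sr:literal", NS)
            if tag is not None and literal is not None:
                ET.SubElement(poc, tag).text = literal.text
    return out


if __name__ == "__main__":
    print(ET.tostring(results_to_xml(RESULTS), encoding="unicode"))
```

This only handles literal bindings in flat rows; nested structure such as Address would need either several queries or the dedicated RDF application mentioned above.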
The late-bound infoset RDFS vocabulary will not be suitable for all
cases. It would be nice to have a standard early-bound infoset
vocabulary that used the XML schema terms as an RDFS vocabulary (with
prescribed transliteration where necessary). Of course, it is trivial
to convert between the two forms, but it seems best to have both forms
standardized so applications can be developed using whichever one is
most suitable.
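One way to picture the conversion between the two forms is the sketch below. It reads "early-bound" as meaning that the element names themselves serve as the classes and properties, which is my interpretation rather than a defined standard. The info:* predicates and the sample triples are placeholders, not Infoset RDF terms.

```python
# Late-bound description of a small instance (placeholder predicates).
LATE_BOUND = [
    ("_:e1", "info:localName", "Point-of-Contact"),
    ("_:e2", "info:localName", "Name"),
    ("_:e2", "info:parent", "_:e1"),
    ("_:e2", "info:text", "John Smith"),
    ("_:e3", "info:localName", "Address"),
    ("_:e3", "info:parent", "_:e1"),
    ("_:e4", "info:localName", "City"),
    ("_:e4", "info:parent", "_:e3"),
    ("_:e4", "info:text", "Boston"),
]


def to_early_bound(triples):
    """Rewrite late-bound triples so element names become the predicates."""
    names, parents, texts = {}, {}, {}
    for s, p, o in triples:
        if p == "info:localName":
            names[s] = o
        elif p == "info:parent":
            parents[s] = o
        elif p == "info:text":
            texts[s] = o

    out = []
    for node, name in names.items():
        parent = parents.get(node)
        if parent is None:
            # The root element becomes a typed resource.
            out.append((node, "rdf:type", name))
        elif node in texts:
            # A leaf element becomes a property with a literal value.
            out.append((parent, name, texts[node]))
        else:
            # An element with children becomes a property pointing at a node.
            out.append((parent, name, node))
    return out


if __name__ == "__main__":
    for triple in to_early_bound(LATE_BOUND):
        print(triple)
```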
--Paul Tyson
Costello, Roger L. wrote:
>Hi Folks,
>
>I am documenting the different approaches for extending the semantics
>of a community's tag-set. I seek your thoughts on this topic.
>
>Let me start with an example to illustrate what I mean by "extending
>the semantics of a community's tag-set."
>
>EXAMPLE
>
>Community #1 has defined a set of tags for expressing a person's
>contact information. Here's an XML document that shows their XML
>vocabulary:
>
><?xml version="1.0" encoding="UTF-8"?>
><Point-of-Contact>
>    <Name>John Smith</Name>
>    <Address>
>        <Street>10 Tremont St.</Street>
>        <City>Boston</City>
>        <State>MA</State>
>    </Address>
>    <Telephone>617-123-4567</Telephone>
></Point-of-Contact>
>
>Everyone in Community #1 understands the semantics of this collection
>of tags, so within their community they merrily interoperate.
>
>INTEROPERATING WITH OTHER COMMUNITIES
>
>At some point in time, Community #1 recognizes that to grow and thrive
>they must extend beyond their little island of members and must
>interact with other communities. Unfortunately for Community #1, those
>other communities use different tags to represent a person's contact
>information.
>
>Below are 3 approaches that Community #1 may take to bridge the gap
>with the other communities.
>
>1. OUT-OF-BAND SEMANTIC RESOLUTION
>
>The first approach is for Community #1 to leave their XML documents
>intact, as they are, and to bridge the gap by building a translator --
>for example, an XSLT stylesheet that maps Community #1's tag-set to
>Community #2's tag-set (and a translator to Community #3, #4, and so
>forth).
>
>Advantages
>
>a. No impact to the XML documents exchanged within Community #1.
>
>Disadvantages
>
>a. Lots of translators need to be built and maintained ($$).
>
>2. MIMIC THE HTML MODEL FOR EXTENDING SEMANTICS
>
>The HTML specification says that the class attribute may be used for
>"general user agent processing."[1] So, by adding class names to
>elements, authors are able to expand the semantics of the language.
>
>Let's see how Community #1 can exploit this idea of using class
>attributes to extend the semantics of their elements. Suppose that
>Community #1 knows that some other communities use the vcard
>specification[2] for representing a person's contact information.
>Thus, Community #1 extends the semantics of their XML vocabulary as
>follows:
>
><?xml version="1.0" encoding="UTF-8"?>
><Point-of-Contact class="vcard">
>    <Name class="fn">John Smith</Name>
>    <Address class="adr">
>        <Street class="street-address">10 Tremont St.</Street>
>        <City class="locality">Boston</City>
>        <State class="region">MA</State>
>    </Address>
>    <Telephone class="tel">617-123-4567</Telephone>
></Point-of-Contact>
>
>Note that a class attribute has been added to each element, and the
>value of each class is a vcard term.
>
>Now Community #1 can interoperate with any community that understands
>vcards. And, of course, within Community #1 they can simply ignore the
>class attributes, since the semantics of the elements are already
>understood.
>
>The HTML specification also says: "Multiple class names must be
>separated by white space characters." So, the class attributes can be
>used in a polymorphic way to support other communities. For example,
>suppose some other communities use the EDI terminology for representing
>a person's contact information. Community #1 can accommodate those
>communities as well:
>
><?xml version="1.0" encoding="UTF-8"?>
><Point-of-Contact class="vcard POC">
>    <Name class="fn contact-name">John Smith</Name>
>    <Address class="adr location">
>        <Street class="street-address mailing-address">10 Tremont St.</Street>
>        <City class="locality district">Boston</City>
>        <State class="region province">MA</State>
>    </Address>
>    <Telephone class="tel">617-123-4567</Telephone>
></Point-of-Contact>
>
>Note that each class attribute now has two values: a vcard term and an
>EDI term.
>
>Now Community #1 can interoperate with any community that understands
>vcards, as well as any community that understands EDI. And, of course,
>within Community #1 they still ignore the class attributes, since the
>semantics of the elements are already understood.
>
>Additional extensions can be made to the class attributes to support
>other communities.
>
>Advantages
>
>a. The semantic extensions are embedded within the document (in-band);
>no translators needed.
>
>Disadvantages
>
>a. Community #1 must extend their XML vocabulary to support the class
>attribute on each element, and define its semantics similar to how HTML
>defines it.
>
>3. UNIVERSAL XML VOCABULARY
>
>The third approach is for all the communities to get together, throw
>out their existing tag-set, and get everyone to agree to use one,
>standard, universal tag set.
>
>Advantages
>
>a. No interoperability problems.
>
>Disadvantages
>
>a. Difficult to get disparate groups with their own self-interests to
>forego their investments and agree to adopt a single, universal
>tag-set.
>
>QUESTIONS
>
>I. Are there other approaches that aren't captured above?
>
>II. Can you expand upon the advantages and disadvantages of the above
>approaches?
>
>III. Which approach do you prefer? Why?
>
>/Roger
>
>[1] http://www.w3.org/TR/html401/struct/global.html#h-7.5.2
>[2] http://www.imc.org/pdi/vcard-21.txt