Approaches to Expanding the Semantics of a Community's Self-Interested XML Vocabulary

Hi Folks,

I am documenting the different approaches for extending the semantics
of a community's tag-set. I seek your thoughts on this topic.  

Let me start with an example to illustrate what I mean by "extending
the semantics of a community's tag-set."

EXAMPLE

Community #1 has defined a set of tags for expressing a person's
contact information.  Here's an XML document that shows their XML
vocabulary:

<?xml version="1.0" encoding="UTF-8"?>
<Point-of-Contact>
    <Name>John Smith</Name>
    <Address>
        <Street>10 Tremont St.</Street>
        <City>Boston</City>
        <State>MA</State>
    </Address>
    <Telephone>617-123-4567</Telephone>
</Point-of-Contact>

Everyone in Community #1 understands the semantics of this collection
of tags, so within their community they merrily interoperate.

INTEROPERATING WITH OTHER COMMUNITIES

At some point, Community #1 recognizes that to grow and thrive they
must reach beyond their little island of members and interact with
other communities.  Unfortunately for Community #1, those other
communities use different tags to represent a person's contact
information.

Below are 3 approaches that Community #1 may take to bridge the gap
with the other communities.

1. OUT-OF-BAND SEMANTIC RESOLUTION

The first approach is for Community #1 to leave their XML documents
intact and to bridge the gap by building translators -- for example,
an XSLT stylesheet that maps Community #1's tag-set to Community #2's
tag-set (and further translators for Communities #3, #4, and so
forth).
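
Here is a rough sketch of such a translator, written in Python rather
than XSLT so that it can be run as-is.  It only renames tags; the
Community #2 tag names in TAG_MAP are invented for illustration, and a
real mapping may also need to restructure the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Community #1 tag names to Community #2 tag names.
# Community #2's names are invented here for illustration only.
TAG_MAP = {
    "Point-of-Contact": "Contact",
    "Name": "FullName",
    "Address": "Location",
    "Street": "StreetAddress",
    "City": "Town",
    "State": "Region",
    "Telephone": "Phone",
}

def translate(elem, tag_map):
    """Return a copy of elem with every tag renamed via tag_map.

    Tags missing from tag_map are copied through unchanged.
    """
    out = ET.Element(tag_map.get(elem.tag, elem.tag), dict(elem.attrib))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(translate(child, tag_map))
    return out

SOURCE = """<Point-of-Contact>
    <Name>John Smith</Name>
    <Address>
        <Street>10 Tremont St.</Street>
        <City>Boston</City>
        <State>MA</State>
    </Address>
    <Telephone>617-123-4567</Telephone>
</Point-of-Contact>"""

translated = translate(ET.fromstring(SOURCE), TAG_MAP)
print(ET.tostring(translated, encoding="unicode"))
```

Each additional community would need its own TAG_MAP (or its own
stylesheet), which is where the maintenance cost comes from.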

Advantages

a. No impact to the XML documents exchanged within Community #1.

Disadvantages

a. A separate translator must be built and maintained for each other
community ($$).

2. MIMIC THE HTML MODEL FOR EXTENDING SEMANTICS

The HTML specification says that the class attribute may be used "for
general purpose processing by user agents."[1] So, by adding class
names to elements, authors are able to expand the semantics of the
language.

Let's see how Community #1 can exploit this idea of using class
attributes to extend the semantics of their elements.  Suppose that
Community #1 knows that some other communities use the vCard
specification[2] for representing a person's contact information.
Thus, Community #1 extends the semantics of their XML vocabulary as
follows:

<?xml version="1.0" encoding="UTF-8"?>
<Point-of-Contact class="vcard">
    <Name class="fn">John Smith</Name>
    <Address class="adr">
        <Street class="street-address">10 Tremont St.</Street>
        <City class="locality">Boston</City>
        <State class="region">MA</State>
    </Address>
    <Telephone class="tel">617-123-4567</Telephone>
</Point-of-Contact>

Note that a class attribute has been added to each element, and the
value of each class is a vCard term.

Now Community #1 can interoperate with any community that understands
vCards.  And, of course, within Community #1 they can simply ignore the
class attributes, since the semantics of the elements are already
understood.
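
To make this concrete, here is a rough sketch of how a consumer outside
Community #1 might read the document by class name alone, never looking
at the element names.  The helper text_by_class is my own invention for
illustration; it is not part of any specification.

```python
import xml.etree.ElementTree as ET

ANNOTATED = """<Point-of-Contact class="vcard">
    <Name class="fn">John Smith</Name>
    <Address class="adr">
        <Street class="street-address">10 Tremont St.</Street>
        <City class="locality">Boston</City>
        <State class="region">MA</State>
    </Address>
    <Telephone class="tel">617-123-4567</Telephone>
</Point-of-Contact>"""

def text_by_class(root, term):
    """Return the text of the first element whose class attribute contains term.

    Hypothetical helper: element names are never inspected, only class names.
    """
    for elem in root.iter():
        # split() copes with one or several white-space-separated class names
        if term in elem.get("class", "").split():
            return (elem.text or "").strip()
    return None

root = ET.fromstring(ANNOTATED)
card = {term: text_by_class(root, term)
        for term in ("fn", "street-address", "locality", "region", "tel")}
print(card)
```

The consumer needs to know only the class names; Community #1's tag
names could change without breaking it.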

The HTML specification also says: "Multiple class names must be
separated by white space characters." So, the class attributes can be
used in a polymorphic way to support other communities.  For example,
suppose some other communities use the EDI terminology for representing
a person's contact information.  Community #1 can accommodate those
communities as well:

<?xml version="1.0" encoding="UTF-8"?>
<Point-of-Contact class="vcard POC">
    <Name class="fn contact-name">John Smith</Name>
    <Address class="adr location">
        <Street class="street-address mailing-address">10 Tremont St.</Street>
        <City class="locality district">Boston</City>
        <State class="region province">MA</State>
    </Address>
    <Telephone class="tel">617-123-4567</Telephone>
</Point-of-Contact>

Note that most class attributes now have two values: a vCard term and
an EDI term.  (The Telephone element still carries only "tel".)

Now Community #1 can interoperate with any community that understands
vCards, as well as any community that understands EDI.  And, of course,
within Community #1 they still ignore the class attributes, since the
semantics of the elements are already understood.

Additional extensions can be made to the class attributes to support
other communities.
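
As a rough sketch of how such an extension might be applied on the
producing side, the following appends one more class name to each
element without disturbing the names already there.  The function
add_class_terms and the terms in THIRD_VOCAB are invented for
illustration.

```python
import xml.etree.ElementTree as ET

def add_class_terms(root, terms_by_tag):
    """Append one class name per element, keyed by the element's tag name."""
    for elem in root.iter():
        term = terms_by_tag.get(elem.tag)
        if term is None:
            continue                      # no term defined for this element
        names = elem.get("class", "").split()
        if term not in names:             # avoid duplicates on repeated runs
            names.append(term)
        elem.set("class", " ".join(names))
    return root

# Hypothetical third vocabulary; these terms are invented for illustration.
THIRD_VOCAB = {
    "Point-of-Contact": "person",
    "Name": "full-name",
    "Telephone": "phone",
}

doc = ET.fromstring(
    '<Point-of-Contact class="vcard POC">'
    '<Name class="fn contact-name">John Smith</Name>'
    '<Telephone class="tel">617-123-4567</Telephone>'
    '</Point-of-Contact>'
)
add_class_terms(doc, THIRD_VOCAB)
print(ET.tostring(doc, encoding="unicode"))
```

Existing consumers keep working, since the class names they look for
are still present.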

Advantages

a. The semantic extensions are embedded within the document (in-band);
no translators needed.

Disadvantages

a. Community #1 must extend their XML vocabulary to allow the class
attribute on each element, and must define its semantics much as HTML
does.

3. UNIVERSAL XML VOCABULARY

The third approach is for all the communities to get together, throw
out their existing tag-sets, and agree to use one standard, universal
tag-set.

Advantages

a. No interoperability problems.

Disadvantages

a. It is difficult to get disparate groups with their own
self-interests to forgo their investments and agree to adopt a single,
universal tag-set.

QUESTIONS

I. Are there other approaches that aren't captured above?

II. Can you expand upon the advantages and disadvantages of the above
approaches? 

III. Which approach do you prefer? Why?

/Roger

[1] http://www.w3.org/TR/html401/struct/global.html#h-7.5.2
[2] http://www.imc.org/pdi/vcard-21.txt


Copyright 1993-2007 XML.org. This site is hosted by OASIS