Re: [xml-dev] 3 approaches to structure lists, plus an analysis of each approach

At 2009-02-14 17:40 -0500, Costello, Roger L. wrote:
>What are the different approaches to structure lists?

A.k.a. "controlled vocabularies"

>What are the pros and cons of each approach? Is there a way to 
>structure lists to maximize their utility and minimize their overhead?

Yes!

>The purpose of this message is to document and analyze several 
>approaches to structure lists. I use "country list" to illustrate 
>the different approaches.
>
>ASSERTION: LISTS THAT CAN BE USED FOR MULTIPLE PURPOSES ARE GOOD

There are three aspects of controlled vocabularies of interest to XML 
documents:

  - list-level meta data - identifies the list as an entity
http://www.balisage.net/Proceedings/html/2008/Holman01/Balisage2008-Holman01.html#codes
  - value-level meta data - augments values with "meaning" by adding 
information
http://www.balisage.net/Proceedings/html/2008/Holman01/Balisage2008-Holman01.html#codesmd
  - instance-level meta data - used in XML documents to associate 
specified controlled vocabulary values with the particular list-level 
meta data from which the value is obtained, thus disambiguating 
values for applications when ambiguous in an element whose values are 
from the union of two lists with overlapping values
http://www.balisage.net/Proceedings/html/2008/Holman01/Balisage2008-Holman01.html#codesilm
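
To make the instance-level case concrete, here is a minimal Python sketch of the disambiguation described in the last item. Everything in it is an assumption for illustration: the element name Region, the second list "example-us-states" and its contents, and the use of an attribute named listID to carry the list identifier. It is not taken from any specification.

```python
import xml.etree.ElementTree as ET

# Two hypothetical lists that both contain the value "AL".
LISTS = {
    "ISO3166-1": {"AF": "AFGHANISTAN", "AL": "ALBANIA"},
    "example-us-states": {"AL": "Alabama", "AK": "Alaska"},
}


def resolve(element_xml):
    """Return the name for the coded value in a single element.

    If the element carries a listID attribute (assumed name), only that
    list is consulted. Otherwise the value is looked up in the union of
    all lists and rejected when more than one list contains it.
    """
    elem = ET.fromstring(element_xml)
    value = (elem.text or "").strip()
    list_id = elem.get("listID")
    if list_id is not None:
        # Instance-level meta data names the list, so there is no ambiguity.
        return LISTS[list_id].get(value)
    matches = [names[value] for names in LISTS.values() if value in names]
    if len(matches) > 1:
        raise ValueError("%r is ambiguous without a list identifier" % value)
    return matches[0] if matches else None


print(resolve('<Region listID="ISO3166-1">AL</Region>'))
print(resolve('<Region listID="example-us-states">AL</Region>'))
```

Without the attribute, "AL" appears in both lists and the sketch refuses to guess, which is the situation instance-level meta data is meant to avoid.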

Here is an excerpt of the OASIS genericode 1.0 file for ISO country codes:

http://docs.oasis-open.org/ubl/os-UBL-2.0-update/cl/gc/default/CountryIdentificationCode-2.0.gc

<?xml version="1.0" encoding="UTF-8"?>
<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
    <Identification>
       <ShortName>CountryIdentificationCode</ShortName>
       <LongName xml:lang="en">Country</LongName>
       <LongName Identifier="listID">ISO3166-1</LongName>
       <Version>0.3</Version>
       <CanonicalUri>urn:oasis:names:specification:ubl:codelist:gc:CountryIdentificationCode</CanonicalUri>
       <CanonicalVersionUri>urn:oasis:names:specification:ubl:codelist:gc:CountryIdentificationCode-2.0-update</CanonicalVersionUri>
       <LocationUri>http://docs.oasis-open.org/ubl/os-UBL-2.0-update/cl/gc/default/CountryIdentificationCode-2.0.gc</LocationUri>
       <Agency>
          <LongName xml:lang="en">United Nations Economic Commission for Europe</LongName>
          <Identifier>6</Identifier>
       </Agency>
    </Identification>
    <ColumnSet>
       <Column Id="code" Use="required">
          <ShortName>Code</ShortName>
          <Data Type="normalizedString"/>
       </Column>
       <Column Id="name" Use="optional">
          <ShortName>Name</ShortName>
          <Data Type="string"/>
       </Column>
       <Column Id="numericcode" Use="optional">
          <ShortName>NumericCode</ShortName>
          <Data Type="string"/>
       </Column>
       <Key Id="codeKey">
          <ShortName>CodeKey</ShortName>
          <ColumnRef Ref="code"/>
       </Key>
    </ColumnSet>
    <SimpleCodeList>
       <Row>
          <Value ColumnRef="code">
             <SimpleValue>AF</SimpleValue>
          </Value>
          <Value ColumnRef="name">
             <SimpleValue>AFGHANISTAN</SimpleValue>
          </Value>
          <Value ColumnRef="numericcode">
             <SimpleValue>004</SimpleValue>
          </Value>
       </Row>
       <Row>
          <Value ColumnRef="code">
             <SimpleValue>AL</SimpleValue>
          </Value>
          <Value ColumnRef="name">
             <SimpleValue>ALBANIA</SimpleValue>
          </Value>
          <Value ColumnRef="numericcode">
             <SimpleValue>008</SimpleValue>
          </Value>
       </Row>
       <Row>
          <Value ColumnRef="code">
             <SimpleValue>DZ</SimpleValue>
          </Value>
          <Value ColumnRef="name">
             <SimpleValue>ALGERIA</SimpleValue>
          </Value>
          <Value ColumnRef="numericcode">
             <SimpleValue>012</SimpleValue>
          </Value>
       </Row>
       ...
    </SimpleCodeList>
</gc:CodeList>
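
As a rough illustration of how an application might consume such a file, the following Python sketch reads the rows of a genericode document into dictionaries keyed by column. It relies only on what the excerpt shows: the gc:CodeList root in the genericode 1.0 namespace and unqualified SimpleCodeList/Row/Value/SimpleValue children. The function name read_rows and the trimmed SAMPLE document are my own, and the sketch ignores the ColumnSet declarations (types, required/optional, keys) that a real implementation would honour.

```python
import xml.etree.ElementTree as ET

GC_NS = "http://docs.oasis-open.org/codelist/ns/genericode/1.0/"

# Trimmed copy of the excerpt: two rows, no Identification or ColumnSet.
SAMPLE = """<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="code"><SimpleValue>AF</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>AFGHANISTAN</SimpleValue></Value>
      <Value ColumnRef="numericcode"><SimpleValue>004</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="code"><SimpleValue>AL</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>ALBANIA</SimpleValue></Value>
      <Value ColumnRef="numericcode"><SimpleValue>008</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</gc:CodeList>"""


def read_rows(gc_xml):
    """Return one dict per Row, mapping each ColumnRef to its SimpleValue."""
    root = ET.fromstring(gc_xml)
    if root.tag != "{%s}CodeList" % GC_NS:
        raise ValueError("not a genericode 1.0 CodeList document")
    rows = []
    # Only the root element is namespace-qualified in the excerpt.
    for row in root.findall("SimpleCodeList/Row"):
        rows.append({
            value.get("ColumnRef"): value.findtext("SimpleValue")
            for value in row.findall("Value")
        })
    return rows


rows = read_rows(SAMPLE)
# The same rows can feed a pick list, a lookup table, or a validator.
names_by_code = {row["code"]: row.get("name") for row in rows}
print(names_by_code)
```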

An OASIS context/value association file (CVA file) is used to 
describe which document contexts are associated with which genericode 
files of controlled vocabularies.  A diagram is here:

http://www.balisage.net/Proceedings/html/2008/Holman01/Balisage2008-Holman01.html#cva

An excerpt of a UBL CVA file associating the country identifier with 
the country code list is:

<?xml version="1.0" encoding="utf-8"?>
<ValueListConstraints
   xmlns="http://docs.oasis-open.org/codelist/ns/ContextValueAssociation/cd2-1.0/"
   xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
   xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2">
    <ValueLists>
       <ValueList xml:id="AccountingCostCode" 
uri="../../cl/gc/default/AccountingCostCode-2.0.gc"/>
       ...
       <ValueList xml:id="CountryIdentificationCode" 
uri="../../cl/gc/default/CountryIdentificationCode-2.0.gc"/>
       ...
       <ValueList xml:id="UNDGCode" uri="../../cl/gc/default/UNDGCode-2.0.gc"/>
    </ValueLists>
    <Contexts>
       <Context item="cbc:AccountingCostCode" values="AccountingCostCode"/>
       ...
       <Context item="cbc:IdentificationCode" 
values="CountryIdentificationCode"/>
      ...
       <Context item="cbc:UNDGCode" values="UNDGCode"/>
    </Contexts>
</ValueListConstraints>
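
The association can be sketched in Python as follows. This is not the Schematron-based implementation, only an illustration under stated assumptions: the namespace and the item, values, uri and xml:id attributes come from the excerpt, while the function names, the in-memory codes_by_list table (standing in for the genericode files behind each uri), the plain string match on item, and the treatment of values as a whitespace-separated list of identifiers are mine.

```python
import xml.etree.ElementTree as ET

CVA_NS = "http://docs.oasis-open.org/codelist/ns/ContextValueAssociation/cd2-1.0/"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Trimmed copy of the excerpt: one value list, one context.
SAMPLE_CVA = """<ValueListConstraints
   xmlns="http://docs.oasis-open.org/codelist/ns/ContextValueAssociation/cd2-1.0/"
   xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <ValueLists>
    <ValueList xml:id="CountryIdentificationCode"
               uri="../../cl/gc/default/CountryIdentificationCode-2.0.gc"/>
  </ValueLists>
  <Contexts>
    <Context item="cbc:IdentificationCode" values="CountryIdentificationCode"/>
  </Contexts>
</ValueListConstraints>"""


def read_cva(cva_xml):
    """Return (list id -> uri, context item -> list ids) from a CVA document."""
    ns = {"cva": CVA_NS}
    root = ET.fromstring(cva_xml)
    list_uris = {
        vl.get(XML_ID): vl.get("uri")
        for vl in root.findall("cva:ValueLists/cva:ValueList", ns)
    }
    contexts = {
        ctx.get("item"): ctx.get("values").split()
        for ctx in root.findall("cva:Contexts/cva:Context", ns)
    }
    return list_uris, contexts


def is_allowed(item, value, contexts, codes_by_list):
    """True if value is in any list associated with item, or item is unconstrained."""
    list_ids = contexts.get(item)
    if list_ids is None:
        return True
    return any(value in codes_by_list.get(list_id, ()) for list_id in list_ids)


list_uris, contexts = read_cva(SAMPLE_CVA)
# Hypothetical stand-in for the codes read from the referenced .gc file.
codes_by_list = {"CountryIdentificationCode": {"AF", "AL", "DZ"}}
print(is_allowed("cbc:IdentificationCode", "AL", contexts, codes_by_list))
print(is_allowed("cbc:IdentificationCode", "XX", contexts, codes_by_list))
```

Keeping the context-to-list mapping outside the code lists themselves is what lets the same country list be reused unchanged in other document contexts.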

>Lists should be structured in a way that they can be used for 
>multiple purposes. For example, a country list may be:
>
>     - used as values in an XForms pick list.
>
>     - transformed into a document that contains, for each country,
>       sales figures (or death rates, births, political leadership,
>       religions, etc).
>
>     - used to validate an element's content, e.g. The value of the
>       <country-visited> element must be a country.
>
>Those are only a few of the myriad uses of a country list. A 
>well-designed country list should support all of them.

Specifications and implementations to do this are found here:

   genericode 1.0 - lists of codes with list-level and code-level meta data

      http://docs.oasis-open.org/codelist/genericode

   context/value association using genericode 0.5 draft 1
                  - contextual code list usage and instance-level meta data

      http://www.oasis-open.org/committees/document.php?document_id=29990

   There is a Schematron-based implementation of validation using CVA files
   available from Crane's web site and being donated to the Schematron
   project:

      http://www.CraneSoftwrights.com/resources/ubl/index.htm#cva2sch

As cited by Mike, the code list committee is here:

   http://www.oasis-open.org/committees/codelist

I hope this helps.

. . . . . . . . . . . Ken

--
Upcoming hands-on XSLT, UBL & code list training classes:
Brussels, BE 2009-03;  Prague, CZ 2009-03, http://www.xmlprague.cz
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson:    http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview:  http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/x/
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/x/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal


