RE: [xml-dev] 3 approaches to structure lists, plus an analysis of each approach
- From: "Cox, Bruce" <Bruce.Cox@USPTO.GOV>
- To: "G. Ken Holman" <gkholman@CraneSoftwrights.com>,xml-dev@lists.xml.org
- Date: Thu, 19 Feb 2009 17:02:25 -0500
Thanks for the detailed information, Ken. It looks like the problem has
been thoroughly addressed and all I have to do is understand it. :O)
Bruce B Cox
Manager, Standards Development Division
USPTO/OCIO/SDMG
571-272-9004
-----Original Message-----
From: G. Ken Holman [mailto:gkholman@CraneSoftwrights.com]
Sent: Wednesday, February 18, 2009 8:33 PM
To: xml-dev@lists.xml.org
Subject: RE: [xml-dev] 3 approaches to structure lists, plus an analysis
of each approach
At 2009-02-18 19:47 -0500, Cox, Bruce wrote:
>Ken, does the approach you describe below address version control?
Absolutely. Each version of every list has unique list-level meta
data in the XML genericode file expressing the values of the list.
>There is the unfortunate potential for country codes, for example, to be
>used differently at different times, as geopolitical boundaries change.
Indeed. Each code in every list can have value-level meta data
expressed in the genericode file, helping the reader understand the
semantics represented by each code.
>For a patent publication, we feel we need to know when the country code
>in question was in force, so the version of the list used is important.
It is important to cite *in the XML document* the instance-level meta
data identifying the list-level meta data of the list from which the
value in the instance was obtained. This gives the recipient the
value-level meta data to interpret the intended semantics of the value.
For example, the UN/CEFACT Core Component Technical Specification
(CCTS) 2.01 core component types define a number of facets of
instance-level meta data, called "supplementary components", that are
attributes attached to the element that contains the code as content
or the code as an attribute.
The XML instance author can choose to leave the instance-level meta
data empty, in which case the interpretation of the code is up to the
receiver ... for example, a currency value of "USD" is probably US
dollars. But in your example, omitting instance-level meta data might
make interpretation ambiguous and imprecise. Specifying
instance-level meta data with the country code conveys the
unambiguous list-level meta data of the code list from which the code
was taken. The recipient can then inspect the value-level meta data
associated with the code to comprehend the semantics represented by
the value used.
>Our situation is complicated by the fact that there are Offices issuing
>patent rights that are not associated with a country, but cover a larger
>region, such as the European Patent Office.
Not a problem with the use of context/value association files
declaring that the values of a particular XML information item are
governed by the union of two genericode lists: one for the ISO
country codes (and their semantics for the values), and one for the
patent community's representation of regions (perhaps that's another
list).
>These institutions are
>given two-letter codes in WIPO Standard ST.3, which also incorporates
>ISO codes for all the member states' Offices. Yes, it duplicates ISO
>country codes, but only because the UN does not always recognize the
>changes in political boundaries *at the same time* that the ISO
>standards are updated, so WIPO has to have its own "politically correct"
>list.
Oh, then I suppose you wouldn't accept the ISO semantics for the
values, so you don't need that union. You could simply point to the
single WIPO country code list, citing a particular version of that
list. But this underscores the importance of the instance-level meta
data to tell the recipient how to interpret a particular coded value
... to expand on an example from my book you might have:
<PatentFilingCountry listSchemeURI="urn:x-WIPO:Country Codes:1992"
>CS</PatentFilingCountry>
... representing Czechoslovakia, and
<PatentFilingCountry listSchemeURI="urn:x-WIPO:Country Codes:2007"
>CS</PatentFilingCountry>
... representing Serbia and Montenegro.
Without the instance-level meta data, just having the value "CS"
would be ambiguous.
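To make the ambiguity concrete, here is a minimal Python sketch. It is
not taken from the book or from any genericode tooling: the CODE_LISTS
table and the interpret() helper are hypothetical stand-ins for the
value-level meta data that real genericode files would carry, and the
two entries simply restate the example above.

```python
# Hypothetical in-memory stand-in for value-level meta data that would
# normally be read from genericode files, keyed by the list-level
# identity cited in the instance (the listSchemeURI attribute).
CODE_LISTS = {
    "urn:x-WIPO:Country Codes:1992": {"CS": "Czechoslovakia"},
    "urn:x-WIPO:Country Codes:2007": {"CS": "Serbia and Montenegro"},
}


def interpret(code, list_scheme_uri=None):
    """Return every meaning the code could have for the receiver."""
    if list_scheme_uri is not None:
        # Instance-level meta data names exactly one list version.
        meaning = CODE_LISTS.get(list_scheme_uri, {}).get(code)
        return [meaning] if meaning is not None else []
    # No instance-level meta data: every known list is a candidate.
    return sorted(
        {values[code] for values in CODE_LISTS.values() if code in values}
    )


print(interpret("CS", "urn:x-WIPO:Country Codes:1992"))
print(interpret("CS", "urn:x-WIPO:Country Codes:2007"))
print(interpret("CS"))
```

With the list version cited, the helper returns a single meaning; with
no instance-level meta data it returns both candidates, which is
exactly the ambiguity the receiver would face.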
So you want *both* lists (or as many lists as needed) to apply to the
same element content, and this can be expressed in context/value
association as the union of two (or more) versions of the WIPO
list. With the appropriate genericode files, the free Schematron
implementation of CVA validation on our web site would successfully
validate:
<PatentFilingCountry listSchemeURI="urn:x-WIPO:Country Codes:1992"
>FI</PatentFilingCountry>
... and:
<PatentFilingCountry listSchemeURI="urn:x-WIPO:Country Codes:1996"
>SF</PatentFilingCountry>
... while rejecting:
<PatentFilingCountry listSchemeURI="urn:x-WIPO:Country Codes:1996"
>FI</PatentFilingCountry>
... thus allowing only the single entry for Finland in each of two
lists with two different values based on the list versions.
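The sketch below is not the Schematron implementation mentioned above;
it is a small Python approximation of the same check, offered only to
show the logic. The LIST_VERSIONS table is a hypothetical stand-in for
the two genericode files, and the rule for an element with no
listSchemeURI (accept a code found in any list of the union) is my
assumption about how the union would behave here.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of two versions of the list; real values would
# come from the genericode file for each version.
LIST_VERSIONS = {
    "urn:x-WIPO:Country Codes:1992": {"FI"},
    "urn:x-WIPO:Country Codes:1996": {"SF"},
}


def is_valid(xml_text):
    """Accept the element if its code is in the list version it cites."""
    element = ET.fromstring(xml_text)
    code = (element.text or "").strip()
    cited = element.get("listSchemeURI")
    if cited is None:
        # Assumption: with no instance-level meta data, any list in the
        # union may supply the code.
        return any(code in codes for codes in LIST_VERSIONS.values())
    # A cited list version restricts the check to that version alone.
    return code in LIST_VERSIONS.get(cited, set())


def sample(uri, code):
    """Build a one-element instance like the examples above."""
    return (
        f'<PatentFilingCountry listSchemeURI="{uri}"'
        f">{code}</PatentFilingCountry>"
    )


print(is_valid(sample("urn:x-WIPO:Country Codes:1992", "FI")))
print(is_valid(sample("urn:x-WIPO:Country Codes:1996", "SF")))
print(is_valid(sample("urn:x-WIPO:Country Codes:1996", "FI")))
```

The first two instances pass and the third fails, because "FI" is not
in the list version that the third instance cites.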
There are strategies for omitting instance-level meta data should you
anticipate a value to be added to a list, say, six months from
now: create your instance without instance-level meta data, and
validate with the union of the published list and your custom
extension list with the future value. Later on when the new list is
published, the instance doesn't change but it will validate with the
new list, not unioned with your temporary transition extension that
has since evaporated. But there is a risk that the committee ends up
using a different value and the instance won't validate ... but at
least there was a migration strategy when the future information was
more certain.
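The transition strategy can be sketched in a few lines of Python. All
of the codes and list names below are hypothetical placeholders, not
values from any real list, and validates() is an illustrative helper
rather than part of any CVA tooling.

```python
def validates(code, *code_lists):
    """True if the code appears in the union of the supplied lists."""
    return any(code in codes for codes in code_lists)


# Hypothetical lists; "ZZ" stands for the value expected in six months.
published_now = {"AA", "BB"}
transition_extension = {"ZZ"}
published_later = {"AA", "BB", "ZZ"}  # the committee adopts the value
published_other = {"AA", "BB", "ZY"}  # the committee picks another value

# The instance carries no instance-level meta data, so it is checked
# against whatever union of lists is in force at validation time.
instance_code = "ZZ"

print(validates(instance_code, published_now))
print(validates(instance_code, published_now, transition_extension))
print(validates(instance_code, published_later))
print(validates(instance_code, published_other))
```

The unchanged instance fails against the published list alone, passes
with the temporary extension, still passes once the new list includes
the value, and fails if the committee chose a different value, which
is the risk described above.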
When designing your XML vocabulary for the use of code lists and
identifiers, you have a responsibility to provide for instance-level
meta data. With a few minor exceptions, the UN/CEFACT supplementary
components for codes are expressed in the attributes:
listID=
listAgencyID=
listAgencyName=
listName=
listVersionID=
listURI=
listSchemeURI=
The UN/CEFACT supplementary components for identifiers are expressed
in the attributes:
schemeAgencyID=
schemeAgencyName=
schemeName=
schemeVersionID=
schemeDataURI=
schemeURI=
For those archive readers with a copy of our "Practical Code List
Implementation", this is detailed for UN/CEFACT core component types
on pages 48/49.
I'll be talking more about these concepts at XML Prague 2009
http://www.xmlprague.cz trying to convey the importance to designers
of XML vocabularies.
I hope this helps, Bruce.
. . . . . . . . . Ken
--
Upcoming hands-on XQuery, XSLT, UBL & code list training classes:
Brussels, BE 2009-03; Prague, CZ 2009-03, http://www.xmlprague.cz
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson: http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview: http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
Male Cancer Awareness Nov'07 http://www.CraneSoftwrights.com/x/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal