Re: [xml-dev] Caution using XML Schema backward- or forward-compatibility as a versioning strategy for data exchange

Roger,

I think in an earlier thread David Orchard contributed some comments.
I can't remember whether he included EXTENSIBILITY as a [special] type
of versioning, in particular the ideas around who 'owns' the
vocabulary and who can make changes. One of the things that I find
very difficult is where there is only central ownership and no ability
for distributed extensibility. Central ownership has very many plus
points, but at least one significant negative, namely the speed of
change. This matters in particular for information items that are [at
least initially] part of a private relationship between two (or more)
trading partners, where the vast majority of the exchange is fulfilled
by a 'standard' schema. IMO extensibility (for the vocabulary user) is
essential: it helps to reduce versioning 'churn' and, more importantly,
ensures that the core vocabulary does not constrain the business
operating model of those who want to use it.
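
As a rough illustration of distributed extensibility, here is a minimal
sketch in which a consumer reads the centrally owned elements and sets
aside anything a trading partner has added in its own namespace. Both
namespace URIs, the <order> wrapper and the <deliveryWindow> element are
invented for the example and do not come from any real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: neither URI comes from the thread.
CORE_NS = "urn:example:core-vocabulary"
PARTNER_NS = "urn:example:partner-extension"

DOCUMENT = f"""
<order xmlns="{CORE_NS}" xmlns:p="{PARTNER_NS}">
  <distance>100</distance>
  <p:deliveryWindow>morning</p:deliveryWindow>
</order>
"""


def qname_parts(tag):
    """Split ElementTree's "{namespace}local" spelling into its two parts."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def split_core_and_extensions(xml_text):
    """Return (core, extensions) dicts keyed by local element name."""
    root = ET.fromstring(xml_text)
    core, extensions = {}, {}
    for child in root:
        namespace, local = qname_parts(child.tag)
        # Core elements are processed; everything else is kept but not interpreted.
        target = core if namespace == CORE_NS else extensions
        target[local] = child.text
    return core, extensions


core, extensions = split_core_and_extensions(DOCUMENT)
print("core:", core)
print("extensions:", extensions)
```

The partners can agree on the extension privately while the core
vocabulary, and any consumer that only knows the core, stays unchanged.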

Fraser.

On 27/12/2007, Costello, Roger L. <costello@mitre.org> wrote:
> Excellent discussion!
>
> Michael has brought into the discussion a very useful idea: semantic
> drift.  He asserts that it "happens naturally in the real world".
>
> I assert that it also occurs naturally and often in data versioning.
>
> Here are two examples of semantic drift in data versioning:
>
> EXAMPLE #1
>
> Consider this simple XML document:
>
>    <distance>100</distance>
>
> In the v1 XML Schema the <distance> element is declared as follows:
>
>    <element name="distance" type="nonNegativeInteger"/>
>
> The data specification document defines distance as:
>
>    "Distance represents the length measurement from center of town."
>
> In the v2 XML Schema there is no change to the declaration of the
> <distance> element:
>
>    <element name="distance" type="nonNegativeInteger"/>
>
> However, the data specification document redefines distance:
>
>    "Distance represents the length measurement from the town line."
>
> Thus we have an example of two versions that are "validation-compatible"
> but "semantic incompatible."
>
> The semantics of "distance" has drifted from v1 to v2.
>
> EXAMPLE #2
>
> Consider the same simple XML document:
>
>    <distance>100</distance>
>
> In the v1 XML Schema it is declared differently:
>
>    <element name="distance">
>        <complexType>
>            <simpleContent>
>                <extension base="nonNegativeInteger">
>                    <attribute name="units" fixed="miles"/>
>                </extension>
>            </simpleContent>
>        </complexType>
>    </element>
>
> The <distance> element now has a "units" attribute which is fixed at
> "miles."
>
> The data specification document defines distance as:
>
>    "Distance represents the length measurement from center of town."
>
> In the v2 XML Schema the declaration of the <distance> element is
> modified; the units attribute is fixed at "kilometers":
>
>    <element name="distance">
>        <complexType>
>            <simpleContent>
>                <extension base="nonNegativeInteger">
>                    <attribute name="units" fixed="kilometers"/>
>                </extension>
>            </simpleContent>
>        </complexType>
>    </element>
>
> The data specification document is unchanged in its definition of
> distance:
>
>    "Distance represents the length measurement from center of town."
>
> Thus, we see a second example of two versions that are
> "validation-compatible" but "semantic incompatible."
>
> The semantics of "distance" has drifted from v1 to v2.
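
A minimal sketch of what EXAMPLE #2 means for a consuming application.
The instance text is identical under both versions, so the unit has to
come from knowing which schema version applies. A non-validating parser
such as Python's ElementTree does not supply the fixed "units" value
from the schema, so the mapping is written out by hand here. The names
UNITS_BY_SCHEMA_VERSION and distance_in_meters are illustrative only.

```python
import xml.etree.ElementTree as ET

INSTANCE = "<distance>100</distance>"

# Units implied by each schema version's fixed "units" attribute.
UNITS_BY_SCHEMA_VERSION = {"v1": "miles", "v2": "kilometers"}
METERS_PER_UNIT = {"miles": 1609.344, "kilometers": 1000.0}


def distance_in_meters(xml_text, schema_version):
    """Interpret the same instance according to the given schema version."""
    value = int(ET.fromstring(xml_text).text)
    units = UNITS_BY_SCHEMA_VERSION[schema_version]
    return value * METERS_PER_UNIT[units]


v1_reading = distance_in_meters(INSTANCE, "v1")
v2_reading = distance_in_meters(INSTANCE, "v2")

# Same document, two different physical distances.
print("v1 (miles):     ", v1_reading, "m")
print("v2 (kilometers):", v2_reading, "m")
```

An application that keeps the v1 assumption after its partners move to
v2 will accept every document and still compute the wrong distance.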
>
> COMMENTS
>
> 1. I think that these examples illustrate two common changes in data.
> Do you agree?
>
> 2. In the examples, the XML instance document:
>
>    <distance>100</distance>
>
> validates fine against both the v1 and v2 XML Schemas.  But if the
> applications that process the XML instance aren't changed, then the
> processing results may be incorrect.
>
> CAUTION
>
> Just because an application can validate an XML instance document
> doesn't mean it can process that document correctly.
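
One possible mitigation, sketched under a loud assumption: the instance
carries an explicit marker of the data specification it was written
against, and the application refuses anything it was not built for. The
"specVersion" attribute is hypothetical. It is not part of the schemas
in this thread, which would have to be changed to allow it.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the specification versions this application was written for.
SUPPORTED_SPEC_VERSIONS = {"1"}


class UnsupportedSemanticsError(Exception):
    """The document may be valid, but its meaning is not known to this code."""


def read_distance(xml_text):
    root = ET.fromstring(xml_text)
    # "specVersion" is an assumed attribute, not one defined in the thread.
    declared = root.get("specVersion")
    if declared not in SUPPORTED_SPEC_VERSIONS:
        raise UnsupportedSemanticsError(
            f"cannot interpret specVersion={declared!r}"
        )
    return int(root.text)


print(read_distance('<distance specVersion="1">100</distance>'))

try:
    read_distance('<distance specVersion="2">100</distance>')
except UnsupportedSemanticsError as error:
    print("rejected:", error)
```

This only helps when the specification owner bumps the marker on every
semantic change, including changes that leave the schema untouched.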
>
> QUESTION
>
> Can you state in one sentence the fundamental lesson to be learned in
> our discussion?
>
> /Roger
>
> -----Original Message-----
> From: Michael Kay [mailto:mike@saxonica.com]
> Sent: Thursday, December 27, 2007 6:13 AM
> To: 'Stephen Green'; Costello, Roger L.; xml-dev@lists.xml.org
> Subject: RE: [xml-dev] Caution using XML Schema backward- or
> forward-compatibility as a versioning strategy for data exchange
>
> > e.g. because an element wasn't made optional it
> > cannot be removed and so there is a temptation to change its
> > semantics - to reuse it for something else rather than remove
> > it.
>
> Yes, "semantic drift" is a big problem and of course it happens even in
> the absence of schema change.
>
> Semantic drift happens naturally in the real world, for example credit
> card numbers which once identified an account might start to identify a
> specific card with access to that account. It's not surprising that it
> happens, because if a system is capable of meeting new requirements
> without requiring any software changes then people will use it
> creatively in new ways to meet those requirements. One of the challenges
> in designing schemas (or database integrity constraints) is knowing
> whether you should try to resist semantic drift as a menace to
> information integrity, or whether you should allow your system to ride
> the waves, thus increasing its flexibility and longevity.
>
> System designers often underestimate the creativity of users in
> applying semantic overloading to data structures. I saw one system where
> users were marking certain records for review the following day, simply
> by entering a particular code that was known to be invalid and would
> therefore appear in tomorrow's validation report. The system designers
> helpfully introduced stronger validation at data-entry time, and chaos
> ensued because the users had to invent a new process.
>
> Michael Kay
> http://www.saxonica.com/
>
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>
>



Copyright 1993-2007 XML.org. This site is hosted by OASIS