Fwd: [xml-dev] Data versioning strategy: address semantic, relationship,

On 12/8/07, Costello, Roger L. <costello@mitre.org> wrote:

Hi Folks,

When discussing a "versioning strategy" I often focus on how to design
schemas so as to lessen the impact of changes.  It occurs to me that
this addresses only one aspect of the data versioning problem.  Below I
have attempted to identify other issues that a data versioning strategy
should address.  I am interested in hearing your thoughts on this.

EVOLVING DATA

Suppose some data is regularly exchanged between machines:

Machine 1 --> data --> Machine 2
Machine 1 <-- data <-- Machine 2

Periodically the data changes because of new requirements, additional
insights, or innovation.

A change results in a new "version" of the data.

PROBLEM

What are the categories of changes that may occur?  What categories of
changes must be dealt with by a data versioning strategy?

CATEGORIES OF CHANGE

1. Semantic - the meaning of the data changes.

Example:

version 1 data: a "distance" value means the distance from the center
of town.

version 2 data: a distance value means the distance from the town line.

2. Relationship - the relationships among the data items change.

Example:

version 1 data: there is a co-constraint between the start-time and the
end-time.

version 2 data: there is a three-way co-constraint between start-time,
end-time, and mode-of-transportation.

3. Syntax - the structure of the data changes.

Example:

version 1 data: the employee data is listed first and each person's
name consists of a given-name and a surname.

version 2 data: the department data is listed first and in the employee
data each person's name additionally contains a middle name.
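
As a rough illustration of why a consumer has to know which version it
is reading, here is a minimal Python sketch of the semantic and syntax
examples above.  The element names, the version attribute, the sample
names, and the circular-town conversion are all assumptions made up
for the sketch; none of them come from a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; element and attribute names are assumptions.
V1 = ('<employee version="1"><given-name>Jane</given-name>'
      '<surname>Doe</surname></employee>')
V2 = ('<employee version="2"><given-name>Jane</given-name>'
      '<middle-name>Q</middle-name><surname>Doe</surname></employee>')


def full_name(xml_text):
    """Syntax change: version 2 adds a middle name to the name structure."""
    emp = ET.fromstring(xml_text)
    parts = [emp.findtext("given-name"),
             emp.findtext("middle-name"),   # None for version 1 data
             emp.findtext("surname")]
    return " ".join(p for p in parts if p)


def distance_from_center(value_km, version, town_radius_km):
    """Semantic change: the same number means different things per version.

    Assumes, purely for illustration, a circular town, so a distance
    measured from the town line converts by adding the town radius.
    """
    if version == 1:
        return value_km                    # already measured from the center
    if version == 2:
        return value_km + town_radius_km   # measured from the town line
    raise ValueError(f"unknown version: {version}")
```

Note that the syntax change is visible in the document itself, whereas
the semantic change is not: nothing in the markup of a "distance"
value tells the receiver which meaning applies.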

SUPPORTING TECHNOLOGIES

Suppose the data being exchanged is formatted using the XML syntax.

Machine 1 --> XML --> Machine 2
Machine 1 <-- XML <-- Machine 2

What technologies support the above categories of change?

1. Semantic: A data dictionary may be used to define meaning.

2. Relationship: Schematron may be used to express relationships
between data.

3. Syntax: XML Schema, Relax NG, or DTD may be used to express the
structure of the data.
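
To make the relationship category concrete, here is a minimal Python
sketch of the kind of co-constraint check that a Schematron rule would
express declaratively.  The specific rules are assumptions: the
examples above do not say what the constraints are, so I have supposed
that version 1 requires start-time to precede end-time and that
version 2 additionally limits a "walk" to 120 minutes.  The trip
element and version attribute are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def minutes(text):
    """Convert an HH:MM string to minutes since midnight."""
    t = datetime.strptime(text, "%H:%M")
    return t.hour * 60 + t.minute


def check_trip(xml_text):
    """Return a list of violated co-constraints (empty if the data is valid)."""
    trip = ET.fromstring(xml_text)
    start = minutes(trip.findtext("start-time"))
    end = minutes(trip.findtext("end-time"))
    errors = []

    # Assumed version 1 rule: two-way constraint between start and end.
    if start >= end:
        errors.append("start-time must be earlier than end-time")

    # Assumed version 2 rule: three-way constraint that also involves
    # mode-of-transportation (the 120-minute limit is hypothetical).
    if trip.get("version") == "2":
        mode = trip.findtext("mode-of-transportation")
        if mode == "walk" and end - start > 120:
            errors.append("a walk may not last longer than 120 minutes")
    return errors
```

A grammar-based schema can say that all three elements are present and
well-formed; it is the rule relating their values that changes between
versions here.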

REQUIREMENTS ON A VERSIONING STRATEGY

A versioning strategy must take into consideration:

- changes in the semantics of the data
- changes in the relationships of the data
- changes in the syntax of the data

When the data is in XML format, a versioning strategy must cover:

- versioning a data dictionary
- versioning a Schematron schema
- versioning an XML Schema, Relax NG schema, or DTD
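
One possible way to picture this, as a sketch only: treat a data
version as a triple of artifact versions, one per category, so that a
receiver can tell which kinds of change separate two versions.  The
DataVersion class and its field names are hypothetical, and using
simple integer version numbers is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataVersion:
    """Hypothetical record of the artifact versions behind one data version."""
    dictionary: int   # data dictionary version (semantics)
    schematron: int   # Schematron schema version (relationships)
    schema: int       # XML Schema / Relax NG / DTD version (syntax)

    def changed_categories(self, other):
        """List the categories of change between this version and another."""
        changes = []
        if self.dictionary != other.dictionary:
            changes.append("semantic")
        if self.schematron != other.schematron:
            changes.append("relationship")
        if self.schema != other.schema:
            changes.append("syntax")
        return changes
```

For example, comparing DataVersion(1, 1, 1) with DataVersion(2, 1, 2)
reports a semantic change and a syntax change but no relationship
change.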

QUESTIONS

a. Do you agree with the three categories of change?

b. Do these categories represent all types of change?

c. Do you agree that a versioning strategy must address semantic,
relationship, and syntactic changes?

/Roger
