Data versioning strategy: address semantic, relationship, and syntactic changes?
- From: "Costello, Roger L." <costello@mitre.org>
- To: <xml-dev@lists.xml.org>
- Date: Fri, 7 Dec 2007 15:55:15 -0500
Hi Folks,
When discussing a "versioning strategy" I usually focus on how to
design schemas so as to lessen the impact of changes. It occurs to me
that this addresses only one aspect of the data versioning problem.
Below I have attempted to identify the other issues that a data
versioning strategy should address. I am interested in hearing your
thoughts on this.
EVOLVING DATA
Suppose some data is regularly exchanged between machines:
Machine 1 --> data --> Machine 2
Machine 1 <-- data <-- Machine 2
Periodically the data changes because of new requirements, additional
insights, or innovation.
A change results in a new "version" of the data.
PROBLEM
What are the categories of changes that may occur? What categories of
changes must be dealt with by a data versioning strategy?
CATEGORIES OF CHANGE
1. Semantic - the meaning of the data changes.
Example:
version 1 data: a "distance" value means the distance from the center
of town.
version 2 data: a "distance" value means the distance from the town line.
2. Relationship - the relationships between data items change.
Example:
version 1 data: there is a co-constraint between the start-time and the
end-time.
version 2 data: there is a three-way co-constraint between start-time,
end-time, and mode-of-transportation.
3. Syntax - the structure of the data changes.
Example:
version 1 data: the employee data is listed first and each person's
name consists of a given-name and a surname.
version 2 data: the department data is listed first and, in the
employee data, each person's name additionally contains a middle name.
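To make the distinction concrete, here is a minimal Python sketch of
the three categories. It is only an illustration: the record layout,
the concrete rules (start-time before end-time; walking trips lasting
at most 2 hours), and all function names are assumptions of mine, not
part of any real exchange format.

```python
# Hypothetical record; times are whole hours to keep the sketch short.
record = {
    "start-time": 9,
    "end-time": 13,
    "distance": 5,
    "mode-of-transportation": "walk",
}

# Syntax: which fields must be present in each version (assumed sets).
V1_REQUIRED = {"start-time", "end-time", "distance"}
V2_REQUIRED = V1_REQUIRED | {"mode-of-transportation"}

def syntax_ok(rec, required):
    """Syntactic check: every required field is present."""
    return required.issubset(rec)

def relationship_ok_v1(rec):
    """Assumed v1 co-constraint: start-time precedes end-time."""
    return rec["start-time"] < rec["end-time"]

def relationship_ok_v2(rec):
    """Assumed v2 three-way co-constraint: the v1 rule holds, and a
    walking trip lasts at most 2 hours."""
    if not relationship_ok_v1(rec):
        return False
    duration = rec["end-time"] - rec["start-time"]
    if rec["mode-of-transportation"] == "walk":
        return duration <= 2
    return True

# Semantics: the value is unchanged, only its documented meaning differs.
DISTANCE_MEANING = {
    1: "distance from the center of town",
    2: "distance from the town line",
}
```

With this record, both syntax checks pass and the v1 relationship
check passes, but the v2 relationship check fails (a 4-hour walk). The
"distance" value of 5 is accepted by every check in both versions even
though its meaning changed, which is why a semantic change cannot be
caught by the other two kinds of checking.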
SUPPORTING TECHNOLOGIES
Suppose the data being exchanged is formatted using the XML syntax.
Machine 1 --> XML --> Machine 2
Machine 1 <-- XML <-- Machine 2
What technologies support the above categories of change?
1. Semantic: A data dictionary may be used to define meaning.
2. Relationship: Schematron may be used to express relationships
between data.
3. Syntax: XML Schema, Relax NG, or DTD may be used to express the
structure of the data.
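The split between items 2 and 3 can be sketched in Python using only
the standard library. The two functions below are hand-written
stand-ins, not real XML Schema or Schematron processing; the <trip>
document and the rule "start-time is earlier than end-time" are
assumed for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: well-structured, but the times are reversed.
DOC = """<trip>
  <start-time>09:00</start-time>
  <end-time>08:30</end-time>
</trip>"""

def structure_errors(root):
    """Stand-in for a grammar check (XML Schema, Relax NG, DTD):
    <trip> must contain start-time followed by end-time."""
    expected = ["start-time", "end-time"]
    found = [child.tag for child in root]
    if found == expected:
        return []
    return [f"expected children {expected}, found {found}"]

def relationship_errors(root):
    """Stand-in for a Schematron-style assertion (assumed rule:
    start-time is earlier than end-time)."""
    start = root.findtext("start-time")
    end = root.findtext("end-time")
    # Zero-padded HH:MM strings sort correctly as plain text.
    if start < end:
        return []
    return ["start-time must be earlier than end-time"]

root = ET.fromstring(DOC)
print(structure_errors(root))
print(relationship_errors(root))
```

The document passes the structural check and fails the relationship
check, so the two kinds of rule can change (and be versioned)
independently of each other.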
REQUIREMENTS ON A VERSIONING STRATEGY
A versioning strategy must take into consideration:
- changes in the semantics of the data
- changes in the relationships of the data
- changes in the syntax of the data
When the data is in an XML format, a versioning strategy must
implement:
- versioning a data dictionary
- versioning a Schematron schema
- versioning an XML Schema, Relax NG schema, or DTD
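One possible way to picture this, sketched in Python: treat a "data
version" as a bundle of three independently versioned artifacts and
compare bundles to see which categories changed. The class, field
names, and version strings are hypothetical; this is one reading of
the requirement, not an established practice.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataVersion:
    """Hypothetical bundle of the three versioned artifacts."""
    dictionary: str  # data dictionary version (semantics)
    rules: str       # Schematron schema version (relationships)
    grammar: str     # XML Schema, Relax NG schema, or DTD version (syntax)

def changed_categories(old, new):
    """Return the categories of change between two data versions."""
    changes = []
    if old.dictionary != new.dictionary:
        changes.append("semantic")
    if old.rules != new.rules:
        changes.append("relationship")
    if old.grammar != new.grammar:
        changes.append("syntax")
    return changes

v1 = DataVersion(dictionary="1.0", rules="1.0", grammar="1.0")
v2 = DataVersion(dictionary="2.0", rules="1.0", grammar="1.0")
print(changed_categories(v1, v2))
```

Here only the data dictionary changed, so a receiver can tell that its
structural and relationship validation still applies while its
interpretation of the values may not.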
QUESTIONS
a. Do you agree with the three categories of change?
b. Do these categories represent all types of change?
c. Do you agree that a versioning strategy must address semantic,
relationship, and syntactic changes?
/Roger