XML.org: OASIS Mailing List Archives

RE: [xml-dev] Data versioning strategy: address semantic, relationship, and syntactic changes?

Hi Folks,

Once again, many thanks for your excellent comments.  

Below I examine a tightly focused scenario (a common one, I think).  I
seek your thoughts on it.

SCENARIO

A web service is deployed.  It has a URL.  When the URL is dereferenced
(by a client), the web service returns some data.  

The data has well-defined: 

- syntax, expressed by a grammar-based language (XML Schema, RELAX NG,
or DTD)

- relationships, expressed by a Schematron schema

- semantics, expressed by a data dictionary, ontology, English prose,
or some combination thereof. 
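To make the first two layers concrete, here is a minimal sketch in Python. It assumes a hypothetical `<order>` document with `<start>` and `<end>` dates; none of these names come from a real schema. The syntax check stands in for what a grammar would enforce, and the relationship check stands in for a Schematron-style assertion. The semantics layer is not shown, since a data dictionary or prose definition is not something this kind of code can check.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample payload; the element names are assumptions.
SAMPLE = "<order><start>2007-01-01</start><end>2007-02-01</end></order>"

# Syntax layer: the exact child sequence a grammar might require.
ALLOWED_CHILDREN = {"order": ["start", "end"]}

def check_syntax(root):
    """Return True if the children are exactly the expected sequence."""
    return [child.tag for child in root] == ALLOWED_CHILDREN.get(root.tag)

def check_relationships(root):
    """Co-constraint in the style of a Schematron assertion: start <= end."""
    # ISO 8601 dates in the same format compare correctly as strings.
    return root.findtext("start") <= root.findtext("end")

root = ET.fromstring(SAMPLE)
syntax_ok = check_syntax(root)
relationships_ok = check_relationships(root)
print(syntax_ok, relationships_ok)
```

A document can pass the syntax check and still fail the relationship check (for example, an end date before the start date), which is why the two layers are listed separately.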

Over time the data changes.  That is, the syntax, relationships, and
semantics change.

There are N clients.  A client dereferences the URL to retrieve the
data. Each client is a "sink" of the data.  That is, a client consumes
the data; it does not forward it to another party.

PROBLEM

Identify ways for the web service to version the data. 

VERSIONING STRATEGIES

Below are two versioning strategies.  Others are possible, but I think
that these will be good to start with.

DATA VERSIONING STRATEGY #1 - ONE ACTIVE VERSION OF THE DATA

The web service supports only one version of the data at a time.  The
web service has only one URL.  When the web service updates to a new
version, it discontinues support for the old version.  Clients are
required to update in lockstep. Changes to the data are based on
changing data requirements.  
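A rough sketch of what lockstep means in practice, in Python. The `/data` path, the `<name>` to `<fullName>` rename, and both function names are hypothetical and chosen only for illustration. The service has a single URL and renders only the current version; a client still written against the previous version stops working as soon as the service switches.

```python
import xml.etree.ElementTree as ET

def render_current(record):
    """Render the one supported version (hypothetically, version 2)."""
    # Assumed incompatible change: version 1 used <name>, version 2 uses <fullName>.
    return "<person><fullName>%s</fullName></person>" % record["name"]

def handle_request(path):
    """Serve the single active version at the single URL."""
    if path == "/data":
        return 200, render_current({"name": "Ada"})
    return 404, ""

def client_v1(body):
    """A client written against version 1, which expected <name>."""
    name = ET.fromstring(body).findtext("name")
    if name is None:
        raise ValueError("unexpected data version")
    return name

status, body = handle_request("/data")
try:
    client_v1(body)
    client_still_works = True
except ValueError:
    client_still_works = False
print(status, client_still_works)
```

The service side stays simple (one renderer, one route), and the whole cost of the change lands on every client at the same moment.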

Advantages

1. The data is unconstrained in how it evolves.  That is, it is not
constrained to only backward- or forward-compatible changes. 

This is good, as it allows the data to evolve based on application
requirements and not on technology limitations.

2. The machinery behind the web service - application code and database
- has to handle only one version.

This is good: there is no redundancy, which may minimize cost.

3. The web service's business process is simply: "Here's the data."
There is no need for creating extensible schemas and no need to tell
clients "Accept unknown elements." 

For my clients, extensible schemas are a security risk, so having
schemas that specify exactly what is permitted is good.
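The difference between a schema that specifies exactly what is permitted and an extensible one can be sketched as two validators. This is an illustration in Python, not a substitute for a real schema language; the element names and the `<script>` payload are made up.

```python
import xml.etree.ElementTree as ET

KNOWN = {"start", "end"}  # hypothetical element names

def validate_closed(root):
    """Closed content model: reject any element that is not listed."""
    return all(child.tag in KNOWN for child in root)

def validate_open(root):
    """Extensible model: require the known elements, ignore anything else."""
    present = {child.tag for child in root}
    return KNOWN <= present

doc = ET.fromstring(
    "<order><start>a</start><end>b</end><script>x</script></order>"
)
closed_result = validate_closed(doc)
open_result = validate_open(doc)
print(closed_result, open_result)
```

The extensible validator lets the unexpected `<script>` element through, which is the kind of exposure the security concern refers to. The closed validator rejects it, at the price of also rejecting any legitimate future addition.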

Disadvantages

1. Clients may not be able to migrate at the same pace as the web
service.  Dawdling clients will be locked out of using the web service.

2. Each new version may entail large and costly changes for client
applications.  

If the data were instead changed under technology constraints, using
backward- or forward-compatible design approaches, rather than purely
in response to changing data requirements (which may not yield
backward- or forward-compatible changes), changes to client
applications might be more incremental and less costly. (Note: this is
conjecture.  Evidence is needed. What are your thoughts on this?)

3. Many little changes would be overwhelming to the clients; so, due to
pressure from clients, the web service will likely be forced to make
large, infrequent changes.  This may be bad for clients who need "that
little piece of data ASAP."


DATA VERSIONING STRATEGY #2 - MULTIPLE ACTIVE VERSIONS OF THE DATA

Every time a new version of the data is created, a new URL is created.
Old versions are maintained.  Clients use whatever version they desire.
Changes to the data are based on changing data requirements.
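One way to picture this strategy is a routing table with one entry per version. The sketch below is in Python; the `/data/v1` and `/data/v2` paths, the record, and the renderers are all hypothetical. Each version keeps its own URL and its own renderer, and the two versions are free to differ incompatibly.

```python
RECORD = {"name": "Ada Lovelace"}  # hypothetical stored data

def render_v1(record):
    """Version 1: a single <name> element."""
    return "<person><name>%s</name></person>" % record["name"]

def render_v2(record):
    """Version 2: an incompatible restructuring into <first> and <last>."""
    first, last = record["name"].split(" ", 1)
    return "<person><first>%s</first><last>%s</last></person>" % (first, last)

# One URL per version; old entries stay in place when a new one is added.
ROUTES = {"/data/v1": render_v1, "/data/v2": render_v2}

def handle_request(path):
    """Return (status, body) for the requested version URL."""
    renderer = ROUTES.get(path)
    if renderer is None:
        return 404, ""
    return 200, renderer(RECORD)

print(handle_request("/data/v1"))
print(handle_request("/data/v2"))
```

Every new version adds another renderer that has to be kept working against the same underlying data, so the table only ever grows.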

Advantages

1. Clients can upgrade to a new version at their leisure. No lockstep
migration.

This is good, as it allows clients to upgrade when they have the
necessary resources and the need.   

2. The data is unconstrained in how it evolves.  That is, it is not
constrained to only backward- or forward-compatible changes. 

This is good, as it allows the data to evolve based on application
requirements and not on technology limitations.

3. The web service and clients are decoupled; each can evolve at its
own pace.

This is good, as the web service's advances are not hindered by
dawdling clients.

4. The web service's business process is simply: "Here's the data."
There is no need for creating extensible schemas and no need to tell
clients "Accept unknown elements." 

For my clients, extensible schemas are a security risk, so having
schemas that specify exactly what is permitted is good.

5. Rapid changes to the data can be made without impacting clients.

This is good because oftentimes a small additional piece of data is
needed, and it's needed ASAP.

Disadvantages

1. The machinery behind the web service - application code and database
- must handle multiple concurrent versions.

This is bad, as there may be redundancy and extra cost.

2. Each new version may entail large and costly changes for client
applications.  

If the data were instead changed under technology constraints, using
backward- or forward-compatible design approaches, rather than purely
in response to changing data requirements (which may not yield
backward- or forward-compatible changes), changes to client
applications might be more incremental and less costly. (Note: this is
conjecture.  Evidence is needed. What are your thoughts on this?)
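The conjecture depends on what counts as a compatible change, and that in turn depends on whether clients accept unknown elements. The Python sketch below uses a deliberately simplified definition, which is an assumption made for illustration and not an established rule: a change is "compatible" for existing clients if no old element disappears and, for clients with closed schemas, no new element appears.

```python
def classify_change(old_elements, new_elements, clients_ignore_unknown):
    """Classify a version change from the point of view of existing clients."""
    removed = set(old_elements) - set(new_elements)
    added = set(new_elements) - set(old_elements)
    if removed:
        # Old clients lose something they rely on.
        return "breaking"
    if added and not clients_ignore_unknown:
        # Closed-schema clients reject elements they do not know.
        return "breaking"
    return "compatible"

additive_open = classify_change({"name"}, {"name", "email"}, True)
additive_closed = classify_change({"name"}, {"name", "email"}, False)
rename = classify_change({"name"}, {"fullName"}, True)
print(additive_open, additive_closed, rename)
```

Under this definition, clients that require closed schemas treat even a purely additive change as breaking, so the incremental path is available only to clients willing to accept unknown elements.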

QUESTIONS

1. Can you think of other advantages and disadvantages of these two
versioning strategies?

2. The above discussion pits:

- changing data due purely to changes in data requirements

versus

- changing data based on a desire to keep the data backward- or
forward-compatible

"Evolving data due to application requirements versus technology
limitations."

Is it fair to say that these represent competing desires?

/Roger



Copyright 1993-2007 XML.org. This site is hosted by OASIS