- To: "Ian Graham" <firstname.lastname@example.org>,<email@example.com>
- Subject: RE: [xml-dev] XML document/message versioning -- possible model?
- From: "Dare Obasanjo" <firstname.lastname@example.org>
- Date: Sun, 29 Sep 2002 22:48:18 -0700
- Thread-index: AcJn+wnZT9IxN59yRZCnpVL4r0TBkgARFFQc
- Thread-topic: [xml-dev] XML document/message versioning -- possible model?
Sorry to burst the bubble but versioning in XML applications is neither well understood nor a solved problem. My opinion on this is in the TAG archive in a recent thread on the topic.
Your question seems to conflate validating the document with performing whatever processing you need to do on it. If you use xsi:schemaLocation then there isn't a problem for validation, since the XML instance tells you where its schema is located. As for whether to use schema locations as a versioning mechanism to tell if you can process the namespace, I'd suggest using version numbers instead. Version numbers are very useful because you can perform comparisons to tell if the schema revision is one your application doesn't know about.
For example, application A understands how to process documents with elements from the "http://www.example.org" namespace in revisions 1.0, 1.1, 2.0 etc. up to version 3.0 of the schema. If elements from that namespace from revision 2.1 or revision 4.7 of the schema show up there is an easy way to know if they can be handled and the code is a lot simpler than switching on URI names.
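A minimal sketch of that comparison, in Python. It assumes the revision arrives as a dotted string alongside the namespace (where it is carried in the message is not specified here), and the names MAX_SUPPORTED, parse_version and can_process are made up for illustration:

```python
# Hypothetical sketch: the table, the function names and the "newest
# supported revision" policy are assumptions, not an existing API.
EXAMPLE_NS = "http://www.example.org"

# Newest schema revision this application knows how to process, per namespace.
MAX_SUPPORTED = {EXAMPLE_NS: (3, 0)}


def parse_version(text):
    """Turn a dotted version string such as "2.1" into the tuple (2, 1)."""
    return tuple(int(part) for part in text.split("."))


def can_process(namespace, version_text):
    """Return True if the revision is no newer than the newest one we support."""
    limit = MAX_SUPPORTED.get(namespace)
    if limit is None:
        return False  # namespace we have never heard of
    # Tuples compare element by element, so (2, 1) <= (3, 0) holds
    # while (4, 7) <= (3, 0) does not.
    return parse_version(version_text) <= limit
```

With this, can_process(EXAMPLE_NS, "2.1") accepts the document and can_process(EXAMPLE_NS, "4.7") rejects it, with no per-URI switch statement.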
Of course, you could structure your schema locations in a manner that allows such comparisons (e.g. "http://www.example.org/2002/09/29/schema.xsd") and you'd get similar functionality, although you would be overloading the meaning of the xsi:schemaLocation attribute, which is unwise and should be very well documented in your business logic and to all partners involved.
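For illustration only, here is roughly what comparing date-structured schema locations could look like in Python. The /YYYY/MM/DD/ path layout follows the example URL above; the NEWEST_KNOWN cutoff and the function names are assumptions:

```python
import re
from datetime import date

# Matches a /YYYY/MM/DD/ segment such as the one in
# http://www.example.org/2002/09/29/schema.xsd
DATE_IN_PATH = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")

# Assumption: the newest schema revision this application was built against.
NEWEST_KNOWN = date(2002, 9, 29)


def schema_date(location):
    """Return the date embedded in a schema location, or None if there is none."""
    match = DATE_IN_PATH.search(location)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)


def is_known_revision(location):
    """True if the location carries a date no later than NEWEST_KNOWN."""
    found = schema_date(location)
    return found is not None and found <= NEWEST_KNOWN
```

This only works if every partner sticks to the same URL layout, which is exactly the kind of convention that would need to be documented.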
From: Ian Graham [mailto:email@example.com]
Sent: Sun 9/29/2002 1:45 PM
Subject: [xml-dev] XML document/message versioning -- possible model?
I have a question about best practices for version identification
in XML documents/messages. I'll start by explaining the situation
I'm trying to fathom, and will finish off with my tentative thoughts
... and then hope discussion here will help me understand this
better. Hopefully this issue is well understood (and has been
solved long ago), in which case I can happily take that
model, and move on.
I have a bunch of applications exchanging XML messages, the
messages employing multiple namespaces. The models for
each namespace are, to some degree, independently developed,
and are formally defined using XML Schema. As each
namespace 'module' evolves, changes in the module will be
reflected in new, updated versions of the associated
schema, to be archived at (and accessible from) unique,
The namespace URIs themselves will only change when there are
'substantial' semantic changes to the module (...intentionally
avoiding discussion of what 'substantial' means...)
When an application receives a message, it needs to determine
if the message can be processed, and how. The first cut is
to ask: "Do I know the namespaces?" If I do, then I continue.
If I don't, then I ignore the unknown namespaces, and have
relatively straightforward rules for determining if and how
I can proceed.
However, that's a very coarse level of selection. We can
already envisage cases where small variations in a schema
can lead to changes in message structure or content that can
render the message unacceptable to some recipient. But the
recipient won't know this has happened, as the namespaces
are unchanged, and there's nothing else in the message to
indicate the difference.
So the team codes _very_ defensively, and hopes for the best.
Ideally I (the application) would like to know which
'version' of the message I've received, and then choose
whether or not to process it: and hopefully have a
better sense of which parts of the message are 'safe'
(consistent with the models I know), and which are potential
But that identifier can't be a simple version number, as a
single number can't easily identify all the ways a new
'version' may arise.
First thought: the version should be a 'version set' consisting
of the set of namespace URI / schema URI pairs relevant to
the namespaces in the message:
I could then use a schemaLocation attribute to explicitly
include this information in messages, and use the _value_
as the version vector.
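A rough sketch of reading such a 'version set' from a message, in Python with the standard library. It assumes the xsi:schemaLocation attribute sits on the root element (this sketch ignores any on nested elements), and the sample URIs and the name version_set are invented for illustration:

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical message; the namespace and schema URIs are made up.
SAMPLE = """<msg xmlns="http://www.example.org/a"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://www.example.org/a http://www.example.org/a/v3.xsd
                         http://www.example.org/b http://www.example.org/b/v7.xsd"/>"""


def version_set(xml_text):
    """Return the (namespace URI, schema URI) pairs from the root element."""
    root = ET.fromstring(xml_text)
    value = root.get("{%s}schemaLocation" % XSI_NS, "")
    # The attribute value is a whitespace-separated list of URI pairs.
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    return frozenset(zip(tokens[0::2], tokens[1::2]))


# Assumption: the application keeps a list of version sets it has been tested with.
KNOWN_SETS = {version_set(SAMPLE)}
```

A recipient could then test version_set(message) in KNOWN_SETS before deciding whether to process the message. Using a frozenset makes the comparison independent of the order in which the pairs are listed.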
Advantages: uses existing mechanism to pass version information;
doesn't require centralized management of all the
Problems: schemaLocation usage is not well specified; not
clear (to me) how to combine the referenced schema
files to validate the message data; also...
several on this list have suggested schemaLocation
is Not a Good Idea In The First Place ....
Second thought: Require definition of the message structure
via a master schema file that imports all the schemas relevant
to all namespaces in the message. The URI for this file becomes
a unique version identifier for the message type.
Advantages: seems simpler in some weird way
Problems: still coarse grained - recipient can't know which
schemas correspond to which namespace without
accessing the master schema; requires centralized
management (to create/assign/design 'master' schema
files); still need to use schemaLocation to pass
on schema URI
I'm leaning to the former approach, but am willing to be
convinced otherwise. So comments / criticisms / suggestions are
more than welcome!
Also (and unfortunately), I don't have mail/list access
from work (I'm posting from home) ... so please don't confuse
silence with consent ;-) -- I'll be back to follow up...