
   RE: [xml-dev] XML document/message versioning -- possible model?

  • To: "Ian Graham" <ian.graham@utoronto.ca>,<xml-dev@lists.xml.org>
  • Subject: RE: [xml-dev] XML document/message versioning -- possible model?
  • From: "Dare Obasanjo" <dareo@microsoft.com>
  • Date: Sun, 29 Sep 2002 22:48:18 -0700
  • Thread-index: AcJn+wnZT9IxN59yRZCnpVL4r0TBkgARFFQc
  • Thread-topic: [xml-dev] XML document/message versioning -- possible model?

Sorry to burst the bubble, but versioning in XML applications is neither well understood nor a solved problem. My opinion on this is in the TAG archive[0], in a recent thread on the topic[1].
 
Your question seems to treat validating the document and performing whatever processing you need to do on the document as interchangeable. If you use xsi:schemaLocation then there isn't a problem for validation, since the XML instance tells you where its schema is located. As for whether to use schema locations as a versioning mechanism to tell if you can process the namespace, I'd suggest using version numbers instead. Version numbers are very useful because you can perform comparisons to tell if the schema revision is one your application doesn't know about.
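To make the validation side concrete, here is a minimal sketch of reading the namespace-to-schema pairs that an instance advertises through xsi:schemaLocation. The function name schema_locations is hypothetical, and the sketch assumes the attribute value is the usual whitespace-separated list of namespace URI / schema URI pairs; it only collects the hints and does no validation itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the xsi:schemaLocation attribute.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(xml_text):
    """Return {namespace URI: schema URI} from every xsi:schemaLocation found.

    schema_locations is a hypothetical helper name, not part of any library.
    """
    pairs = {}
    root = ET.fromstring(xml_text)
    # The attribute may appear on any element, so walk the whole tree.
    for element in root.iter():
        value = element.get("{%s}schemaLocation" % XSI_NS)
        if value is None:
            continue
        tokens = value.split()
        if len(tokens) % 2 != 0:
            raise ValueError("schemaLocation must hold namespace/location pairs")
        # Even positions are namespace URIs, odd positions are schema URIs.
        pairs.update(zip(tokens[0::2], tokens[1::2]))
    return pairs
```

A validator would take each pair as a hint about where to fetch the schema for that namespace; whether the application can actually process the content is a separate decision.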
 
For example, application A understands how to process documents with elements from the "http://www.example.org" namespace in revisions 1.0, 1.1, 2.0, etc., up to version 3.0 of the schema. If elements from that namespace from revision 2.1 or revision 4.7 of the schema show up, there is an easy way to know whether they can be handled, and the code is a lot simpler than switching on URI names.
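The comparison described here could be sketched roughly as follows. The names MAX_SUPPORTED, parse_version and can_process are hypothetical, and the sketch assumes the version arrives as a dotted string of integers (how it is carried in the message, for example as an attribute, is left open).

```python
# Hypothetical: the highest schema revision application A understands.
MAX_SUPPORTED = (3, 0)


def parse_version(text):
    """Turn '2.1' into (2, 1) so revisions compare numerically, not as strings."""
    return tuple(int(part) for part in text.strip().split("."))


def can_process(version_text, max_supported=MAX_SUPPORTED):
    """True if the revision is no newer than the highest one we know about."""
    return parse_version(version_text) <= max_supported
```

With these definitions, can_process("2.1") is true and can_process("4.7") is false, with no per-URI switch statement needed. Comparing tuples of integers rather than raw strings avoids ranking "10.0" below "3.0".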
 
Of course, you could structure your schema locations in a manner that allows such comparisons (e.g. "http://www.example.org/2002/09/29/schema.xsd") and you'd get similar functionality, although you would be overloading the meaning of the xsi:schemaLocation attribute, which is unwise and should be very well documented in your business logic and to all partners involved.
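A rough sketch of that alternative, assuming every schema location follows the /YYYY/MM/DD/ path layout of the example URL. LATEST_KNOWN, schema_date and is_known_revision are hypothetical names, and the cutoff date is only an illustration.

```python
import re
from datetime import date
from urllib.parse import urlparse

# Assumed convention: the path starts with /YYYY/MM/DD/, as in the example URL.
_DATE_PATH = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/")

# Hypothetical: date of the newest schema revision the application understands.
LATEST_KNOWN = date(2002, 9, 29)


def schema_date(location):
    """Return the date embedded in a schema location, or None if there is none."""
    match = _DATE_PATH.match(urlparse(location).path)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits were present but do not form a real calendar date.
        return None


def is_known_revision(location, latest_known=LATEST_KNOWN):
    """True if the location carries a date no later than the newest known one."""
    found = schema_date(location)
    return found is not None and found <= latest_known
```

Note how the check silently depends on the URL layout: a location that does not follow the convention is simply treated as unknown, which is part of why the convention would need to be documented for all partners.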
 
 
[0] http://lists.w3.org/Archives/Public/www-tag/2002Sep/0092.html
[1] http://lists.w3.org/Archives/Public/www-tag/2002Sep/0082.html

	-----Original Message----- 
	From: Ian Graham [mailto:igraham@ic-unix.ic.utoronto.ca] 
	Sent: Sun 9/29/2002 1:45 PM 
	To: xml-dev@lists.xml.org 
	Cc: 
	Subject: [xml-dev] XML document/message versioning -- possible model?
	
	


	I have a question about best practices for version identification
	in XML documents/messages. I'll start by explaining the situation
	I'm trying to fathom, and will finish off with my tentative thoughts
	... and then hope discussion here will help me understand this
	better. Hopefully this issue is well understood (and has been
	solved long ago), in which case I can happily take that
	model, and move on.
	
	I have a bunch of applications exchanging XML messages, the
	messages employing multiple namespaces.  The models for
	each namespace are, to some degree, independently developed,
	and are formally defined using XML Schema. As each
	namespace 'module' evolves, changes in the module will be
	reflected in new, updated versions of the associated
	schema, to be archived at (and accessible from) unique,
	well-defined URIs.
	
	The namespace URIs themselves will only change when there are
	'substantial' semantic changes to the module (...intentionally
	avoiding discussion of what 'substantial' means...)
	
	When an application receives a message, it needs to determine
	if the message can be processed, and how.  The first cut is
	to ask: "Do I know the namespaces?" If I do, then I continue.
	If I don't, then I ignore the unknown namespaces, and have
	relatively straightforward rules for determining if and how
	I can proceed.
	
	However, that's a very coarse level of selection. We can
	already envisage cases where small variations in a schema
	can lead to changes in message structure or content that can
	render the message unacceptable to some recipient. But the
	recipient won't know this has happened, as the namespaces
	are unchanged, and there's nothing else in the message to
	indicate the difference.
	
	So the team codes _very_ defensively, and hopes for the best.
	
	Ideally I (the application) would like to know which
	'version' of the message I've received, and then choose
	whether or not to process it: and hopefully have a
	better sense of which parts of the message are 'safe'
	(consistent with the models I know), and which are potential
	problems.
	
	But that identifier can't be a simple version number, as a
	single number can't easily identify all the ways a new
	'version' may arise.
	
	First thought: the version should be a 'version set' consisting
	of the set of namespace URI / schema URI pairs relevant to
	the namespaces in the message:
	
	  {nsURI, schemaURI}
	
	I could then use a schemaLocation attribute to explicitly
	include this information in messages, and use the _value_
	as the version vector.
	
	Advantages: uses existing mechanism to pass version information;
	            doesn't require centralized management of all the
	            schema files;
	Problems:   schemaLocation usage is not well specified; not
	            clear (to me) how to combine the referenced schema
	            files to validate the message data;  also...
	            several on this list have suggested schemaLocation
	            is Not a Good Idea In The First Place ....
	
	Second thought:  Require definition of the message structure
	via a master schema file that imports all the schemas relevant
	to all namespaces in the message. The URI for this file becomes
	a unique version identifier for the message type.
	
	Advantages: seems simpler in some weird way
	Problems:   still coarse grained - recipient can't know which
	            schemas correspond to which namespace without
	            accessing the master schema; requires centralized
	            management (to create/assign/design 'master' schema
	            files); still need to use schemaLocation to pass
	            on schema URI
	
	
	I'm leaning to the former approach,  but am willing to be
	convinced otherwise.  So comments / criticisms / suggestions are
	more than welcome!
	
	Also (and unfortunately), I don't have mail/list access
	from work (I'm posting from home) ... so please don't confuse
	silence with consent ;-) -- I'll be back to follow up...
	
	Ian
	
	