OASIS Mailing List Archives

Re: [xml-dev] Is CVS A Practical Means to Manage XML Versions In A Production Environment

On 09/17/2011 03:02 PM, David Lee wrote:
> What it *doesn't* do is 'intelligent' diffs and versioning. A classic
> case is someone may simply load an XML file into an editor and save it,
> without changing any 'xml stuff', but whitespace may change and cause the
> file to be versioned and presumed 'different'. Does that matter? It
> depends on your needs.

Yes, it all depends. I think the OP might not be interested in
tree-aware or XML-aware diffs and integrated query mechanisms. The same
goes for the ACID properties that database systems usually guarantee
through transactions -- or the CAP trade-offs in NoSQL systems. But
yeah, I'm more interested in the database/storage part and maybe a bit
biased ;-)

> For our needs it doesn't matter at all. We just needed document management
> at a document level, and as long as the files are not corrupted and we can
> assign unique versions and label and pull them, it works great. Now if
> you want to ask, say, 'what XML element changed and by what', then a text
> based version control won't answer that, but you can use other (non version
> control) XML diff tools, pull the 2 versions and diff them. Also if you
> want the system to not create a new version unless the document has
> semantically changed, it won't do that.
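For the whitespace case mentioned in the quoted text, one possible check
before committing is to compare canonical forms instead of raw bytes. The
following is only a sketch: it assumes Python 3.8+ (for
xml.etree.ElementTree.canonicalize) and that whitespace around text content
is not significant in your documents, which is often false for
document-oriented XML. The helper name same_xml is made up for this
illustration.

```python
from xml.etree.ElementTree import canonicalize


def same_xml(a: str, b: str) -> bool:
    """Compare two XML strings after C14N 2.0 canonicalization.

    strip_text=True drops whitespace around text content, so a pure
    re-indentation compares equal. Assumption: such whitespace carries
    no meaning in the documents being compared.
    """
    return canonicalize(a, strip_text=True) == canonicalize(b, strip_text=True)


original = "<order><item qty='2'>pen</item></order>"
# Same document after an editor re-indented it and changed the quote style.
resaved = '<order>\n  <item qty="2">pen</item>\n</order>'
# A real change: the qty attribute differs.
edited = "<order><item qty='3'>pen</item></order>"

print(same_xml(original, resaved))
print(same_xml(original, edited))
```

A commit hook could call such a function and skip the commit when the two
versions compare equal; CVS itself would still see the files as different.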

Hm, you also have to remember that such diff tools compute a diff after
the fact; they don't detect the changes a user actually made. They may
try to generate minimal edit scripts, but an optimal tree-to-tree
correction algorithm is known to have a runtime complexity of O(n^3),
and even a minimal script may not match what the user really changed.
So most tools use some kind of heuristics to speed up the diff
computation, and these may fail horribly in certain cases; data-oriented
XML is almost always a problem. If you want to determine the changes a
user really made you need unique node IDs, and XML documents usually
don't have them.

ID-based algorithms are usually also much quicker.
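
To illustrate why IDs help, here is a minimal sketch of an ID-based diff.
It assumes every element carries a unique, stable "id" attribute that
survives edits (most real documents don't have this, and the attribute
name is just a placeholder). The function names index_by_id and id_diff
are hypothetical, not taken from any tool.

```python
import xml.etree.ElementTree as ET


def index_by_id(xml_text, id_attr="id"):
    """Map node ID -> (parent ID, tag, other attributes, stripped text)."""
    nodes = {}

    def visit(elem, parent_id):
        node_id = elem.get(id_attr)
        if node_id is None:
            raise ValueError(f"<{elem.tag}> has no {id_attr!r} attribute")
        attrs = {k: v for k, v in elem.attrib.items() if k != id_attr}
        nodes[node_id] = (parent_id, elem.tag, attrs, (elem.text or "").strip())
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return nodes


def id_diff(old_xml, new_xml):
    """Classify nodes as deleted, inserted or changed by matching IDs."""
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    return {
        "deleted": sorted(old.keys() - new.keys()),
        "inserted": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


old_doc = '<order id="n1"><item id="n2">pen</item><item id="n3">ink</item></order>'
new_doc = '<order id="n1"><item id="n2">pencil</item><item id="n4">paper</item></order>'
print(id_diff(old_doc, new_doc))
```

Matching is done by dictionary lookup, so apart from sorting the result
the work is roughly linear in the number of nodes and needs no
tree-to-tree correction. The sketch ignores sibling order and mixed
content (tail text), which a real tool would have to handle.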

Furthermore, even XML diff tools aren't sufficient if the user wants an
overview of the changes, but all of that may not be what the OP wants
anyway ;-)

But as a side note -- with time-aware XPath or even XQuery extensions, a
versioned XML-DBS can be used to analyse time-dependent data, which
might be significant in many areas. What do others think?

best regards,
