Re: [xml-dev] Profiling, diff and change tracking best practices?

Hi Lech,

Funnily enough I have just started thinking about this for my own project with a similar use-case - i.e. understanding the changes between two different baselines of an XML document or XML document set.

My high-level thoughts so far are:

1.] Add suitable meta-data attributes (e.g. version/create and modify date/author) to fairly coarse-grained components within the XML data model.
2.] Create a baseline of the XML document or set of XML documents by:
2.1] Creating a fairly lightweight XML file (perhaps using XSLT) that only contains this meta-data. Save this to disk (i.e. create a memento of the meta-data)
2.2] Saving a copy of the original XML in a version control system/file system where it will not be edited further.
3.] Later on, when trying to do a diff between the original baseline and the current version:
3.1] Using the same mechanism as in step 2.1 create a new memento of the current XML document or set of XML documents
4.] Compare the two mementos reporting on changes - if required the baseline copy of the XML can be used to compute exactly what content has changed (I think you need add/delete and update) between the two versions.
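
As a rough illustration of steps 2.1 to 4, here is a minimal Python sketch. It is only one possible reading of the idea: the attribute names (id, version, modified, author), the sample documents and the helper names make_memento and compare_mementos are all made up for the example, and the memento is kept as an in-memory dictionary rather than written to disk as a lightweight XML file.

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-data attribute names; the message only suggests
# "version/create and modify date/author" without fixing a vocabulary.
META_ATTRS = ("version", "modified", "author")


def make_memento(xml_text):
    """Return {component id: meta-data dict} for every element carrying an id."""
    root = ET.fromstring(xml_text)
    memento = {}
    for elem in root.iter():
        comp_id = elem.get("id")
        if comp_id is not None:
            memento[comp_id] = {name: elem.get(name) for name in META_ATTRS}
    return memento


def compare_mementos(baseline, current):
    """Classify component ids as added, deleted or updated."""
    added = sorted(set(current) - set(baseline))
    deleted = sorted(set(baseline) - set(current))
    updated = sorted(
        cid for cid in set(baseline) & set(current)
        if baseline[cid] != current[cid]
    )
    return {"added": added, "deleted": deleted, "updated": updated}


# Made-up sample data standing in for the baseline and current documents.
BASELINE = """<book>
  <chapter id="c1" version="1" modified="2009-09-01" author="mos"/>
  <chapter id="c2" version="1" modified="2009-09-01" author="mos"/>
</book>"""

CURRENT = """<book>
  <chapter id="c1" version="2" modified="2009-10-01" author="lr"/>
  <chapter id="c3" version="1" modified="2009-10-01" author="lr"/>
</book>"""

report = compare_mementos(make_memento(BASELINE), make_memento(CURRENT))
print(report)
```

Note that this only detects an update when the meta-data itself changes, so it relies on editors or tooling keeping the version/date attributes current. Working out exactly what content changed inside an updated component would still need the baseline copy from step 2.2.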

I am still undecided whether both the memento and document copy are required - logically the memento is not actually required. However the lightweight memento may prove useful if:
Anyway, I have only had early thoughts on the subject, so I would gladly listen to any other suggestions that the community has to offer.

Kind regards,

Michael Odling-Smee

On Thu, Oct 1, 2009 at 3:44 PM, Lech Rzedzicki <xchaotic@gmail.com> wrote:
Hi all.

I am at a fortunate stage where we are redesigning our XML schema so
that it fits our requirements better.
To give you an idea of the XML we're dealing with, it's loosely based
on DocBook and used for multi-channel publishing.
Some frequent scenarios include updating XML with new content,
comparing versions, different languages, sending diffs to translation,
but also producing slight variations depending on the output. Tracking
changes (by being able to see what's been added and deleted) is also a
nice to have feature.
Basically what I aim to put in place is structures to help with these
functions that are not so verbose as to overwhelm editors, yet powerful
enough for 'future' scenarios.

My initial thoughts are to employ xml:id attributes on block-level
elements and add a set of attributes for each facet of profiling,
possibly reusing DocBook attributes such as condition, version,
audience, but my fear is that it won't be powerful enough in the future.
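
To make the idea concrete, here is a minimal Python sketch of profiling on block-level elements that carry xml:id plus facet attributes. Everything in it is an assumption for illustration: the sample markup, the helper names matches and apply_profile, the choice of audience and condition as the two facets, and the convention that a facet attribute may hold several values separated by semicolons.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Made-up sample content; element and attribute values are illustrative only.
SOURCE = """<section xml:id="s1">
  <para xml:id="p1">Shared text.</para>
  <para xml:id="p2" audience="expert">Expert-only text.</para>
  <para xml:id="p3" condition="print">Print-only text.</para>
  <para xml:id="p4" audience="novice;expert" condition="web">Web text.</para>
</section>"""


def matches(elem, profile):
    """True unless elem sets a facet to values that exclude the wanted one."""
    for facet, wanted in profile.items():
        value = elem.get(facet)
        # An element without the facet attribute applies to every profile.
        if value is not None and wanted not in value.split(";"):
            return False
    return True


def apply_profile(root, profile):
    """Remove, in place, every descendant that does not match the profile."""
    for parent in list(root.iter()):
        for child in list(parent):
            if not matches(child, profile):
                parent.remove(child)
    return root


root = apply_profile(
    ET.fromstring(SOURCE), {"audience": "expert", "condition": "web"}
)
kept = [para.get(XML_ID) for para in root.iter("para")]
print(kept)
```

With one attribute per facet, adding a new facet later means adding a new attribute to the schema, which is the extensibility worry raised above.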

I would love to hear your general thoughts on best practices in this
area of managing XML content and specifically on:

1. How low should we go with id's on elements? My main concern here is
making diffs as easy as possible and possibly identifying chunks of
xml that are as small as possible, making translation cheaper. On the
other hand should I be bothered at all about the performance, since all
the documents are size-limited to a book size of ca 1000 pages (a few
MB of XML)?
2. Use a possibly verbose set of elements/attributes on the elements
directly or use a meta-attribute that links to an attribute/element
set in a secondary file? (less verbose but more complex)
3. Are 'add' and 'remove' sufficient change tracking marks to cover
all scenarios? (I think any more complex edits such as update can be
built up from those two)?
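
On question 3, a small Python sketch of how an update can be expressed as a remove followed by an add, assuming every tracked chunk has a stable id. The dictionaries of id to text, the sample values and the helper names diff_ops and apply_ops are hypothetical and only serve the illustration.

```python
def diff_ops(old, new):
    """Describe the change from old to new using only remove and add marks."""
    ops = []
    for key in sorted(set(old) | set(new)):
        changed = old.get(key) != new.get(key)
        if key in old and changed:
            ops.append(("remove", key, old[key]))
        if key in new and changed:
            ops.append(("add", key, new[key]))
    return ops


def apply_ops(old, ops):
    """Replay remove/add marks on a copy of old."""
    result = dict(old)
    for op, key, text in ops:
        if op == "remove":
            del result[key]
        else:
            result[key] = text
    return result


# Made-up chunks keyed by id.
old = {"p1": "Hello", "p2": "Old text", "p3": "Gone"}
new = {"p1": "Hello", "p2": "New text", "p4": "Fresh"}

ops = diff_ops(old, new)
for op in ops:
    print(op)
```

Replaying the marks on the old version reproduces the new one, so for id-keyed content the two marks are enough. The sketch ignores document order, so it says nothing about whether a moved chunk would need its own mark.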

I really hope I can get some good feedback from you and thanks in
advance for that,

Lech Rzedzicki


XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

