   Re: [xml-dev] RE: defining xml diff/changes in xml : XUpdate etc

On Apr 12, 2006, at 20:47, Adrian Mouat wrote:
> The IETF had a bof on the subject of xml patching - the notes can  
> be found here:
> http://www3.ietf.org/proceedings/05nov/xmlpatch.html
> Basically there seems to be no current support for the creation of  
> a standard.

Have you taken a look at REX? It is not intended to be a generic XML
patch language, but since the list of supported events is to be made
open-ended, you could devise a set of events that correspond to
patching operations. It may not be an ideal solution, but it does
have strong support in terms of moving the standard forward, so
piggybacking on that might be a good idea.

> Michael Kay wrote:
>> * must the effect of applying diffs be independent of the order in
>> which they are applied?
> Surely impossible?? I can't add a node to a subtree that doesn't
> exist. Or do you have a completely different format in mind?

It depends on whether you have the constraint of being able to create  
a WF XML document at each step, or if your patches can work on  
intermediate in-memory representations that may not hold the entire  
tree and may have ghost nodes. IIRC in theory MPEG-B updates can send  
you fragment updates that are inside nodes that you don't yet have  
(you would use the path to create stubs). In practice I'm not sure  
it's fully supported, but I can ask.
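
To make the stub idea concrete, here is a minimal sketch, and not how
MPEG-B or any other format actually does it. It assumes a simplified
slash-separated path of child element names rather than real XPath,
and the names ensure_path and apply_fragment are hypothetical. An
update aimed at a subtree that does not exist yet is applied by first
creating empty placeholder elements along the path.

```python
import xml.etree.ElementTree as ET


def ensure_path(root, path):
    """Return the element at a slash-separated child path under root,
    creating empty stub elements for every step that is missing."""
    node = root
    created = []
    for name in path.split("/"):
        child = node.find(name)
        if child is None:
            # Ghost/stub node: present only so that later content has a parent.
            child = ET.SubElement(node, name)
            created.append(child)
        node = child
    return node, created


def apply_fragment(root, path, fragment_xml):
    """Append a parsed fragment under path, stubbing missing ancestors.
    Returns the tag names of the stubs that had to be created."""
    parent, created = ensure_path(root, path)
    parent.append(ET.fromstring(fragment_xml))
    return [el.tag for el in created]


doc = ET.Element("doc")
stubs = apply_fragment(doc, "chapter/section", "<para>late arrival</para>")
print(stubs)
print(ET.tostring(doc, encoding="unicode"))
```

Note that this only shows that such an update can be applied at all.
The order of sibling nodes still depends on the order in which the
updates arrive, so it does not by itself make patches
order-independent.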

>> * do diff files need to be human-readable?
> I think not - they can be transformed into human readable formats.

I guess the question is also about whether they have to be XML. I  
think it's best but then I'm an integrist :)

>> * do diff files need to be small?
> Is XML ever?

Yes of course, just use an efficient XML format! Oh wait, it's not  
Friday, sorry.

>> * what kind of changes need to be diff'ed? Do they include, for
>> example, renaming of nodes? Do they include any bulk changes, such
>> as deleting all instances of a particular attribute? Do they include
>> changes at the lexical level, e.g. changing the expansion text of an
>> internal entity? Do they include DTD changes?
> DUL doesn't handle expressions like this, and I don't think it
> should - leave that to XQuery update.
> Entities are a hard question - there are even more questions if you
> consider whether they should be resolved or not. DTD changes are
> not supported in DUL.

I would personally opt for supporting only what the XPath DM  
supports, but I realise that this limits some of the use cases.

The EXI WG is working on a similar issue related to the fidelity of  
efficient XML encodings, which is basically the issue of the Infoset,  
namely what "matters" in an XML document. XML itself is defined  
entirely at the syntax level, but for some problems if you stick just  
to that you end up with good old gzip (for the efficient XML case) or  
good old diff (for the diff case). Presumably there are use cases  
that require more than what those options can bring to the table,  
which is where things get interesting. Currently I'm working on a  
scale measuring fidelity along the following lines. It is meant to  
evaluate efficient XML formats, but I think it could be usefully  
adapted to work on XML diff languages:

  -1: does not support "very basic" parts of the Infoset, such as PIs
      or comments
   0: supports what can be captured by the Infoset, except notations
      (and perhaps unresolved entities -- under discussion)
   1: supports everything that is captured by the Infoset
   2: supports the Infoset plus items that the Infoset does not take
      into account but that cannot be discounted as purely syntactic
      (e.g. element and attribute declarations)
   3: supports the above plus some completely syntactic constructs,
      such as CDATA sections, all the way to perhaps attribute quote
      characters, the variants in empty elements, or the amount of
      space between attributes or between target and data.

There's still some fair amount of fuzz in there of course, but I'd be  
very interested in feedback on the matter.
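
For anyone who wants to experiment with the scale, it could be sketched
roughly as follows. This is only an illustration under assumptions: the
feature names ("pis", "comments", "notations", "declarations",
"cdata_sections") and the function fidelity_level are made up for the
example, and the cut-offs simplify the levels above (for instance,
unresolved entities are ignored and CDATA sections stand in for all
purely syntactic constructs).

```python
from enum import IntEnum


class Fidelity(IntEnum):
    """The five levels of the scale, using the same numbers."""
    BELOW_INFOSET_BASICS = -1
    INFOSET_MINUS_NOTATIONS = 0
    FULL_INFOSET = 1
    INFOSET_PLUS_DECLARATIONS = 2
    SYNTACTIC = 3


def fidelity_level(features):
    """Map a set of supported feature names (hypothetical labels)
    to a level on the scale, checking the levels from the bottom up."""
    if not {"pis", "comments"} <= features:
        return Fidelity.BELOW_INFOSET_BASICS
    if "notations" not in features:
        return Fidelity.INFOSET_MINUS_NOTATIONS
    if "declarations" not in features:
        return Fidelity.FULL_INFOSET
    if "cdata_sections" not in features:
        return Fidelity.INFOSET_PLUS_DECLARATIONS
    return Fidelity.SYNTACTIC


# A format (or diff language) that keeps PIs, comments and notations
# but drops element and attribute declarations:
level = fidelity_level({"pis", "comments", "notations"})
print(level.value, level.name)
```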

FYI REX could support renaming (by transmitting the corresponding DOM  
mutation events) and batch changes (by using an XPath selector that  
matches several nodes -- this is currently in the draft but I think  
it'll be dropped).
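
To illustrate what a batch change via a selector that matches several
nodes amounts to, here is a small sketch. It is not REX syntax and
does not use any REX API; it relies only on the limited XPath subset
in Python's xml.etree.ElementTree, and the function name
remove_attribute_everywhere is hypothetical. It covers the "deleting
all instances of a particular attribute" case raised earlier in the
thread.

```python
import xml.etree.ElementTree as ET


def remove_attribute_everywhere(root, attr):
    """Delete attr from every element that carries it and return
    the number of elements that were changed."""
    # ".//*[@attr]" selects matching descendants; the root element is
    # not a descendant of itself, so it is checked separately.
    matched = root.findall(f".//*[@{attr}]")
    if attr in root.attrib:
        matched.insert(0, root)
    for element in matched:
        del element.attrib[attr]
    return len(matched)


doc = ET.fromstring(
    '<doc class="a"><p class="b">one</p><p>two</p>'
    '<div><p class="c" id="x">three</p></div></doc>'
)
changed = remove_attribute_everywhere(doc, "class")
print(changed)
print(ET.tostring(doc, encoding="unicode"))
```

A single selector plus one operation replaces what would otherwise be
one patch instruction per affected node.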

Robin Berjon
    Senior Research Scientist
    Expway, http://expway.com/