On Apr 12, 2006, at 20:47, Adrian Mouat wrote:
> The IETF had a bof on the subject of xml patching - the notes can
> be found here:
>
> http://www3.ietf.org/proceedings/05nov/xmlpatch.html
>
> Basically there seems to be no current support for the creation of
> a standard.
Have you taken a look at REX? It is not intended to be a generic XML
patch language, but since the list of supported events is to be made
open-ended, you could devise a set of events that correspond to
patching operations. It may not be an ideal solution, but it does
have strong support in terms of moving the standard forward, so
piggybacking on it might be a good idea.
> Michael Kay wrote:
>> * must the effect of applying diffs be independent of the order
>> in which they are applied?
>
> Surely impossible?? I can't add a node to a subtree that doesn't
> exist. Or do you have a completely different format in mind?
It depends on whether you have the constraint of being able to create
a WF XML document at each step, or if your patches can work on
intermediate in-memory representations that may not hold the entire
tree and may have ghost nodes. IIRC in theory MPEG-B updates can send
you fragment updates that are inside nodes that you don't yet have
(you would use the path to create stubs). In practice I'm not sure
it's fully supported, but I can ask.
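
To make the stub idea concrete, here is a minimal sketch in Python, assuming updates arrive as (path, text) pairs. It is not MPEG-B or any real update format; the function names and the "stub" marker attribute are made up for illustration.

```python
import xml.etree.ElementTree as ET


def ensure_path(root, path):
    """Walk a slash-separated child path, creating stub elements for missing steps."""
    node = root
    for name in path.split("/"):
        child = node.find(name)
        if child is None:
            child = ET.SubElement(node, name)
            child.set("stub", "true")  # hypothetical marker for a ghost node
        node = child
    return node


def apply_update(root, path, text):
    """Set the text of the node at `path`, turning a stub into a real node."""
    node = ensure_path(root, path)
    node.text = text
    node.attrib.pop("stub", None)  # the node has now actually arrived
    return node


def apply_all(updates):
    """Apply (path, text) updates to a fresh in-memory document."""
    root = ET.Element("doc")
    for path, text in updates:
        apply_update(root, path, text)
    return root


# The child update can arrive before its parent exists.
child_first = apply_all([("a/b", "x"), ("a", "y")])
parent_first = apply_all([("a", "y"), ("a/b", "x")])
```

Both orders end up with the same tree for a parent and its child. Sibling order would still depend on arrival order in this sketch, so it only covers part of the order-independence question.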
>> * do diff files need to be human-readable?
>
> I think not - they can be transformed into human readable formats.
I guess the question is also about whether they have to be XML. I
think it's best but then I'm an integrist :)
>> * do diff files need to be small?
>
> Is XML ever?
Yes of course, just use an efficient XML format! Oh wait, it's not
Friday, sorry.
>> * what kind of changes need to be diff'ed? Do they include, for
>> example, renaming of nodes? Do they include any bulk changes, such
>> as deleting all instances of a particular attribute? Do they
>> include changes at the lexical level, e.g. changing the expansion
>> text of an internal entity? Do they include DTD changes?
>
> DUL doesn't handle expressions like this, and I don't think it
> should - leave that to XQuery update.
> Entities are a hard question - there are even more questions if you
> consider whether they should be resolved or not. DTD changes are
> not supported in DUL.
I would personally opt for supporting only what the XPath DM
supports, but I realise that this limits some of the use cases.
The EXI WG is working on a similar issue related to the fidelity of
efficient XML encodings, which is basically the issue of the Infoset,
namely what "matters" in an XML document. XML itself is defined
entirely at the syntax level, but for some problems if you stick just
to that you end up with good old gzip (for the efficient XML case) or
good old diff (for the diff case). Presumably there are use cases
that require more than what those options can bring to the table,
which is where things get interesting. Currently I'm working on a
scale measuring fidelity along the following lines. It is meant to
evaluate efficient XML formats, but I think it could be usefully
adapted to work on XML diff languages:
-1: does not support "very basic" parts of the Infoset, such as PIs
or comments
0: supports what can be captured by the Infoset, except notations
(and perhaps unresolved entities -- under discussion)
1: supports everything that is captured by the Infoset
2: supports the Infoset plus items that the Infoset does not take
into account but that cannot be discounted as purely syntactic (e.g.
element and attribute declarations)
3: supports the above plus some completely syntactic constructs,
such as CDATA sections, all the way to perhaps attribute quote
characters, the variants in empty elements, or the amount of space
between attributes or between target and data.
There's still a fair amount of fuzz in there of course, but I'd be
very interested in feedback on the matter.
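
As a rough illustration of how the scale might be used, here is a small Python sketch that treats the levels as an ordered enum. The level names, the construct names and the mapping are my own shorthand for the list above, not part of any EXI draft, and unresolved entities are left out since their level is still under discussion.

```python
from enum import IntEnum


class Fidelity(IntEnum):
    """The five levels of the scale; the names are shorthand, not official."""
    BELOW_INFOSET = -1
    INFOSET_MINUS_NOTATIONS = 0
    FULL_INFOSET = 1
    INFOSET_PLUS_DECLARATIONS = 2
    SYNTACTIC = 3


# Hypothetical mapping: the lowest level at which a construct survives.
REQUIRED_LEVEL = {
    "comment": Fidelity.INFOSET_MINUS_NOTATIONS,
    "processing-instruction": Fidelity.INFOSET_MINUS_NOTATIONS,
    "notation": Fidelity.FULL_INFOSET,
    "element-declaration": Fidelity.INFOSET_PLUS_DECLARATIONS,
    "cdata-section": Fidelity.SYNTACTIC,
    "attribute-quote-character": Fidelity.SYNTACTIC,
}


def preserved(level, construct):
    """True if a format (or diff language) at `level` preserves `construct`."""
    return level >= REQUIRED_LEVEL[construct]
```

With this, preserved(Fidelity.FULL_INFOSET, "cdata-section") is False, which is the sort of question one would ask of a diff language as much as of an efficient encoding.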
FYI REX could support renaming (by transmitting the corresponding DOM
mutation events) and batch changes (by using an XPath selector that
matches several nodes -- this is currently in the draft but I think
it'll be dropped).
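
For what a selector-driven batch change or a rename amounts to, here is a minimal Python sketch using the limited XPath subset in the standard library's ElementTree. This is not REX syntax and does not model DOM mutation events; the function names and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def remove_attribute(root, selector, name):
    """Remove attribute `name` from every element matched by `selector`."""
    matched = root.findall(selector)
    for elem in matched:
        elem.attrib.pop(name, None)
    return len(matched)


def rename_elements(root, selector, new_name):
    """Rename every element matched by `selector` to `new_name`."""
    matched = root.findall(selector)
    for elem in matched:
        elem.tag = new_name
    return len(matched)


doc = ET.fromstring('<r><p class="x">a</p><q><p class="y">b</p></q><p>c</p></r>')

# One selector, several nodes: drop every class attribute found on a p.
removed = remove_attribute(doc, ".//p[@class]", "class")

# Renaming is the same pattern with a different operation.
renamed = rename_elements(doc, ".//p", "para")
```

The point is only that one operation plus one selector can stand in for many per-node edits, which is what makes bulk changes cheap to express in a diff.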
--
Robin Berjon
Senior Research Scientist
Expway, http://expway.com/