[
Lists Home |
Date Index |
Thread Index
]
On 3/14/06, Oleg A. Paraschenko <olpa@xmlhack.ru> wrote:
> Hello Peter,
>
> thank you for answer.
No problem, I'm surprised you haven't gotten any others.
> ...
> > >
> > > <x><b>some <i>dummy</i></b><i> text</i></x>,
> > >
> > > or something like it.
> >
> > If you look at what you are asking for in the most general case, I
> > believe the answer is not really: you want to to do generalized
> > merging of graphs (which is believed to be a NP complete problem).
>
> I didn't know it. But trees are much simplier than graphs, so the
> problem also should be much simplier.
Well, yes, but XML isn't always a tree. If you're going to manage the
XML by considering only simple element name, then yes, you may have a
tree. However, if you've got to consider things like name spaces then
you've may actually have a graph.
I haven't given it much thought, but it seems that even the general
case of merging non-structurally identical trees is NP complete.
Consider the fact that identically named nodes may lie on different
branches. The only way to evaluate whether two such nodes can in fact
be merged is to look at the relationship of such nodes to every other
node on the tree and decide if they match (whatever "match" might mean
in this case). As such, this looks to be a classic travelling
salesman traversal...
> > If your domain is limited and common (eg. XHTML ) then you might get
> > lucky and find some existing tools. Otherwise, I think you've got a
> > good use case for rolling your own with XSLT...
>
> In fact, I've already draftly implemented it to support my upcoming
> XTech paper on XSieve. Now I'm looking for other tools to compare them.
> The best I found are LMNL and just-in-time trees, but they provide another
> functionality.
>
Can you tell us a bit about the implementation? Is it for XHTML or
more generalized? Do you handle name spaces? Do the trees have to have
identical structures?
The reason I ask is because we do some of this over several well
defined cases (non-identical structure, but there are ways of
identifying unique nodes) and as we move forward we're evolving more
and more generic requirements that may eventually require us to try
and evaluate structural equivalence.
--
Peter Hunsberger
|