OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

 


 

   RE: [xml-dev] Parallel tree traversal

[ Lists Home | Date Index | Thread Index ]

Robert Koberg [mailto:rob@koberg.com] writes:
> 
> Hi Peter, [responses/questions inline]
> 
> Hunsberger, Peter wrote:
> 
> > 
> > There are many attributes on each element giving various pieces of 
> > metadata, and in particular for this case, determining how a 
> > particular node gets presented in the menu. For a given metadata 
> > structure we need to map many possible data items, eg:
> > 
> > <data>
> > 	<a id="1"/>
> > 	<b id="2" idref="1"/>
> > 	<c id="3" idref="2/>
> > 	<d id="4" idref="3"/>
> > 	<e id="5" idref="4"/>
> > 	<e id="6" idref="4"/>
> > 	<c id="7" idref="2/>
> > 	<f id="8" idref="1"/>
> > 	<a id="9"/>
> > 	<b id="10" idref="9"/>
> > 		...
> > </data>
> > 
> > Any given metadata node may not have any data associated 
> with it.  If 
> > it does, it must have a parent data item.  Using the data 
> structure I 
> > can determine that, in this, case I have to present a menu 
> that has a 
> > structure like:
> > 
> > <menu>
> > 	<a id="1">
> > 		<b id="2">
> > 			<c id="3"/>
> > 				<d id="4">
> > 					<e id="5"/>
> > 					<e id="6"/>
> > 				</d>
> > 			</c>
> > 			<c id="7"/>
> > 		</b>
> > 		<f id="8"/>
> > 	</a>
> > 	<a id="9">
> > 		<b id="10">
> > 		...
> > 	</a>
> > </menu>
> > 
> > The current strategy is to track the current id and pass it to a 
> > recursive XSLT template selecting all the data nodes that have an 
> > idref that is equal.
> 
> 
> You could create a key based on the idref attribute:
> 
> <xsl:key name="idrefs" match="*" use="@idref"/>
> 
> and apply templates on the matches from something like:
> 
> <xsl:variable
>    name="focus_children"
>    select="key('idrefs', $focus_idref)"/>
> 
> <xsl:apply-templates match="$focus_children"/>
> 
> Does this solve your problem (needing a nodeset function)?
 
I should have been clearer.  The nodeset function problem doesn't occur
when I use the id/idref solution.  I'm currently evaluating a 2nd
solution that builds the data tree as a hierarchy instead of a flat data
structure related by id/idref.  That's when I have the nodeset issue.
In this second case the real problem turns out to be selection of the
child nodes to pass on to the next recursion. I had a choose block that
looked at various metadata conditions and did things like (pseudo code):

	variable name="data"
		choose
			when @group = true
				select data/*[@type = $type]
			otherwise
			  	select data/*[local-name() = $name]

After sending the last e-mail I realized I could break this apart,
duplicate a couple of small chunks of code once for each metadata
use-case (5 of them as it turns out), and pass the data directly on to
the next recursion.  This avoids the need for using nodeset function,
however, I still run out of memory on large trees.
	
The use of keys is likely applicable for the first version of the
solution and would help speed things up there.  If I give up on the 2nd
solution I'll have to go back and see if I can use keys.

> 
> > 
> > It turns out that it would be easy to generate the tree 
> structure for 
> > the data directly.  However, I can't just use it directly, since if 
> > there is a missing data item I may still have to insert a 
> menu item to 
> > create a new instance (depending on the metadata).
> 
> 
> I am not understanding this. Are you trying to do all 
> operations on the 
> flat structure or are you first transforming to the 
> hierarchy? 

That's me not being clear that there are two different solutions I'm
evaluating.  If I use a hierarchical data instance I can't just use it
directly since I still have to understand it in the context of the
metadata

> What is 
> the difference between inserting a missing menu item in the hierarchy 
> versus the flat structure? I guess I don't understand why you 
> would not 
> keep this in a hierarchy to begin with (are you doing other 
> things not 
> explained here?).
 
Yes, but your confusion is only that I have either the flat data
structure or a hierarchical version.  I'm trying to determine if I can
speed things up using the hierarchical version (and not generate the
flat structure).  So far, the answer is yes, but I run out of memory on
large data structures.  Using keys on the id/idref version may negate
the speed gain of the hierarchical version...

> Before i comment further I would need to understand this.
> 
> best,
> -Rob
> 
> p.s. didn't we talk about this on cocoon-dev a few years ago? :)

I've talked about this on cocoon-dev, but I believe you may have helped
with the grouping problem on the xslt list when I first started
exploring this problem?

> 
> 
> > So, I need to track
> > the metadata as well. This is conceptually easy, just select the 
> > children of the current element and recursively pass it on to the 
> > template.  However, practically, this is expensive in XSLT 
> 1.0 since 
> > to subsequently use a parameter created in this manner it 
> appears you 
> > have to use the nodeset function.  Eg:
> > 
> > 	<xsl:apply-templates select="*" mode="menu">
> >       	<xsl:with-param name="data" 
> select="exslt:nodeset($data)/*"/>
> > 	</xsl:apply-templates>
> > 
> > Now, in this particular case we are using Saxon, so this raises two 
> > specific questions: 1) if I just declare the transform to 
> be XSLT 1.1 
> > and skip using the nodeset function will I see any 
> performance gains?
> > 2) Would using Saxon 7 and XSLT 2 have any performance gains?
> > 
> > A more general question is whether anyone sees any other 
> strategy for 
> > managing the above, or whether anyone has any opinions on which 
> > approach should work better?  I can tell you that 
> experiments with the 
> > second approach traversing the metadata and passing the data as a 
> > parameter seem to exhaust memory for large data trees 
> though it works 
> > fine for smaller trees. I'm about to see if I can do the 
> inverse from 
> > the above
> > example: parse the data and pass the metadata as a 
> parameter, but this
> > is a non-trivial transform (lots of metadata affecting many 
> things) so I
> > thought I'd check first if anyone has a definite opinion 
> about which way
> > to proceed.
> > 
> > Peter Hunsberger
> 
> 
> 





 

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS