OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

 


 

   Re: [Fwd: Walking the DOM (was: XML APIs)]

[ Lists Home | Date Index | Thread Index ]
  • From: Jean-Claude Wippler <jcw@equi4.com>
  • To: John Cowan <cowan@locke.ccil.org>
  • Date: Tue, 03 Nov 1998 22:39:34 +0100

John Cowan wrote:

> Stephen R. Savitzky wrote:
> 
> > [T]he classic algorithm for traversing a tree is:
> >
> > traverse(node) {
> >   visit(node);
> >   if (node.firstChild != null) traverse(node.firstChild);
> >   if (node.nextSibling != null) traverse(node.nextSibling);
> > }
> 
> The trouble with that algorithm is that it is recursive.  It will
> blow up if the tree is sufficiently deep.  Indeed, in
> languages that cannot be relied on to do tail recursion, like
> Java, it will blow up if the tree is merely sufficiently wide.
> 
> Furthermore, if there is any end-of-node processing to do, such as
> emitting an end tag indication, then the algorithm is no longer
> even partly tail recursive and will blow up on both depth and
> width even in safe-tail-recursion languages.
> 
> The algorithm I use in DOMParser, therefore, is non-recursive:
[...]

The way I load an XML document into MetaKit, it uses an explicit stack
with exactly one "int" per level.  I think you'll agree that this amount
of "stack" use makes the approach suitable for any document (once I add
some tests - this is just an experiment for now).  Source code is at:
	http://www.equi4.com/metakit/xml/mk4xml.cpp

After that, you end up with an on-demand loaded document, which is
indexable so there is no scanning at all when accessing this data. 
Every child node is in an indexable "subview".  And when you *do* need
traversal, you can again use the same one-int-per-level stack approach.

This works equally well in the case of end-node processing, BTW.

> Because of the reliability of this algorithm vis-a-vis the recursive
> one, I believe it should be the standard way of walking DOM trees,
> and therefore it is essential that DOM implementations make the
> structural access methods fast.

By reliability, do you mean "not blowing up its stack"?

As you can see, there are more ways than one to skin this cat.  It seems
to me that standardizing in the way you propose will prevent the use of
other techniques - such as storing XML as a MetaKit datafile and using
explicit recursion.

-- Jean-Claude

________________________________________________________________________
Jean-Claude Wippler    MetaKit home page - http://www.equi4.com/metakit/
Equi4 Software         "Portable database software for a changing world"

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS