xml-dev - Re: [xml-dev] DESIGN PROPOSAL: Java XMLIterator

To: xml-dev <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] DESIGN PROPOSAL: Java XMLIterator
From: "Clark C . Evans" <cce@clarkevans.com>
Date: Tue, 18 Dec 2001 10:06:43 -0500
In-reply-to: <001701c18785$9ea07300$0e00a8c0@bkk.thaiopensource.com>; from jjc@jclark.com on Tue, Dec 18, 2001 at 12:34:10PM +0700
References: <3C1E6D9E.3070401@reutershealth.com> <001701c18785$9ea07300$0e00a8c0@bkk.thaiopensource.com>
User-agent: Mutt/1.2.5i

On Tue, Dec 18, 2001 at 12:34:10PM +0700, James Clark wrote:
| Perhaps the most fundamental decision in designing a pull API is
| whether the properties for each node are provided
| 
| (a) by methods on some sort of node object returned by the
| scanner/parser/iterator object
| 
| (b) by methods on the scanner/parser object itself; the scanner/parser
| object has methods to move to the next node

If one goes with (a), there seem to be two other choices:
  
  (1)  A single hierarchical iterator provided by
       the scanner/parser, where BEGIN/END tags  
       are presented for each node.

  (2)  A hierarchy of flat iterators, with "nodes" that
       have a "children" method that returns the 
       subordinate iterator.

It seems that both you and John Cowan have chosen (1)
and I was wondering if either of you had tried (2).  
I've tried both and find that (2) is much easier 
to code with.  
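
To make (2) concrete, here is a minimal sketch. The names
PullNode, ListNode and Outline are hypothetical (they are not
from either proposal in this thread), and the in-memory
ListNode only stands in for a node backed by a real parser.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical node type for choice (2); not taken from any proposal in this thread.
interface PullNode {
    String name();
    Iterator<PullNode> children();   // flat iterator over the direct children only
}

// In-memory stand-in for a parser-backed node, used only for the demonstration.
final class ListNode implements PullNode {
    private final String name;
    private final List<PullNode> kids = new ArrayList<>();

    ListNode(String name, PullNode... kids) {
        this.name = name;
        for (PullNode k : kids) this.kids.add(k);
    }

    public String name() { return name; }
    public Iterator<PullNode> children() { return kids.iterator(); }
}

final class Outline {
    // The recursion follows the document structure, so the caller
    // never has to match BEGIN tags against END tags.
    static void render(PullNode node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(node.name()).append('\n');
        Iterator<PullNode> it = node.children();
        while (it.hasNext()) render(it.next(), depth + 1, out);
    }

    static String render(PullNode root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        PullNode doc = new ListNode("order",
            new ListNode("customer"),
            new ListNode("items", new ListNode("item"), new ListNode("item")));
        System.out.print(render(doc));
    }
}
```

Each call to render() sees exactly one node and one flat
iterator, which is most of why I find this style easier to
code with.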

The primary argument against (2) is that the children()
method can only be called once if the iterator is over 
a sequential access medium.  

The primary advantage of (2) is that it works with nodes
rather than tags.  A nested iterator also usually matches
the shape of the processing, and it helps prevent bugs,
since a sub-function cannot accidentally "consume" events
that it shouldn't be processing.  This is helpful when
composing a processor pipeline from multiple vendors.
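
The following sketch illustrates the "consume" problem under
some assumptions of mine: ScopeNode and ScopeDemo are made-up
names, the events are plain strings, and greedyHandler stands
in for a careless third-party component that reads until its
input runs out.  The document is <a><x/></a><b/>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical node for style (2); illustrative only.
final class ScopeNode {
    final String name;
    final List<ScopeNode> kids;

    ScopeNode(String name, ScopeNode... kids) {
        this.name = name;
        this.kids = Arrays.asList(kids);
    }

    Iterator<ScopeNode> children() { return kids.iterator(); }
}

final class ScopeDemo {
    // Stand-in for a careless handler: it reads until its input runs out.
    static void greedyHandler(Iterator<?> input) {
        while (input.hasNext()) input.next();
    }

    // Style (1): parent and handler share one stream of tag events.
    static List<String> seenByParentFlat() {
        Iterator<String> events = Arrays.asList(
            "BEGIN a", "BEGIN x", "END x", "END a", "BEGIN b", "END b").iterator();
        List<String> seen = new ArrayList<>();
        while (events.hasNext()) {
            String e = events.next();
            seen.add(e);
            if (e.equals("BEGIN a")) greedyHandler(events); // swallows b as well
        }
        return seen;
    }

    // Style (2): the handler only receives the iterator over a's children.
    static List<String> seenByParentNested() {
        Iterator<ScopeNode> top = Arrays.asList(
            new ScopeNode("a", new ScopeNode("x")), new ScopeNode("b")).iterator();
        List<String> seen = new ArrayList<>();
        while (top.hasNext()) {
            ScopeNode n = top.next();
            seen.add(n.name);
            if (n.name.equals("a")) greedyHandler(n.children()); // cannot reach b
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenByParentFlat());
        System.out.println(seenByParentNested());
    }
}
```

In the flat case the parent never gets to see <b>; in the
nested case it does.  Over a truly sequential source the child
iterator has to enforce the END boundary itself, but that
logic then lives in one place instead of in every handler.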

I've also found it helps to have an "access" method which
returns "random", "sequential" or "notaccessible":
"random" when the node is backed by a random access medium,
"sequential" when it is on a sequential access medium and
has not yet been read, and "notaccessible" when it is on a
sequential access medium and has already been read.  This
interface can be improved further with a "makeRandom" method,
which loads the entire sub-tree into memory for random access.

Thus, (2) ends up being a hybrid DOM where the type of
access given to each node is made explicit.  
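
A rough sketch of such a node follows.  Everything here is an
assumption for illustration: the StreamNode class, the Access
enum (standing in for the three string values above), and the
use of a List iterator as a substitute for a one-pass parser
stream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// The three states described above, as a hypothetical enum.
enum Access { RANDOM, SEQUENTIAL, NOT_ACCESSIBLE }

// Hypothetical node over a one-pass source; names are illustrative only.
final class StreamNode {
    private final String name;
    private Iterator<StreamNode> pending;   // one-shot view of the children
    private List<StreamNode> cached;        // filled in by makeRandom()

    StreamNode(String name, List<StreamNode> source) {
        this.name = name;
        this.pending = source.iterator();   // stands in for the parser's stream
    }

    String name() { return name; }

    Access access() {
        if (cached != null) return Access.RANDOM;
        return pending != null ? Access.SEQUENTIAL : Access.NOT_ACCESSIBLE;
    }

    // A sequential node hands out its children exactly once.
    Iterator<StreamNode> children() {
        if (cached != null) return cached.iterator();
        if (pending == null)
            throw new IllegalStateException(name + ": children already read");
        Iterator<StreamNode> it = pending;
        pending = null;
        return it;
    }

    // Loads the whole unread sub-tree into memory so it can be revisited.
    void makeRandom() {
        if (cached != null) return;
        if (pending == null)
            throw new IllegalStateException(name + ": children already read");
        List<StreamNode> loaded = new ArrayList<>();
        while (pending.hasNext()) {
            StreamNode child = pending.next();
            child.makeRandom();
            loaded.add(child);
        }
        pending = null;
        cached = loaded;
    }
}
```

A caller that needs two passes over a sub-tree checks access()
and calls makeRandom() first; a caller that needs only one
pass pays nothing extra.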

Best,

Clark

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software

References:
- DESIGN PROPOSAL: Java XMLIterator
  - From: John Cowan <jcowan@reutershealth.com>
- Re: [xml-dev] DESIGN PROPOSAL: Java XMLIterator
  - From: "James Clark" <jjc@jclark.com>
