   Fwd: RE: [xml-dev] Push and Pull?


1/25/02 3:26:35 PM, "Sterin, Ilya" <Isterin@ciber.com> wrote:

>I never said it was, I was referring to the underlying parser for DOM.
>In my understanding...
>
>push - an implementation where the parser alerts the program of different
>tokens, in our case different XML nodes (tags, data, etc...).
>
>pull - the parser collects the data to later create a data structure
>representation of it for the program.
>
>
>That's what I've got as an understanding of it.  If I am wrong, it is not
>due to my lack of reading about the subject, but rather to the lack of a
>clear explanation on a programmer's level, not an English/Philosopher
>theory on the subject of a correct definition.

The way I'd use the terms:

"push" means that the flow of control (the "main loop") resides within the
parser.  "pull" means that the flow of control resides within the code that
calls the parser.  This is orthogonal to distinctions based on the *amount*
of information communicated by the parser to the calling code at any point.

To use a Perl example, "pull" corresponds to either the typical experienced
programmer's

while (<INPUT>) {
# do something with $_
}

(this corresponds to what I think the original poster was asking for;
XML::TokeParser fills this role)

or the novice's

@stuff=<INPUT>;
for (@stuff) {
# do something with $_
}

(which corresponds to building a DOM and iterating over it)

"push" would correspond to the paradigm, relatively little used in Perl, of
supplying an input object with a reference to a sub to be called each time
a line from the input becomes readable.
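
The same contrast can be sketched outside Perl. What follows is a minimal
Python illustration, not code from any library: `PushLineReader`,
`collect_push` and `collect_pull` are hypothetical names made up for this
example. In the push version the reader owns the loop and calls back; in
the pull version the calling code owns the loop.

```python
import io


class PushLineReader:
    """Hypothetical reader that owns the main loop (push style)."""

    def __init__(self, stream, on_line):
        self.stream = stream
        self.on_line = on_line  # callback supplied by the calling code

    def run(self):
        # The flow of control lives here, inside the reader.
        for line in self.stream:
            self.on_line(line.rstrip("\n"))


def collect_push(text):
    """Push: hand the reader a callback and let it drive."""
    received = []
    PushLineReader(io.StringIO(text), received.append).run()
    return received


def collect_pull(text):
    """Pull: the calling code drives and asks for each line itself."""
    stream = io.StringIO(text)
    received = []
    for line in stream:  # the flow of control lives in the caller
        received.append(line.rstrip("\n"))
    return received
```

Both functions see the same lines; the only difference is which side of the
interface contains the loop.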

Note that in the "pull" mode corresponding to my first example, it's
possible for the code to keep state by flow of control.  For example, if
one encounters a certain element, one might start a nested loop that deals
with its sub-elements.  In "push" mode, it's necessary to set flags or
state variables and dispatch on them, since the same callback gets invoked
for each element (either that, or swap callbacks, usually using an explicit
stack).
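
To make the difference concrete, here is a hedged sketch using two modules
from the Python standard library: `xml.dom.pulldom` for the pull side and
`xml.sax` for the push side.  The task (collect the text of every `name`
inside an `item`, ignoring `name` elements elsewhere), the sample document
and the helper names are all assumptions made up for illustration.

```python
import xml.sax
from xml.dom import pulldom
from xml.dom.pulldom import CHARACTERS, END_ELEMENT, START_ELEMENT

SAMPLE = ("<doc><name>skip</name>"
          "<item><name>A</name></item>"
          "<item><name>B</name></item></doc>")


def read_text(events, tag):
    """Consume events up to the end of `tag`, returning its text."""
    parts = []
    for event, node in events:
        if event == END_ELEMENT and node.tagName == tag:
            break
        if event == CHARACTERS:
            parts.append(node.data)
    return "".join(parts)


def names_pull(xml_text):
    """Pull: being 'inside an item' is simply where the code is running."""
    events = pulldom.parseString(xml_text)
    names = []
    for event, node in events:
        if event == START_ELEMENT and node.tagName == "item":
            # Nested loop over the same event stream handles sub-elements.
            for sub_event, sub_node in events:
                if sub_event == END_ELEMENT and sub_node.tagName == "item":
                    break
                if sub_event == START_ELEMENT and sub_node.tagName == "name":
                    names.append(read_text(events, "name"))
    return names


class NameHandler(xml.sax.ContentHandler):
    """Push: the same callbacks fire everywhere, so flags carry the state."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.in_name = False
        self.parts = []
        self.names = []

    def startElement(self, name, attrs):
        if name == "item":
            self.in_item = True
        elif name == "name" and self.in_item:
            self.in_name = True
            self.parts = []

    def characters(self, content):
        if self.in_name:
            self.parts.append(content)

    def endElement(self, name):
        if name == "name" and self.in_name:
            self.names.append("".join(self.parts))
            self.in_name = False
        elif name == "item":
            self.in_item = False


def names_push(xml_text):
    handler = NameHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.names
```

In `names_pull` the nesting of the loops mirrors the nesting of the
document, while `NameHandler` has to remember where it is with `in_item`
and `in_name` because each callback returns to the parser before the next
event arrives.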

It is my belief that most programmers find the "pull" style easier to use,
since it corresponds to the taught-early-on notion of an input loop.  Using
the "push" style is more analogous to writing a GUI application, which most
programmers find hard to do at first (and which the majority of programmers
do by using Visual Basic or the Microsoft Foundation Classes; programmers
writing GUI applications for non-Microsoft OSs are a numerical minority).
And even GUI programmers don't tend to think of *reading files* in "push"
terms.

So I'd classify parsers into pull vs. push on one dimension (where the flow
of control resides) and event-based vs. tree-based on another, orthogonal,
dimension (how much information is communicated at a time).  It's certainly
possible, BTW, to have a parser that is both tree-based and push; think of
XML::Twig.
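
As a rough illustration of the "tree-based and push" corner, the sketch
below builds a small tree for each matching element and pushes the finished
subtree to a callback.  It is written against Python's `xml.sax` and
`xml.etree.ElementTree` and does not reproduce XML::Twig's API;
`SubtreePusher` and `item_names` are hypothetical names invented for this
example.

```python
import xml.etree.ElementTree as ET
import xml.sax


class SubtreePusher(xml.sax.ContentHandler):
    """Hypothetical handler: one complete subtree per callback invocation."""

    def __init__(self, tag, on_subtree):
        super().__init__()
        self.tag = tag
        self.on_subtree = on_subtree  # called with an ElementTree element
        self.builder = None           # set only while inside a match
        self.depth = 0                # open elements in the current subtree

    def startElement(self, name, attrs):
        if self.builder is None and name == self.tag:
            self.builder = ET.TreeBuilder()
        if self.builder is not None:
            self.builder.start(name, dict(attrs))
            self.depth += 1

    def characters(self, content):
        if self.builder is not None:
            self.builder.data(content)

    def endElement(self, name):
        if self.builder is not None:
            self.builder.end(name)
            self.depth -= 1
            if self.depth == 0:
                # Subtree complete: the parser pushes it to the caller.
                subtree = self.builder.close()
                self.builder = None
                self.on_subtree(subtree)


def item_names(xml_text):
    """Collect the <name> text of each <item>, one subtree at a time."""
    found = []
    handler = SubtreePusher("item", lambda el: found.append(el.findtext("name")))
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return found
```

The parser still owns the main loop (push), but what it hands over at each
call is a whole subtree rather than a single start-tag or text event.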


-------- End of forwarded message --------







 


Copyright 2001 XML.org. This site is hosted by OASIS