1/25/02 3:26:35 PM, "Sterin, Ilya" <Isterin@ciber.com> wrote:
>I never said it was, I was referring to the underlying parser for DOM. In my
>understanding...
>
>push - an implementation where the parser alerts the program of different
>tokens, in our case different XML nodes (tags, data, etc...).
>
>pull - the parser collects the data to later create a data structure
>representation of it for the program.
>
>
>That's what I've got as an understanding of it. If I am wrong, it is not due
>to my lack of reading about the subject, but rather to the lack of a clear
>explanation on a programmer's level, not an English/Philosopher theory on
>the subject of a correct definition.

The way I'd use the terms:

"push" means that the flow of control (the "main loop") resides within the
parser. "pull" means that the flow of control resides within the code that
calls the parser. This is orthogonal to distinctions based on the *amount*
of information communicated by the parser to the calling code at any point.

To use a Perl example, "pull" corresponds to either the typical experienced
programmer's

    while (<INPUT>) {
        # do something with $_
    }

(this corresponds to what I think the original poster was asking for;
XML::TokeParser fills this role)

or the novice's

    @stuff = <INPUT>;
    for (@stuff) {
        # do something with $_
    }

(which corresponds to building a DOM and iterating over it).

"push" would correspond to the paradigm, relatively little-used in Perl, of
supplying an input object with a reference to a sub to be called each time a
line from the input becomes readable.

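For comparison, here is a minimal sketch of that "push" arrangement next to
the "pull" loop, written in Python so that it is self-contained. The
LineSource class and its on_line/run methods are hypothetical names made up
for this illustration, not any real library's API; the only point is to show
which side owns the main loop.

```python
import io


def pull_lines(stream):
    """Pull: the caller owns the loop and asks for each line in turn."""
    seen = []
    for line in stream:  # the caller's loop, like Perl's while (<INPUT>)
        seen.append(line.rstrip("\n"))
    return seen


class LineSource:
    """Push: a hypothetical input object that owns the loop and calls back."""

    def __init__(self, stream):
        self.stream = stream
        self.callbacks = []

    def on_line(self, callback):
        # Register a function to be called once per line read.
        self.callbacks.append(callback)

    def run(self):
        # The main loop lives here, inside the input object.
        for line in self.stream:
            for callback in self.callbacks:
                callback(line.rstrip("\n"))


def push_lines(stream):
    """Drive LineSource with a callback that just records each line."""
    seen = []
    source = LineSource(stream)
    source.on_line(seen.append)
    source.run()
    return seen
```

Both functions see the same lines in the same order; the difference is only
whether the "for" loop sits in the calling code or inside the input object.
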
Note that in the "pull" mode corresponding to my first example, the code can
keep state in its flow of control. For example, on encountering a certain
element, one might start a nested loop that deals with its sub-elements. In
"push" mode, it's necessary to set flags or state variables and dispatch on
them, since the same callback gets invoked for each element (either that, or
swap callbacks, usually using an explicit stack).

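A sketch of that difference, using two parsers from the Python standard
library as stand-ins: xml.dom.pulldom for the "pull" style and xml.sax for
the "push" style. The sample document, the element names and the function
names are assumptions invented for the illustration, and the pull version
assumes a title contains only text. Both versions collect the titles that
occur inside a book element and ignore a stray title outside one.

```python
import xml.sax
from xml.dom import pulldom

# Made-up sample input for the illustration.
DOC = ("<lib><book><title>A</title></book>"
       "<title>skip</title>"
       "<book><title>B</title></book></lib>")


def titles_pull(xml_text):
    """Pull: being 'inside a book' is expressed by a nested loop."""
    titles = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            for event, node in events:  # loop over the book's content
                if event == pulldom.END_ELEMENT and node.tagName == "book":
                    break
                if event == pulldom.START_ELEMENT and node.tagName == "title":
                    text = []
                    for event, node in events:  # loop over the title's text
                        if event == pulldom.END_ELEMENT:
                            break
                        if event == pulldom.CHARACTERS:
                            text.append(node.data)
                    titles.append("".join(text))
    return titles


class TitleHandler(xml.sax.ContentHandler):
    """Push: the same callbacks fire everywhere, so flags track position."""

    def __init__(self):
        super().__init__()
        self.in_book = False
        self.in_title = False
        self.text = []
        self.titles = []

    def startElement(self, name, attrs):
        if name == "book":
            self.in_book = True
        elif name == "title" and self.in_book:
            self.in_title = True
            self.text = []

    def characters(self, content):
        if self.in_title:
            self.text.append(content)

    def endElement(self, name):
        if name == "title" and self.in_title:
            self.titles.append("".join(self.text))
            self.in_title = False
        elif name == "book":
            self.in_book = False


def titles_push(xml_text):
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles
```

The pull version needs no in_book or in_title variables: where the program
counter is says which element is being read. The push version has to
reconstruct that position from flags on every callback.
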
It is my belief that most programmers find the "pull" style easier to use,
since it corresponds to the taught-early-on notion of an input loop. Using
the "push" style is more analogous to writing a GUI application, which most
programmers find hard to do at first (and which the majority of programmers
do by using Visual Basic or the Microsoft Foundation Classes; programmers
writing GUI applications for non-Microsoft OSs are a numerical minority).
And even GUI programmers don't tend to think of *reading files* in "push"
terms.

So I'd classify parsers into pull vs. push on one dimension (where the flow
of control resides) and event-based vs. tree-based on another, orthogonal,
dimension (how much information is communicated at a time). It's certainly
possible, BTW, to have a parser that is both tree-based and push; think of
XML::Twig.
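
To make the "tree-based and push" corner concrete, here is a rough sketch in
Python. It is only an analogy to the idea, not XML::Twig's API: the function
parse_with_handlers and its handlers argument are hypothetical, built on the
standard library's xml.etree.ElementTree.iterparse. The loop lives inside the
function (push), and each callback receives a complete subtree rather than a
single token (tree-based).

```python
import io
import xml.etree.ElementTree as ET


def parse_with_handlers(xml_text, handlers):
    """Hypothetical push-style driver: the main loop is here, not in the caller.

    handlers maps an element name to a function that is called with the
    fully built element (a small tree) once its end tag has been read.
    """
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        handler = handlers.get(elem.tag)
        if handler is not None:
            handler(elem)  # the callback gets a whole subtree
            elem.clear()   # discard the subtree once it has been handled


# Usage: collect (id, title) pairs from a made-up document.
found = []
parse_with_handlers(
    "<lib><book id='1'><title>A</title></book>"
    "<book id='2'><title>B</title></book></lib>",
    {"book": lambda book: found.append((book.get("id"), book.findtext("title")))},
)
```

The calling code never writes a loop over the input, yet inside the callback
it can navigate the book element as a tree.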