* David Megginson <david.megginson@gmail.com> [2004-12-31 13:15]:
> On Fri, 31 Dec 2004 11:57:44 -0500, Alan Gutierrez
> <alan-xml-dev@engrm.com> wrote:
>
> > It tells me that SAX, as an API, is missing support for parallel
> > processing. Parallel processing is possible in SAX, or rather,
> > it is possible with ContentHandler implementations that were
> > co-operative and synchronized, but then you're forgetting an
> > interface to specify ContentHandlers, and a bunch of other whatnot.
>
> I think that there's been a terrible amount of confusion on this thread.
>
> As soon as the filter pipeline design pattern started to become
> popular for applications based on SAX1, I assumed that people would
> insert tee-joints in the pipeline to allow for parallel processing
> when they needed it, especially since thread management in Java is so
> easy. Note that we're talking about parallel threads performing
> different operations on the *same* sequential event stream (i.e. one
> thread might be populating a database, while another is producing an
> HTML page).
>
> I can also conceive of applications where different threads deal
> with different parts of the document, as long as the source event
> stream stays single-thread and sequential -- for example, a filter
> might divert a series of events representing a document subtree to
> a separate thread that builds a data structure and performs
> time-consuming operations while the rest of the event stream
> continues on to the original thread (which was briefly suspended
> waiting for more input).
>
> So the source of the SAX event stream has to be sequential, but
> there's no reason that the rest of the filter pipeline cannot be
> parallelized.

In my SAX Strategy library, the direction has gone toward making
the Strategies stateless, and placing information in a stack
that is maintained by a SAX ContentHandler/LexicalHandler. This
lends itself to parallel processing, and I figured it was only a
matter of time before I was performing multiplexing, updating an
Oracle database in one thread and an LDAP directory in another.
I've yet to sort out how to bring the stream of events back together.

From the viewpoint of SAX, I see XML less as a document, more as
a series of events, so I've chosen to, more or less, dispose of
the startDocument and endDocument events in my library. They
are there, but they are second-class citizens, and logic
generally begins with the first element event.

Indeed, I do parallel processing of the sort you're describing. I've
often multiplexed a SAX event stream to do two things at once,
but within the same thread.
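
To make the same-thread multiplexing concrete, here is a rough sketch
of a tee-joint handler. It is not code from my library: the names
TeeHandler, ElementCounter, TextCollector and TeeDemo are made up for
illustration, and only the org.xml.sax and javax.xml.parsers calls are
standard. Each event is forwarded to every branch in turn, on the
parser's own thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo driver: one parse feeds two independent consumers.
class TeeDemo {
    static String run(String xml) throws Exception {
        ElementCounter counter = new ElementCounter();
        TextCollector collector = new TextCollector();
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(
            new TeeHandler(List.<ContentHandler>of(counter, collector)));
        reader.parse(new InputSource(new StringReader(xml)));
        return counter.count + ":" + collector.text;
    }

    public static void main(String[] args) throws Exception {
        // prints "3:onetwo"
        System.out.println(run("<doc><a>one</a><b>two</b></doc>"));
    }
}

// Hypothetical tee-joint: forwards each event to every branch, in order,
// on the calling thread. Only the events used in the demo are forwarded.
class TeeHandler extends DefaultHandler {
    private final List<ContentHandler> branches;

    TeeHandler(List<ContentHandler> branches) {
        this.branches = new ArrayList<>(branches);
    }

    @Override
    public void startDocument() throws SAXException {
        for (ContentHandler h : branches) h.startDocument();
    }

    @Override
    public void endDocument() throws SAXException {
        for (ContentHandler h : branches) h.endDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        for (ContentHandler h : branches) h.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        for (ContentHandler h : branches) h.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        for (ContentHandler h : branches) h.characters(ch, start, length);
    }
}

// Branch 1: counts element start events.
class ElementCounter extends DefaultHandler {
    int count;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        count++;
    }
}

// Branch 2: accumulates character data.
class TextCollector extends DefaultHandler {
    final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }
}
```

Because every branch runs before the parser delivers the next event, no
synchronization is needed. The cost is that a slow branch stalls the
others, which is what moving a branch to its own thread would address.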

For some reason, perhaps misguided, I see a lot of potential for
SAX and streaming XML.
--
Alan Gutierrez - alan@engrm.com