
   RE: [xml-dev] SAX and Pull options: was: Penance for misspent attributes


 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> -----Original Message-----
> From: Dennis Sosnoski [mailto:dms@sosnoski.com] 
>
> There are a couple of points I'll comment on in this. The first is
> that SAX doesn't really function as an event based architecture
> component because it's relying on the application to give it control
> in the first place - the application thread is what executes all the
> parsing, as well as the call-backs to the handler. This is implicit
> in the SAX specification since it does not address any
> synchronization issues that would be needed if different threads
> could be used.

That's not necessarily the case. SAX handlers can be deployed into a
system and then dispatched to; who invokes the parser is moot.

 
> The second is that the servlet architecture that forms the basis of
> most application servers is not really extensible to non-blocking IO.
> The servlet model ties up a thread until all processing of a request
> is completed, so you may as well have the thread just wait for input
> if needed.

Having a synchronous application layer is not a good reason to stay
with a synchronous server layer. 

Thread per request is part and parcel of the Servlets specification
(and thus JSP also). One workaround is to fire events straight
through to a proxy which handles the associated I/O buffering and
servlet invocation (you can think of the proxy as marking a process
boundary). One would hope to see a standard that invalidated
thread-per-request programming for machine-to-machine work,
particularly for intermediary data processing and rewriting. XML
Pipeline is an option, as is ... SAX.
  

> >when I could have had a runtime binding based on the types of the
> >visitor and visitee and internal iteration (presumably the parser
> >is best placed to know the token type) via double dispatch.
> >
> I think this kind of misses the point of using a pull parser. 

Then perhaps I'm not seeing the sweet spot for a pull parser API. I
think it's somewhere between SAX and DOM, but I wonder how difficult
it really is to manage state cleanly by building on top of a
well-known API, versus learning a new API that may have maintenance
issues down the line. All said, I don't see the simplicity win in
XPP.


> Using some simple utility methods I can parse the data content of
> the document very easily with direct inline code, rather than having
> to use a state machine. I think this is a much more natural style of
> programming for most developers - a top-down structure in the code
> that reflects the structure of the document.

I do think it's natural; procedural programming often is. It's the
maintenance and life of the code that bothers me. A standard that
legitimizes switch blocks over polymorphism where polymorphism is
available is open to question. Once those typecodes are
standardized, the only way for me to refactor them and get the code
under control is to add /another/ layer that encapsulates the
typecodes behind objects or, as a first cut, typesafe enumerations.
As I said, the typecodes are an implementation detail spilling out
into the API; upgrading them at least to tokens will save people
some hassle later on.
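
To make the "another layer" point concrete, here is a rough sketch of
typecodes hidden behind a typesafe enumeration that dispatches to a
visitor. Every name in it (TypeCodes, Token, TokenVisitor,
RecordingVisitor) is hypothetical, and the integer values are
stand-ins, not the constants of XMLPULL or any other real parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical integer typecodes, standing in for what a pull API might expose. */
final class TypeCodes {
    static final int START_TAG = 1;
    static final int END_TAG = 2;
    static final int TEXT = 3;
    private TypeCodes() {}
}

/** Visitor with one method per token type, so callers need no switch block. */
interface TokenVisitor {
    void startTag(String name);
    void endTag(String name);
    void text(String content);
}

/** Typesafe enumeration that wraps the raw codes and dispatches polymorphically. */
enum Token {
    START_TAG(TypeCodes.START_TAG) {
        @Override void accept(TokenVisitor v, String value) { v.startTag(value); }
    },
    END_TAG(TypeCodes.END_TAG) {
        @Override void accept(TokenVisitor v, String value) { v.endTag(value); }
    },
    TEXT(TypeCodes.TEXT) {
        @Override void accept(TokenVisitor v, String value) { v.text(value); }
    };

    private final int code;

    Token(int code) { this.code = code; }

    /** Each constant calls the visitor method that matches its own type. */
    abstract void accept(TokenVisitor v, String value);

    /** The single place that still knows about the integer codes. */
    static Token fromCode(int code) {
        for (Token t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("unknown typecode: " + code);
    }
}

/** Example visitor that simply records what it is shown. */
final class RecordingVisitor implements TokenVisitor {
    final List<String> seen = new ArrayList<>();
    @Override public void startTag(String name) { seen.add("<" + name + ">"); }
    @Override public void endTag(String name) { seen.add("</" + name + ">"); }
    @Override public void text(String content) { seen.add(content); }
}
```

Application code then writes Token.fromCode(code).accept(visitor,
value) and never branches on the integer itself; adding a token type
means adding a constant and a visitor method rather than hunting down
every switch.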


> I could wrap a pull parser in handlers to give the same effect as a
> SAX parser interface - in fact, Alek Slominski has actually
> implemented a prototype SAX2 push layer on top of a pull parser
> (http://www.extreme.indiana.edu/xgws/xsoap/xpp/). Trying to turn a
> push interface into a pull interface is much more difficult,
> basically requiring a separate thread and associated threading
> overhead.

Well, building async (SAX) atop sync (XMLPULL) is a bit odd. The
other way around can be dealt with by inserting a queuing layer; this
way the async and sync layers don't communicate directly, and
concerns remain separated and concurrent.
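
One possible reading of such a queuing layer, sketched with the
standard JAXP SAX parser and a java.util.concurrent queue. The class
QueuedEvents, its string event format and the END sentinel are all
hypothetical choices made for illustration. The parser still runs on
its own thread here, but the pulling side only ever sees the queue.

```java
import java.io.StringReader;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical bridge: SAX callbacks push into a queue, a consumer pulls from it. */
final class QueuedEvents extends DefaultHandler {
    /** Sentinel marking the end of the event stream (an assumption of this sketch). */
    static final String END = "END_DOCUMENT";

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Push side: the parser invokes these callbacks on the producer thread.
    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        queue.add("START " + qName);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        queue.add("END " + qName);
    }

    @Override
    public void endDocument() {
        queue.add(END);
    }

    /** Pull side: blocks until the next event is available. */
    String next() {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an event", e);
        }
    }

    /** Starts a SAX parse on a background thread and returns the pull-side view. */
    static QueuedEvents parseInBackground(String xml) {
        QueuedEvents events = new QueuedEvents();
        Thread producer = new Thread(() -> {
            try {
                SAXParserFactory.newInstance().newSAXParser()
                        .parse(new InputSource(new StringReader(xml)), events);
            } catch (Exception e) {
                // Sketch only: real code should surface the error to the consumer.
                events.queue.add(END);
            }
        });
        producer.setDaemon(true);
        producer.start();
        return events;
    }
}
```

The consumer loops on next() until it sees QueuedEvents.END, so its
code has the top-down shape of a pull client while the handler stays
an ordinary SAX ContentHandler. The queue here is unbounded; a bounded
one would add back-pressure on the parser.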

Bill de hÓra

-----BEGIN PGP SIGNATURE-----
Version: PGP 7.0.4

iQA/AwUBPOpTLeaWiFwg2CH4EQLTcgCfTWgp+ft6MzRGCVOhSY069a/4vlMAn0uR
xgtDm0ZiolYOKHfOkkMQEslx
=PJQi
-----END PGP SIGNATURE-----





 


Copyright 2001 XML.org. This site is hosted by OASIS