> Daniel Veillard writes:
> > Personally I think SAX makes a terribly bad API for programmers,
> > forcing them to work with callbacks, i.e. losing the thread of
> > control during parsing, and making it hard to actually gather data
> > from just an event stream. SAX is too low level for most uses,
> > though it certainly has its use as a lower API to implement the
> > lower level services. But those layers are not where the typical
> > programmer should meet the XML data.
> I agree, for the most part. SAX is a terribly difficult way for
> programmers to deal with XML, especially if they are already
> struggling just to get their minds around XML itself.
Depends on the programmer. Also, many RAD environments are
based on event processing, i.e. the programmer writes event handlers.
That is exactly the same as SAX. For instance, SAX has been ported
to Delphi, and for the point and click programmer's convenience,
drag and drop SAX components are provided. That makes SAX processing
quite "normal" for most Delphi programmers.
> The problem is that they often hit a point where they have no choice
> -- what works fine in the small, proof-of-concept implementation
> suddenly falls to pieces under real-world-scale load testing in a busy
> server. That's why we see so many programmers posting to xml-dev
> about SAX mid- to late-project, rather than during planning --
> they've come to the point where learning SAX is less painful than
> spending another month of late nights and weekends trying to figure
> out some other way to make their server handle more than a couple of
> hits per second.
> In summary, then, SAX is the root canal of XML programming -- you try
> less painful solutions first, but if nothing works, it's time to lie
> back in the chair, close your eyes, and open wide, to avoid losing
> your teeth altogether.
It's not so bad, really.
In my own projects I have taken the approach of transforming the
"flat" event stream provided by SAX into an event stream fired by
a sort of schema object tree (based on the DTD), where each type
of element has its own callback. That goes some way toward helping
me manage callback context more easily.
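
A rough sketch of the per-element callback idea, in Python with the
standard xml.sax module. This is not the schema object tree described
above: a plain dictionary stands in for it, and the names
ElementDispatcher and collect_titles, as well as the "title" element
and "lang" attribute, are made up for illustration.

```python
import xml.sax


class ElementDispatcher(xml.sax.ContentHandler):
    """Route flat SAX events to one callback per element type (hypothetical helper)."""

    def __init__(self, handlers):
        super().__init__()
        self.handlers = handlers  # element name -> callable(attrs, text)
        self.stack = []           # one (name, attrs, text chunks) entry per open element

    def startElement(self, name, attrs):
        # Copy the attributes; the parser may reuse the attrs object.
        self.stack.append((name, dict(attrs.items()), []))

    def characters(self, content):
        # Text is credited to the innermost open element only.
        if self.stack:
            self.stack[-1][2].append(content)

    def endElement(self, name):
        _, attrs, chunks = self.stack.pop()
        callback = self.handlers.get(name)
        if callback is not None:
            callback(attrs, "".join(chunks))


def collect_titles(xml_text):
    """Example use: gather (lang, text) for every <title> element."""
    titles = []
    handler = ElementDispatcher(
        {"title": lambda attrs, text: titles.append((attrs.get("lang"), text))}
    )
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return titles
```

Each callback sees only its own element's attributes and direct text,
so it does not have to track where in the document the parser is.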
> That aside, the other strength of SAX comes when you're going to be
> building an in-memory tree anyway, but it does not happen to be a DOM
> tree. Using the low-level events from SAX to build (say) a tree of
> geographical coordinates is much more efficient than building a DOM
> tree, then building a second tree of geographical coordinates from
> that, then garbage-collecting the DOM tree. In that case, most
> programmers in the project will never see SAX -- the load/save library
> should hide it completely.
In my personal experience that happens quite often.
The app's object model is not always the same as the DOM.
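
As an illustration of building the application's own objects straight
from SAX events, with no DOM in between, here is a minimal Python
sketch. The document layout (region and point elements with name, lat
and lon attributes) and the names Region, RegionBuilder and
load_regions are assumptions invented for the example.

```python
import xml.sax
from dataclasses import dataclass, field


@dataclass
class Region:
    """Application-side object; not a DOM node."""
    name: str
    points: list = field(default_factory=list)  # (lat, lon) tuples


class RegionBuilder(xml.sax.ContentHandler):
    """Build Region objects directly from SAX events."""

    def __init__(self):
        super().__init__()
        self.regions = []

    def startElement(self, name, attrs):
        if name == "region":
            self.regions.append(Region(attrs.get("name", "")))
        elif name == "point" and self.regions:
            # Attach the coordinate to the most recently opened region.
            self.regions[-1].points.append(
                (float(attrs["lat"]), float(attrs["lon"]))
            )


def load_regions(xml_text):
    """The only function the rest of the application needs to see."""
    builder = RegionBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), builder)
    return builder.regions
```

Callers of load_regions never touch SAX, which is the "load/save
library should hide it" arrangement from the quoted paragraph.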
As a middle-of-the-road approach it might be nice to have
a PULL API where one has the choice, as one encounters
the nodes of the document, to "keep" the ones that
one would like to use later; in other words, building a DOM
tree as a side effect of pull parsing.
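
For what it is worth, Python's standard xml.dom.pulldom module works
roughly along these lines, and can serve as a sketch of the idea (it
is offered only as one example of such an API). The function name
keep_matching and the element names in the usage note are
hypothetical.

```python
from xml.dom import pulldom


def keep_matching(xml_text, wanted):
    """Pull events one by one; expand only elements named `wanted` into DOM subtrees."""
    kept = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == wanted:
            # Consume events up to the matching end tag and attach the
            # children to this node, giving a small DOM subtree.
            events.expandNode(node)
            kept.append(node)
        # Every other node is simply skipped and never expanded into a tree.
    return kept
```

For example, keep_matching(text, "item") returns DOM elements for the
item subtrees only; the rest of the document is read as events and
discarded.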