RE: [xml-dev] DOM or SAX: Sense and Sensibility

Just to include some data-points...

I think that in the open-source Java world, the focus has been more on
the infoset than on any given object model.  Since we have JDOM, dom4j,
EXML, along with normal DOM, and only certain utilities are supported
under certain models (i.e., Xalan won't work with dom4j Documents
directly), there's been a lot of work on translating one model to another.

Then, you have things like dom4j's ElementHandler interfaces, which
allow folks who are used to processing object trees to deal appropriately
with very large datasets.  You can register a handler to match particular
subtrees.  Do whatever processing you need (including XPath expressions),
and then detach the sub-tree, freeing up memory for the rest of the
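
A rough sketch of the subtree idea, using only JDK classes rather than
dom4j itself.  SubtreeHandler, SubtreeDemo and the "book"/"title" element
names are made up for illustration; this is not dom4j's ElementHandler
API, just the same pattern built on SAX: build a small tree per matching
element, process it, then drop the reference.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SubtreeDemo {
    // Collects the text of <title> from each <book> subtree, one subtree at a time.
    static List<String> titles(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        SubtreeHandler handler = new SubtreeHandler("book", book ->
            out.add(book.getElementsByTagName("title").item(0).getTextContent()));
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<lib><book><title>A</title></book>"
                   + "<book><title>B</title></book></lib>";
        System.out.println(titles(xml)); // prints [A, B]
    }
}

// Hypothetical helper (an assumption of this sketch, not a dom4j class).
class SubtreeHandler extends DefaultHandler {
    private final String recordName;
    private final Consumer<Element> callback;
    private final Document factory; // used only to create nodes
    private Node current;           // null while outside any record

    SubtreeHandler(String recordName, Consumer<Element> callback) throws Exception {
        this.recordName = recordName;
        this.callback = callback;
        this.factory = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (current == null && !qName.equals(recordName)) {
            return; // not inside a record, and this element does not start one
        }
        Element e = factory.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (current != null) {
            current.appendChild(e);
        }
        current = e;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.appendChild(factory.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (current == null) {
            return;
        }
        Node parent = current.getParentNode();
        if (parent == null) {
            // The record's root element just closed: hand over the subtree.
            callback.accept((Element) current);
        }
        // After a record closes this becomes null, so the subtree is no
        // longer referenced from here and can be garbage-collected.
        current = parent;
    }
}
```

Only one record's worth of tree is held at a time, so memory use should
track the size of the largest record rather than the whole document.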

Then, you have Jaxen (caveat: it's my project, so I'm biased), which
aimed to be object-model independent for XPath evaluations, and instead
works through a Navigator interface, which aims to provide homogeneous
access to InfoSet members regardless of the model being used.
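
To make the navigator idea concrete, here is a deliberately tiny sketch.
MiniNavigator, Tree and select() are invented for this example and are
not Jaxen's actual Navigator interface, and select() handles only plain
child steps such as "book/title", nowhere near full XPath.  The point is
just that the evaluator talks to an adapter, never to a concrete model.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NavigatorDemo {
    // Hypothetical, much-reduced navigator (an assumption of this sketch).
    interface MiniNavigator<N> {
        List<N> childElements(N node);
        String name(N node);
        String text(N node);
    }

    // Model-independent evaluation of a simple child path like "book/title".
    static <N> List<String> select(MiniNavigator<N> nav, N root, String path) {
        List<N> context = Collections.singletonList(root);
        for (String step : path.split("/")) {
            List<N> next = new ArrayList<>();
            for (N node : context) {
                for (N child : nav.childElements(node)) {
                    if (nav.name(child).equals(step)) {
                        next.add(child);
                    }
                }
            }
            context = next;
        }
        List<String> out = new ArrayList<>();
        for (N node : context) {
            out.add(nav.text(node));
        }
        return out;
    }

    // Adapter for the W3C DOM.
    static final MiniNavigator<Node> DOM = new MiniNavigator<Node>() {
        public List<Node> childElements(Node node) {
            List<Node> out = new ArrayList<>();
            NodeList kids = node.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                if (kids.item(i) instanceof Element) {
                    out.add(kids.item(i));
                }
            }
            return out;
        }
        public String name(Node node) { return node.getNodeName(); }
        public String text(Node node) { return node.getTextContent(); }
    };

    // A second, home-grown object model to show the evaluator does not care.
    static final class Tree {
        final String name;
        final String text;
        final List<Tree> kids;
        Tree(String name, String text, Tree... kids) {
            this.name = name;
            this.text = text;
            this.kids = Arrays.asList(kids);
        }
    }

    // Adapter for the home-grown model.
    static final MiniNavigator<Tree> TREE = new MiniNavigator<Tree>() {
        public List<Tree> childElements(Tree node) { return node.kids; }
        public String name(Tree node) { return node.name; }
        public String text(Tree node) { return node.text; }
    };

    public static void main(String[] args) throws Exception {
        String xml = "<lib><book><title>A</title></book>"
                   + "<book><title>B</title></book></lib>";
        Node domRoot = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        Tree treeRoot = new Tree("lib", "",
            new Tree("book", "", new Tree("title", "A")),
            new Tree("book", "", new Tree("title", "B")));

        System.out.println(select(DOM, domRoot, "book/title"));   // prints [A, B]
        System.out.println(select(TREE, treeRoot, "book/title")); // prints [A, B]
    }
}
```

Supporting another object model then means writing one more adapter,
not another evaluator.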

In my experience, it's not just DOM vs SAX, but competition between
the DOMs (sometimes mixing several in the same application) and SAX.
And typically, dom4j's sub-tree mechanisms have kept me from having to
venture into hard-to-maintain SAX code.
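
For a sense of the bookkeeping involved, here is a small plain-SAX
sketch (JDK only; SaxContextDemo and the element names are invented for
illustration).  It wants nothing more than the titles of books, not of
magazines, and already needs an explicit element stack and a text buffer
to know where it is in the document.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxContextDemo {
    // Returns the text of every <title> whose parent element is <book>.
    static List<String> bookTitles(String xml) throws Exception {
        final List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Names of the currently open elements, innermost first.
            private final Deque<String> open = new ArrayDeque<>();
            // Non-null only while inside a book/title element.
            private StringBuilder text;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (qName.equals("title") && "book".equals(open.peek())) {
                    text = new StringBuilder();
                }
                open.push(qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                open.pop();
                // After the pop, peek() is the parent of the element that just closed.
                if (text != null && qName.equals("title") && "book".equals(open.peek())) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<lib><book><title>A</title></book>"
                   + "<magazine><title>M</title></magazine></lib>";
        System.out.println(bookTitles(xml)); // prints [A]
    }
}
```

Every extra condition (an attribute on the parent, a sibling's value)
tends to add another field to the handler, which is where the
maintenance cost comes from.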


On Wed, 7 Nov 2001, Bullard, Claude L (Len) wrote:

> I keep hearing that last bit too.  "I'll pass you a 
> DOM; it gets a DOM" and so forth.   In one room, 
> I tried to explain what the infoSet was and why one 
> might want to know about it and the programmer replied, 
> "I don't want to know that.  Useless.  Show me the code." 
> The scary thing is this stuff is coming out of interface 
> experts.
> Are the XML abstractions that hard to learn?  To me, 
> regardless of other issues, the infoSet description 
> has always been an easy read and explained how the 
> structural abstractions made the meta-ness tractable 
> in the APIs.  I'm not sure it is all that easy to 
> understand XML without it, but I'm poisoned fish 
> anyway after two decades of markup work.
> len
> -----Original Message-----
> From: david.hunter@mobileq.com [mailto:david.hunter@mobileq.com]
> Without any hard data, here are some reasons I suspect come into play when
> people choose the DOM over SAX, when SAX might be more appropriate:
> 1)  A lot of programmers are not really used to event-based programming, as
> used by SAX.  They're more comfortable working with an in-memory object
> model than in keeping track of context as events are passed in, etc.
> 2)  Unfortunately, programmers [including myself] often test their code with
> much smaller data sets than they can expect in real life.  One of the
> reasons one might choose SAX over the DOM is that one is working with very
> large XML documents, but if one is not testing one's code with large
> documents, one might not realize one needs SAX until it's too late...
> 3)  For COM programmers, who use Microsoft technologies extensively:  MSXML
> had [some] DOM support long before it had SAX support.  I think a lot of
> programmers, who aren't XML gurus, just got used to the fact that "this is
> the way you work with XML".  I've even heard people referring to an XML
> document as "a DOM", which shows how deep the confusion can go.
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> The list archives are at http://lists.xml.org/archives/xml-dev/
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>