Re: Is DOM more efficient than SAX for smaller xml instances?


From: Tatu Saloranta <cowtowncoder@yahoo.com>
To: xml dev <xml-dev@lists.xml.org>
Date: Thu, 24 Aug 2006 09:54:34 -0700 (PDT)

--- Frans Englich <frans.englich@telia.com> wrote:

> On Thursday 24 August 2006 13:41, bryan rasmussen wrote:
> > I seem to remember once, long ago, reading something
> > showing DOM was more efficient than SAX for small XML
> > instances - small being approx 40 kb.
> 
> It depends on what you do with it, if you ask me.
> 
> DOM does essentially what SAX would do (parse XML into
> events) but from there continues to build a tree,
> typically. Obviously,

Conceptually, yes, but in practice a DOM
implementation is usually layered on top of a SAX
implementation (or another streaming subsystem, such
as StAX).
The typical performance implications follow from
this: for pure parsing (the trivial case, with no
business logic), SAX implementations will usually be
faster than DOM implementations, even for small
documents.
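
To make the layering concrete, here is a minimal sketch of a
tree builder driven purely by SAX events. It uses Python's
standard library as a stand-in, not Xerces; the names
DomBuildingHandler and parse_to_dom are made up for this
illustration, and a real DOM builder also handles namespaces,
comments, processing instructions and so on.

```python
import xml.sax
from xml.dom.minidom import Document


class DomBuildingHandler(xml.sax.ContentHandler):
    """Builds a minidom Document from SAX events (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.document = Document()
        # Stack of currently open nodes; the document is the first parent.
        self._stack = [self.document]

    def startElement(self, name, attrs):
        element = self.document.createElement(name)
        for attr_name in attrs.getNames():
            element.setAttribute(attr_name, attrs.getValue(attr_name))
        self._stack[-1].appendChild(element)
        self._stack.append(element)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # Character data only arrives inside an element in well-formed XML,
        # so the current parent is always an element here.
        self._stack[-1].appendChild(self.document.createTextNode(content))


def parse_to_dom(xml_bytes):
    """Run the SAX parser and return the tree the handler built."""
    handler = DomBuildingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.document


doc = parse_to_dom(b'<a id="1"><b>hi</b></a>')
root = doc.documentElement
```

Everything the tree builder does happens in addition to the
event parsing itself, which is why the streaming layer alone
is normally the faster of the two.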

However, with small documents the basic initialization
overhead of parser setup (even when recycling reader
instances, where possible) is the dominant cost, so
performance varies widely between implementations.
It's easy to be 10x faster than Xerces SAX, for
example, for a document like "<a/>", whereas getting
more than 20% faster for a megabyte document is much
more challenging.

This also means that for small documents, the speed
difference between DOM and SAX implementations (in
parsing) is mostly determined by the initialization
overhead. That's why Xerces SAX and Xerces DOM may
have nearly the same speed for tiny documents, even
though the latter is built on top of the former.
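
One rough way to see the setup cost is to time a tiny and a
larger document with both APIs. The sketch below is only an
assumption-laden stand-in: it uses Python's xml.sax and
xml.dom.minidom rather than Xerces (and minidom is not layered
on xml.sax), the helper names are invented, and the document
sizes and repeat counts are arbitrary. Only the shape of the
comparison carries over, not the numbers.

```python
import time
import xml.sax
from xml.dom import minidom


def time_per_parse(parse, data, repeats):
    """Average wall-clock seconds per call of parse(data)."""
    start = time.perf_counter()
    for _ in range(repeats):
        parse(data)
    return (time.perf_counter() - start) / repeats


def sax_parse(data):
    # A handler with no callbacks overridden: pure parsing, no business logic.
    xml.sax.parseString(data, xml.sax.ContentHandler())


def dom_parse(data):
    # Full tree construction; the resulting document is simply discarded.
    minidom.parseString(data)


tiny = b"<a/>"
# 20,000 repeated items, a few hundred kilobytes in total.
large = b"<root>" + b"<item>text</item>" * 20_000 + b"</root>"

results = {
    ("sax", "tiny"): time_per_parse(sax_parse, tiny, 1000),
    ("dom", "tiny"): time_per_parse(dom_parse, tiny, 1000),
    ("sax", "large"): time_per_parse(sax_parse, large, 3),
    ("dom", "large"): time_per_parse(dom_parse, large, 3),
}

for (api, size), seconds in results.items():
    print(f"{api:>3} {size:>5}: {seconds * 1e6:10.1f} microseconds per parse")
```

For the tiny document each call creates a fresh parser, so the
measured time is almost entirely setup; for the larger one the
per-byte parsing and tree-building work takes over.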

...
> So it all depends on what you do. If you take SAX
> events and build a custom structure that consumes
> more memory and/or is slower to navigate than a DOM
> tree, it would be less efficient than using DOM.

Sure. However, due to the burden the DOM model itself
imposes on implementations (live views of node sets
that must track changes, etc.), most people create
slightly more lightweight tree models. But the
difference in performance is usually not big: all
full-tree models still need to access all the
information needed for the tree, and build some kind
of structure.
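
As a sketch of what such a lighter model can look like, the
following keeps only a name, an attribute dict and a child
list per element, with no live node lists or parent links.
Node, LightTreeHandler and parse_light are hypothetical names
for this example, and Python's xml.sax is again only a
stand-in for whatever streaming parser sits underneath.

```python
import xml.sax
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    attrs: dict
    children: list = field(default_factory=list)  # Node or str items


class LightTreeHandler(xml.sax.ContentHandler):
    """Collects SAX events into plain Node objects."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # currently open elements

    def startElement(self, name, attrs):
        node = Node(name, dict(attrs.items()))
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        if self._stack:
            self._stack[-1].children.append(content)


def parse_light(xml_bytes):
    handler = LightTreeHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root


tree = parse_light(b'<a x="1"><b>hi</b><c/></a>')
```

Note that the handler still visits every element, attribute
and text run, which is the work all full-tree models share.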

-+ Tatu +-

