   Re: [xml-dev] Does SAX make sense?

On Sat, Apr 19, 2003 at 05:38:42PM +1000, Rick Jelliffe wrote:
> I am idly wondering whether unpooled streaming Java APIs of XML documents (e.g. SAX)
> really make as much sense as we might like them to.

  Personally I think SAX makes a terribly bad API for programmers,
forcing them to work with callbacks, i.e. losing the thread of control
during parsing, and making it hard to actually gather data from what is
just an event stream. SAX is too low level for most uses, though it certainly
has its use as a lower-level API for implementing the lower-level services. But
those layers are not where the typical programmer should meet the XML data.
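
  To make the callback complaint concrete, here is a minimal sketch using
Python's standard xml.sax module (not the Java SAX API discussed in this
thread). The sample document, the TitleHandler class and its fields are
hypothetical and only serve the illustration. Note how the handler has to
carry its own state between callbacks just to collect a few strings, while
the parser owns the loop:

```python
import xml.sax

# Hypothetical sample document, used only for illustration.
DOC = b"<books><book><title>A</title></book><book><title>B</title></book></books>"


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False  # state that must survive between callbacks
        self._buf = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(DOC, handler)  # the parser drives; the handler only reacts
titles = handler.titles
```

  Even for this trivial extraction, the logic ends up spread over three
methods and two bookkeeping fields.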

> It strikes me that there are two factors that undermine the benefits of streaming processing:
> * XML documents are rarely smaller than memory

  That doesn't reflect the use cases I see. Most of the XML documents on
my machine are less than a megabyte. Maybe you meant "larger than memory",
right?

> * Java implementations typically only garbage collect when they get "near" 
>   to filling their heaps.

  One more point to the Python camp and reference counting ...

> That being the case, it seems that simple streaming such as SAX provides
> don't make sense.  They would be better to either
> * have the SAX stream kept cached for the lifetime of the document
> (or have some kind of weak reference perhaps) since they are in memory 
> anyway (though unreachable), allowing backward-looking XPaths; or
> * requiring SAX clients return events to a pool (which would reduce 
> memory use).
> Does that sound right to anyone?

  No, the event model is just too complex anyway. If you stream, you
must have a good reason for it (usually a memory requirement). If you
don't need to stream, use a more convenient model; if you do need to,
there are better interfaces, like the XmlReader interface from
C# or others. A low memory requirement should not imply having to
work in callback mode. Streaming should not imply having to work
with a very low-level API [1].
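
  As a rough illustration of the pull style, here is a sketch using
Python's standard xml.dom.pulldom module. It is only an analogue of a
reader-style interface, not the C# XmlReader or the libxml2 API; the sample
document and the titles_by_id function are hypothetical. The caller's own
loop asks for the next event, and expandNode() builds a small DOM subtree
only for the element of interest, so the rest of the document is still
streamed:

```python
from xml.dom import pulldom

# Hypothetical sample document, used only for illustration.
DOC = (
    "<books>"
    "<book id='1'><title>A</title></book>"
    "<book id='2'><title>B</title></book>"
    "</books>"
)


def titles_by_id(xml_text):
    """Map each book's id attribute to the text of its <title>."""
    result = {}
    events = pulldom.parseString(xml_text)
    for event, node in events:  # our loop pulls the next event
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            # Build a DOM subtree for this <book> only; the events it
            # covers are consumed from the stream.
            events.expandNode(node)
            title = node.getElementsByTagName("title")[0]
            result[node.getAttribute("id")] = title.firstChild.data
    return result


found = titles_by_id(DOC)
```

  The thread of control stays in titles_by_id(), and no per-element state
has to be stashed in a handler object.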


[1] shameless plug http://xmlsoft.org/xmlreader.html#Mixing about mixing
    DOM/XPath and streaming in the upcoming libxml2 releases.

Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


Copyright 2001 XML.org. This site is hosted by OASIS