xml-dev - RE: [xml-dev] Does SAX make sense?


To: "'Michael Brennan'" <mpbrennan@earthlink.net>,"'Rick Jelliffe'" <ricko@allette.com.au>
Subject: RE: [xml-dev] Does SAX make sense?
From: "Martin Soukup" <martin@dynamine.net>
Date: Sat, 19 Apr 2003 22:10:31 -0400
Cc: <xml-dev@lists.xml.org>
Importance: Normal
In-reply-to: <3EA1CDB2.4BE14B42@earthlink.net>


I totally agree, Michael. I would like to see a mixed model that allows
XQuery filtering of events and similar behavior. Some layers based on
SAX filters seem to come close to this goal.
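
For readers unfamiliar with SAX filters, here is a minimal sketch of the
idea, assuming the standard org.xml.sax.helpers.XMLFilterImpl base class.
The names SubtreeFilter and SubtreeFilterDemo are made up for
illustration, and the filter matches on a plain element name rather than
an XQuery expression, so it only hints at the mixed model described
above.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: forwards only events inside elements named `target`. */
class SubtreeFilter extends XMLFilterImpl {
    private final String target;
    private int depth = 0; // > 0 while inside a matching element

    SubtreeFilter(XMLReader parent, String target) {
        super(parent);
        this.target = target;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (depth > 0 || local.equals(target)) {
            depth++;
            super.startElement(uri, local, qName, atts); // pass downstream
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (depth > 0) {
            super.endElement(uri, local, qName);
            depth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int len) throws SAXException {
        if (depth > 0) super.characters(ch, start, len);
    }
}

class SubtreeFilterDemo {
    /** Parses `xml` and returns a string rendering of the events that survived. */
    static String run(String xml) throws Exception {
        SAXParserFactory f = SAXParserFactory.newInstance();
        f.setNamespaceAware(true); // needed so local names are reported
        SubtreeFilter filter = new SubtreeFilter(f.newSAXParser().getXMLReader(), "item");
        StringBuilder seen = new StringBuilder();
        filter.setContentHandler(new DefaultHandler() {
            @Override public void startElement(String u, String l, String q, Attributes a) {
                seen.append('<').append(l).append('>');
            }
            @Override public void characters(char[] ch, int s, int n) {
                seen.append(ch, s, n);
            }
            @Override public void endElement(String u, String l, String q) {
                seen.append("</").append(l).append('>');
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen.toString();
    }
}
```

Because a filter is itself an XMLReader, several of them can be chained,
each one narrowing or rewriting the stream before the final handler sees
it.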
 
> 
> Rick Jelliffe wrote:
> >
> > I am idly wondering whether unpooled streaming Java APIs of XML
> > documents (e.g. SAX) really make as much sense as we might like them
> > to.
> 
> I've been wondering why we have such absolute either-or choices in
> available APIs. Why not hybridized APIs that provide event streams, but
> let you collect arbitrary spans of content into an object model that can
> be more easily manipulated and accessed without needing a complete
> in-memory tree model of the entire document?
> 
> I think both SAX and tree APIs are unwieldy to work with. I'm more
> interested in rule-based and pattern-based approaches, but prefer not to
> have to build a complete in-memory model of the entire document to
> enable such an approach.
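
One way to picture the hybrid API asked for here is a SAX handler that
streams past most of the document but gathers selected elements into
small DOM fragments. The sketch below is an illustration only, not the
design of any existing toolkit: FragmentCollector and
FragmentCollectorDemo are hypothetical names, and only the standard
JAXP, SAX, and DOM classes are assumed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: builds a DOM fragment for each `target` element. */
class FragmentCollector extends DefaultHandler {
    private final String target;
    private final Consumer<Element> sink;
    private final Document nodes; // used only as a node factory
    private Node current;         // null while outside a target element

    FragmentCollector(String target, Consumer<Element> sink) throws Exception {
        this.target = target;
        this.sink = sink;
        this.nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (current == null && !qName.equals(target)) return; // stream past, keep nothing
        Element e = nodes.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (current != null) current.appendChild(e);
        current = e;
    }

    @Override
    public void characters(char[] ch, int start, int len) {
        if (current != null) {
            current.appendChild(nodes.createTextNode(new String(ch, start, len)));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (current == null) return;
        Node parent = current.getParentNode();
        if (parent == null) sink.accept((Element) current); // fragment is complete
        current = parent;
    }
}

class FragmentCollectorDemo {
    /** Returns "id:sku" for every <order> element found in `xml`. */
    static List<String> run(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        FragmentCollector handler = new FragmentCollector("order", e ->
            out.add(e.getAttribute("id") + ":"
                + e.getElementsByTagName("sku").item(0).getTextContent()));
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }
}
```

Each fragment becomes garbage as soon as the callback returns, so memory
use is bounded by the largest collected element rather than by the whole
document.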
> 
> >
> > It strikes me that there are two factors that undermine the benefits
> > of streaming processing:
> >
> > * XML documents are rarely larger than memory
> > * Java implementations typically only garbage collect when they get
> >   "near" to filling their heaps.
> >
> > These two things conspire to ensure that, for the lion's share of
> > documents, by the time the SAX stream is finished, all the SAX events
> > will still be in memory, though perhaps unreachable. If they are in
> > memory, why not make them available?
> >
> > That being the case, it seems that simple streaming APIs such as SAX
> > don't make sense. They would do better to either
> >
> > * have the SAX stream kept cached for the lifetime of the document
> >   (or have some kind of weak reference perhaps) since the events are
> >   in memory anyway (though unreachable), allowing backward-looking
> >   XPaths; or
> 
> Pooling objects using weak references incurs a small performance penalty
> (I've experimented a bit with such approaches, though not for SAX
> events). In the context of a real-world application this penalty is
> likely to be pretty minimal. Nonetheless, if someone is using SAX, it
> may be because they are trying to maximize performance.
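
To make the weak-reference idea concrete, here is a rough sketch of a
handler that records each start-element event behind a
java.lang.ref.WeakReference. SAX itself delivers callbacks rather than
event objects, so the ElementEvent record, WeakEventLog, and
WeakEventLogDemo are all assumptions made for illustration. Each
recorded event costs an extra reference object, which is one source of
the overhead mentioned above.

```java
import java.io.StringReader;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical record of one startElement callback. */
final class ElementEvent {
    final String qName;
    final int depth;

    ElementEvent(String qName, int depth) {
        this.qName = qName;
        this.depth = depth;
    }
}

/** Hypothetical handler: keeps past events reachable only through weak references. */
class WeakEventLog extends DefaultHandler {
    private final List<WeakReference<ElementEvent>> log = new ArrayList<>();
    private int depth = 0;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // No strong reference is kept, so the collector may reclaim the event at any time.
        log.add(new WeakReference<>(new ElementEvent(qName, depth)));
        depth++;
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        depth--;
    }

    /** Number of events recorded, whether or not they are still in memory. */
    int recorded() {
        return log.size();
    }

    /** Events the garbage collector has not yet reclaimed, in document order. */
    List<ElementEvent> stillAvailable() {
        List<ElementEvent> out = new ArrayList<>();
        for (WeakReference<ElementEvent> ref : log) {
            ElementEvent e = ref.get(); // null once the event has been collected
            if (e != null) out.add(e);
        }
        return out;
    }
}

class WeakEventLogDemo {
    static WeakEventLog run(String xml) throws Exception {
        WeakEventLog log = new WeakEventLog();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), log);
        return log;
    }
}
```

How many events stillAvailable() returns depends entirely on when the
collector last ran, so a client can treat the log as an opportunistic
cache but never as a guarantee.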
> 
> >
> > * require SAX clients to return events to a pool (which would reduce
> >   memory use).
> >
> > Does that sound right to anyone?
> 
> The approach I'm experimenting with, right now, in my swan toolkit
> (http://swan.sourceforge.net) is maintaining a stack to support
> backward-looking XPaths and XSLT pattern-matching, melded with rules
> that can gather content into suitable data structures for relevant
> portions of a document. As part of that, I have a prefab rule one can
> use to gather up a fragment into a minimalistic tree API that supports
> XPath queries. This could easily be adapted to use a full-fledged tree
> API for the fragment, but I was more interested in using XPath
> expressions than navigating unwieldy tree APIs.
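
The stack-based approach described above can be sketched in a few lines.
This is not code from the swan toolkit; PathTracker and PathTrackerDemo
are hypothetical names, and the "pattern" is just a slash-separated
suffix of element names, far simpler than real XPath or XSLT patterns.
It shows only how a stack of open elements lets a streaming handler look
backward at its ancestors.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: matches elements against an ancestor pattern via a stack. */
class PathTracker extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>(); // innermost element first
    private final String[] pattern;                        // e.g. {"chapter", "title"}
    final List<String> matches = new ArrayList<>();

    PathTracker(String pattern) {
        this.pattern = pattern.split("/");
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        open.push(qName);
        if (matchesPattern()) matches.add(path());
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        open.pop();
    }

    /** True if the innermost open elements end with the pattern's names. */
    private boolean matchesPattern() {
        if (open.size() < pattern.length) return false;
        Iterator<String> it = open.iterator(); // walks from innermost outward
        for (int i = pattern.length - 1; i >= 0; i--) {
            if (!it.next().equals(pattern[i])) return false;
        }
        return true;
    }

    /** Full path of the current element, outermost ancestor first. */
    private String path() {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = open.descendingIterator();
        while (it.hasNext()) sb.append('/').append(it.next());
        return sb.toString();
    }
}

class PathTrackerDemo {
    static List<String> run(String xml, String pattern) throws Exception {
        PathTracker tracker = new PathTracker(pattern);
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), tracker);
        return tracker.matches;
    }
}
```

The stack never grows beyond the nesting depth of the document, so this
kind of backward-looking match costs far less memory than a full tree.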
> 
> This is still all in a rough state. I haven't done a file release of
> this code yet, and some key portions are not in CVS yet (due to some
> problems I've been having with CVS integration with Eclipse). I've also
> been letting this languish the last few weeks, but am starting to get
> back into it this weekend. I've been approaching this in a rather lazy
> fashion (my motivation has been admittedly low), but I hope to have an
> alpha release of something soon.