> >At the very least I need to be able to sequentially process a large
> >document and extract an identified sub-tree (ideally denoted by an
> >XPath expression) for run-of-the-mill tools to manipulate. I assume
> >such a beast would need to be based on a SAX parser.
> I did exactly that in Python. I considered building an engine that
> could filter SAX events to those that match a limited version of
> XPath, but ran out of gas. I ended up with just a regular SAX
Interesting - I always thought such a thing would be useful, but I haven't
come across an implementation.
I built something like that in Delphi (I call it SAXPath)
on top of SAX. First you define an array of records (structs)
each with a name (or wildcard) - like XPath - and a call-back
interface pointer (used for filtering/predicates or processing).
I call the array elements "path nodes", and the array "path handler".
Then you register such an array with a "handler manager" for processing.
Only relative paths are currently supported.
Call-backs are done on every node of such a "path handler"
as long as it matches and as long as filter-call-backs
further up haven't de-activated the "path handler".
For the projects I am involved in, this has proven very practical.
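
To make the scheme concrete, here is a rough sketch of the same idea in Python on top of xml.sax. It is not the Delphi code: the names PathNode and HandlerManager, the convention that a call-back returning False de-activates the path for that sub-tree, and the sample document are all assumptions made for illustration.

```python
import xml.sax


class PathNode:
    """One step of a relative path: an element name (or "*") plus a call-back."""

    def __init__(self, name, callback):
        self.name = name
        # Hypothetical convention: callback(name, attrs) returns False
        # to de-activate the path below this element.
        self.callback = callback

    def matches(self, element_name):
        return self.name == "*" or self.name == element_name


class HandlerManager(xml.sax.ContentHandler):
    """Dispatches SAX start-element events to registered path handlers."""

    def __init__(self):
        super().__init__()
        self.path_handlers = []  # each entry is a list of PathNode
        self.stack = []          # per open element: one set of match lengths per handler

    def register(self, path_nodes):
        # Register before parsing starts; the stack assumes a fixed handler list.
        self.path_handlers.append(list(path_nodes))

    def startElement(self, name, attrs):
        if self.stack:
            parent = self.stack[-1]
        else:
            parent = [set() for _ in self.path_handlers]
        current = []
        for path, matched in zip(self.path_handlers, parent):
            advanced = set()
            # Paths are relative, so a fresh match (length 0) may start anywhere.
            for k in sorted(matched | {0}):
                if k < len(path) and path[k].matches(name):
                    if path[k].callback(name, attrs) is not False:
                        advanced.add(k + 1)
            current.append(advanced)
        self.stack.append(current)

    def endElement(self, name):
        self.stack.pop()


# Example: collect the ids of <title> elements directly under English <book> elements.
titles = []
manager = HandlerManager()
manager.register([
    PathNode("book", lambda name, attrs: attrs.get("lang") == "en"),  # filter
    PathNode("title", lambda name, attrs: titles.append(attrs.get("id"))),  # processing
])

document = b"""<library>
  <book lang="en"><title id="t1"/></book>
  <book lang="de"><title id="t2"/></book>
  <shelf><book lang="en"><title id="t3"/></book></shelf>
</library>"""

xml.sax.parseString(document, manager)
print(titles)  # ['t1', 't3']
```

The German book is skipped because its filter call-back returns False, so the "title" node is never tried beneath it. The third title is still found because the path is relative and may start at any depth.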