
   Re: [xml-dev] Handling very large instance docs




> >At the very least I need to be able to sequentially process a large
> >document and extract an identified sub-tree (ideally denoted by an
> >XPath expression) for run-of-the-mill tools to manipulate. I assume
> >such a beast would need to be based on a SAX parser.
> 
> I did exactly that in Python.  I considered building an engine that 
> could filter SAX events to those that match a limited version of 
> XPath, but ran out of gas.  I ended up with just a regular SAX 
> application.

Interesting - I always thought such a thing would be useful, but I
haven't come across an implementation.

I built something like that in Delphi (I call it SAXPath)
on top of SAX. First you define an array of records (structs),
each with a name (or wildcard), like a step in an XPath, and a
call-back interface pointer (used for filtering/predicates or
processing). I call the array elements "path nodes" and the array
itself a "path handler". Then you register such an array with a
"handler manager" for processing.
Only relative paths are currently supported.

Call-backs are made for every node of such a "path handler"
as long as that node matches and as long as filter call-backs
further up the path haven't de-activated the "path handler".
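
To make the idea concrete, here is a rough sketch of the same scheme in
Python on top of the standard xml.sax module. It is not the Delphi code,
only an illustration under assumptions: the names PathNode,
HandlerManager, register and demo are made up for this example, a
call-back gets the element name and its attributes, and a call-back
that returns False is treated as a filter that de-activates the path
below that element. Paths are relative, so a match may start at any
element.

```python
import xml.sax
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical call-back shape: (element name, attributes) -> False to
# de-activate the path below this element, anything else to continue.
Callback = Callable[[str, dict], Optional[bool]]


@dataclass
class PathNode:
    name: str            # element name, or "*" as a wildcard
    callback: Callback   # filter/predicate or processing call-back


class HandlerManager(xml.sax.ContentHandler):
    """Dispatches SAX start events to registered path handlers."""

    def __init__(self):
        super().__init__()
        self.handlers = []   # each path handler is a list of PathNode
        self.depth = 0
        # Partial matches: (path handler, index of next node, depth at
        # which that next node must appear).
        self.active = []

    def register(self, path_handler):
        self.handlers.append(path_handler)

    def startElement(self, name, attrs):
        self.depth += 1
        # Continue partial matches waiting at this depth, and also try
        # to start every path afresh here (relative paths).
        candidates = [(h, i) for (h, i, d) in self.active if d == self.depth]
        candidates += [(h, 0) for h in self.handlers]
        for handler, index in candidates:
            node = handler[index]
            if node.name not in ("*", name):
                continue
            if node.callback(name, dict(attrs.items())) is False:
                continue   # a filter de-activated the path here
            if index + 1 < len(handler):
                self.active.append((handler, index + 1, self.depth + 1))

    def endElement(self, name):
        # Drop partial matches that were waiting for children of the
        # element that is closing.
        self.active = [m for m in self.active if m[2] <= self.depth]
        self.depth -= 1


DOC = (b"<orders><order id='1'><item sku='pen'/></order>"
       b"<order id='2'><item sku='ink'/></order>"
       b"<item sku='stray'/></orders>")


def demo(first_node):
    """Run one two-node path handler over DOC and collect item skus."""
    seen = []
    manager = HandlerManager()
    manager.register([
        first_node,
        PathNode("item", lambda n, a: seen.append(a["sku"])),
    ])
    xml.sax.parseString(DOC, manager)
    return seen


# Roughly order[@id='2']/item: the first call-back acts as a predicate.
only_second = demo(PathNode("order", lambda n, a: a.get("id") == "2"))
# Roughly */item: a wildcard parent with a call-back that never filters.
any_parent = demo(PathNode("*", lambda n, a: None))
```

The manager keeps only the list of partial matches and a depth counter,
so memory use does not grow with document size, which is the point
when the instance document is very large.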

For the projects I am involved in this has proven very practical.

Karl





 


Copyright 2001 XML.org. This site is hosted by OASIS