Re: [xml-dev] SaxXPathFragmentFilter - Reduse large DOM trees using a SAX XPath cutter!
And if you do want to have actual XPath-esque support, I heartily
suggest taking a look at SAXPath to do your xpath parsing for you.
On Wed, 28 Nov 2001, PaulT wrote:
> I really like what you've done, but the language you're
> using is not XPath (nor is it a subset of XPath),
> and I see a problem here. (I think I also have some
> kind of solution to that problem, and I'll express it
> in my next letter.)
> ----- Original Message -----
> From: "Niels Peter Strandberg" <firstname.lastname@example.org>
> To: <email@example.com>
> Sent: Wednesday, November 28, 2001 5:40 AM
> Subject: [xml-dev] SaxXPathFragmentFilter - Reduse large DOM trees using a
> SAX XPath cutter!
> > I have made an experimental SAX XMLFilter. It allows you to "filter" out
> > the information in an xml document that you want to work with - using
> > xpath - and skip the rest. You can place the filter anywhere in your
> > application where an XMLFilter can be used.
> > - I don't know if this has already been done by others?
> > The whole idea is to "filter" out the fragments of the xml document
> > that you specify using an xpath expression, e.g.
> > SaxXPathFragmentFilter(saxparser, "/cellphone/*/model[@id='1234']",
> > "result"). You can then build a dom tree from the result, or feed the
> > sax events into an xslt transformer and do some xslt transformations.
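For readers who want to try the "build a dom tree from the SAX events" route with plain JAXP, here is a minimal sketch. It is not taken from the SaxXPathFragmentFilter code: the class name SaxToDom and the helpers build and passThrough are hypothetical, and a plain XMLFilterImpl stands in for the real filter. An identity TransformerHandler receives the filter's events and writes them into a DOMResult.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical helper, for illustration only.
class SaxToDom {

    // Runs the filter over the given XML text and collects whatever
    // events it lets through into a W3C DOM Document.
    static Document build(XMLFilter filter, String xml) throws Exception {
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        // No stylesheet argument: this handler is an identity transformation.
        TransformerHandler handler = stf.newTransformerHandler();
        DOMResult result = new DOMResult();
        handler.setResult(result);
        filter.setContentHandler(handler);
        filter.parse(new InputSource(new StringReader(xml)));
        return (Document) result.getNode();
    }

    // Stand-in for the real filter: forwards every event unchanged.
    static XMLFilter passThrough() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // DOM builders expect namespace-aware events
        XMLReader parser = spf.newSAXParser().getXMLReader();
        return new XMLFilterImpl(parser);
    }
}
```

Passing a stylesheet Source to newTransformerHandler instead of calling it without arguments should apply an XSLT transformation to the same event stream.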
> > The big win is that you don't have to build a large dom tree if you
> > only need part of the information in a large xml document. You just
> > specify which fragments you want using xpath, and the result will be a
> > much smaller dom tree, which requires less processing, memory etc.
> > Let us say that you have a large document with spare parts for Volvo
> > vehicles. You want to make a list of engine parts for the S70 car model.
> > What you do is specify the xpath (location path) that you want to cut out
> > of the document, e.g. "/catalog/cars/s70/parts/engine".
> > // imports assumed for this snippet (SAX 2 and JDOM)
> > import org.xml.sax.XMLReader;
> > import org.xml.sax.helpers.XMLReaderFactory;
> > import org.jdom.Document;
> > import org.jdom.input.SAXHandler;
> >
> > // your sax parser here
> > XMLReader parser =
> >     XMLReaderFactory.createXMLReader(
> >         "org.apache.xerces.parsers.SAXParser");
> > // the handler that builds the JDOM document
> > SAXHandler jdomsaxhandler = new SAXHandler();
> > String xpath = "/catalog/cars/s70/parts/engine";
> > String rootName = "s70engineparts"; // this will be the new root
> > // set up the SaxXPathFragmentFilter
> > SaxXPathFragmentFilter xpathfilter =
> >     new SaxXPathFragmentFilter(parser, xpath, rootName);
> > xpathfilter.setContentHandler(jdomsaxhandler);
> > // parse the document (uri is the system id of the input document)
> > xpathfilter.parse(uri);
> > // get the resulting JDOM Document
> > Document doc = jdomsaxhandler.getDocument();
> > This SaxXPathFragmentFilter is purely experimental. It is spaghetti code.
> > I just sat down with an idea and started to code, and the code is not
> > very pretty. It needs to be rewritten.
> > The xpath support is very limited for now. Here are the xpath
> > expressions you can use today with this filter:
> > "/a/b" - An absolute path.
> > "/a/*/c" - An absolute path where the second element ("*") can be
> > any element.
> > "/a/*/c[@att='value']" - If element c has an attribute att with the
> > value 'value'.
> > "/a/*/c[contains='value']" - If element c's first child node is a
> > text node that contains 'value'.
> > "/a/*/c[starts-with='value']" - If element c's first child node is a
> > text node that starts with 'value'.
> > "/a/*/c[ends-with='value']" - If element c's first child node is a
> > text node that ends with 'value'.
> > "/a/*/c['value']" - If element c's first child node is a text node
> > that is exactly 'value'.
> > "/a/*/c[is='value']" - As above.
> > As you can see, the xpath options are very limited. But I think that when
> > I find a way to implement the "//" pattern, the filter will be even more
> > powerful.
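To make the first two patterns above ("/a/b" and "/a/*/c") concrete, here is a rough sketch of how such a path cutter could work on top of XMLFilterImpl. It is not the SaxXPathFragmentFilter source, which is not shown in this thread. The class name PathCutFilter and the helper cut are hypothetical, predicates and "//" are not handled, and element names are compared by qualified name only. The filter counts how many leading steps the currently open elements match, forwards events only while inside a fully matched element, and wraps everything in a new root.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: absolute paths with "*" wildcards only.
class PathCutFilter extends XMLFilterImpl {
    private final String[] steps;  // location steps, e.g. {"a", "*", "c"}
    private final String rootName; // synthetic root that wraps all fragments
    private int depth;             // depth of the innermost open element
    private int matched;           // number of leading steps matched by open elements
    private int fragmentDepth;     // depth of the matched element, 0 when outside

    PathCutFilter(XMLReader parent, String path, String rootName) {
        super(parent);
        this.steps = path.substring(1).split("/"); // assumes a leading '/'
        this.rootName = rootName;
    }

    @Override public void startDocument() throws SAXException {
        depth = 0; matched = 0; fragmentDepth = 0;
        super.startDocument();
        super.startElement("", rootName, rootName, new AttributesImpl());
    }

    @Override public void endDocument() throws SAXException {
        super.endElement("", rootName, rootName);
        super.endDocument();
    }

    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) throws SAXException {
        depth++;
        // Try the next step only if every ancestor matched its own step.
        if (fragmentDepth == 0 && matched == depth - 1 && depth <= steps.length
                && (steps[depth - 1].equals("*") || steps[depth - 1].equals(qName))) {
            matched = depth;
            if (matched == steps.length) fragmentDepth = depth;
        }
        if (fragmentDepth > 0) super.startElement(uri, local, qName, atts);
    }

    @Override public void endElement(String uri, String local, String qName)
            throws SAXException {
        if (fragmentDepth > 0) super.endElement(uri, local, qName);
        if (fragmentDepth == depth) fragmentDepth = 0; // left the fragment
        if (matched == depth) matched = depth - 1;     // this element had matched
        depth--;
    }

    @Override public void characters(char[] ch, int start, int len) throws SAXException {
        if (fragmentDepth > 0) super.characters(ch, start, len);
    }

    @Override public void ignorableWhitespace(char[] ch, int start, int len)
            throws SAXException {
        if (fragmentDepth > 0) super.ignorableWhitespace(ch, start, len);
    }

    // Convenience for experiments: filter a string and serialize the result.
    static String cut(String xml, String path, String rootName) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        PathCutFilter filter =
            new PathCutFilter(spf.newSAXParser().getXMLReader(), path, rootName);
        TransformerHandler out =
            ((SAXTransformerFactory) TransformerFactory.newInstance()).newTransformerHandler();
        out.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        out.setResult(new StreamResult(sw));
        filter.setContentHandler(out);
        filter.parse(new InputSource(new StringReader(xml)));
        return sw.toString();
    }
}
```

Because only counters are kept, memory use stays constant no matter how large the input is. The text-based predicates would need some buffering, since the start tag has to be held back until the first child node is seen.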
> > I have problems with building a dom tree from the result using xerces
> > and saxon. But with jdom it works great. This needs to be fixed.
> > You cannot rely on the result always being correct, so don't use
> > this in any application; just use it for experimentation.
> > You can find the code at:
> > tar.gz
> > My goal with this filter is to keep it reliable, simple, fast and
> > clean. If you want to contribute to this project, you are
> > welcome. The filter will be released under some kind of open source
> > license (if we get that far!).
> > Test it and give me some feedback on what you think.
> > Regards, Niels Peter Strandberg
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>