OASIS Mailing List Archives





   Re: [xml-dev] XPath/XSLT 2.0 concerns


I think there's a lot of room for clever and creative optimizations for XSLT 
that would have a real impact. Using schema information at 
stylesheet-compile time to generate a custom tree-builder (instead of using 
an off-the-shelf DOM) for a given stylesheet could save a lot of 
processing time and memory, even for "//*/foo" targets.
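
To make the tree-builder idea concrete, here is a rough sketch in Python 
(standard library only). It is not how any existing XSLT processor works. 
It assumes a stylesheet compiler has already worked out, from the schema, 
two sets of element names: the elements the stylesheet matches ("wanted") 
and the elements that may contain them ("ancestors"). Both sets, and the 
names PrunedTreeBuilder and build_pruned, are hypothetical.

```python
import xml.sax
import xml.etree.ElementTree as ET


class PrunedTreeBuilder(xml.sax.ContentHandler):
    """Build a tree that holds only the parts a stylesheet could visit.

    `wanted` and `ancestors` are assumed to be computed elsewhere
    (hypothetically, from the stylesheet plus a schema).
    """

    def __init__(self, wanted, ancestors):
        super().__init__()
        self._wanted = frozenset(wanted)
        self._ancestors = frozenset(ancestors)
        self._builder = ET.TreeBuilder()
        self._skip_depth = 0  # > 0 while inside a subtree we discard
        self._keep_depth = 0  # > 0 while inside a wanted element

    def startElement(self, name, attrs):
        if self._skip_depth:
            self._skip_depth += 1
        elif self._keep_depth or name in self._wanted:
            # Inside a wanted element everything is kept.
            self._keep_depth += 1
            self._builder.start(name, dict(attrs.items()))
        elif name in self._ancestors:
            # Kept only as a path down to wanted elements.
            self._builder.start(name, dict(attrs.items()))
        else:
            # The schema says nothing wanted can occur below here.
            self._skip_depth = 1

    def endElement(self, name):
        if self._skip_depth:
            self._skip_depth -= 1
            return
        if self._keep_depth:
            self._keep_depth -= 1
        self._builder.end(name)

    def characters(self, content):
        # Text is stored only inside wanted elements.
        if self._keep_depth:
            self._builder.data(content)

    def close(self):
        # Fails if the document element itself was skipped.
        return self._builder.close()


def build_pruned(xml_bytes, wanted, ancestors):
    handler = PrunedTreeBuilder(wanted, ancestors)
    xml.sax.parseString(xml_bytes, handler)
    return handler.close()
```

For a "//*/foo" target one would call something like 
build_pruned(data, {"foo"}, {"doc", "body"}); subtrees the schema rules 
out are never allocated, which is where any memory saving would come from.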

However, implementing something like this is real work. And anyone who 
wants to build such a thing could also go to the trouble of building a 
DTD/W3C XML Schema/RNG interpreter on their own. There's no need for strongly 
typed XPath.

-Wayne Steele

>From: Jeni Tennison <jeni@jenitennison.com>
>Reply-To: Jeni Tennison <jeni@jenitennison.com>
>To: Robin Berjon <robin.berjon@expway.fr>
>CC: Paul Prescod <paul@prescod.net>, xml-dev@lists.xml.org
>Subject: Re: [xml-dev] XPath/XSLT 2.0 concerns
>Date: Wed, 2 Oct 2002 17:21:33 +0100
>
>Hi Robin,
> > I don't believe that either and I'd add that it takes a pretty
> > narrow view on XML but I can in fact see use cases for having access
> > to types in XPath. For instance when I see an XSLT processor chew
> > for several minutes on a very predictable document (granted, it's
> > Java based, but still) I think that if it had access to schema
> > information it could optimize a lot of what it's doing by skipping
> > entire subtrees.
>I know that's something that people claim quite a lot, but I don't
>think that it's at all easy for an implementation to carry out that
>level of optimisation, and I'm skeptical about whether you would
>actually get the speed-up you're looking for.
>Unless you've got really complicated stylesheets, a large proportion
>of the time spent by an XSLT processor will be on parsing and building
>up the node tree, especially if the document is so large that it has
>to start swapping in order to find enough memory to store it. Having a
>schema available will not help at this level.
>[If this is what's causing the slow-down (you should be able to tell
>from the timing information your processor gives you) I think that a
>better approach is to plug a SAXFilter into your pipeline that does
>the filtering out of the subtrees that you're not interested in.]
>Then, as with all these kinds of optimisations, there's the question
>of whether the time taken to perform the inferencing required to do
>the optimisation is actually less than the time it's currently taking
>to do the processing. I'd argue that in a well-designed stylesheet
>(one that didn't apply templates to or otherwise visit the nodes in
>the subtrees you want to ignore), the optimisation won't gain you
>much, if anything. And it might bring you additional problems, such as
>the famous optimising-away of tests that can't possibly be true
>according to the schema.
> > My issue here is that typing should be an option, available to those
> > that want it but not enforced upon others. XML Schema has too many
> > issues to be enforced upon anyone wishing to implement simple XPath.
>I do agree with that. Choice between tools and technologies is a good
>Jeni Tennison
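
The SAX filter suggested in the quoted message could look roughly like the 
following Python sketch, built on the standard library's 
xml.sax.saxutils.XMLFilterBase. The class name SubtreeSkipper, the helper 
prune, and the idea of listing the unwanted elements by name are 
assumptions made for illustration; in a real pipeline the filter's output 
would feed the XSLT processor rather than a serializer.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class SubtreeSkipper(XMLFilterBase):
    """Drop every element named in `skip`, together with its subtree."""

    def __init__(self, parent, skip):
        super().__init__(parent)
        self._skip = frozenset(skip)
        self._depth = 0  # > 0 while inside a skipped subtree

    def startElement(self, name, attrs):
        if self._depth or name in self._skip:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)

    def ignorableWhitespace(self, chars):
        if not self._depth:
            super().ignorableWhitespace(chars)


def prune(xml_text, skip):
    """Return xml_text re-serialized without the skipped subtrees."""
    out = io.StringIO()
    reader = SubtreeSkipper(xml.sax.make_parser(), skip)
    # XMLGenerator stands in for whatever consumes the events downstream.
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

The filter never builds a tree, so the skipped subtrees cost parsing time 
but no memory, which matches the case where tree building is the bottleneck.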


