Thanks Michael. And congrats on Saxon doing so well: Saxon HE being just as fast as, or faster than, libxslt (ignoring JVM startup time) was one of the surprising results for me.
I agree that benchmarking the usual expected cases is just as important as benchmarking edge cases prone to blowouts. But I am not sure why you think we cannot add the numbers up (or, rather, take the difference between successive tests to estimate the time spent in each stage), given that the engines all seem to be single-threaded for XSLT 1.0 stylesheets and all of them run over completed DOMs rather than interleaved SAX events. I think it shows that for large documents, the cost of XML parsing is utterly dwarfed by the cost of the in-memory data structures and algorithms used for processing.
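Something like the following is what I have in mind (a rough sketch using lxml, which wraps libxslt; the file names are made up and this is not my actual harness). Because the stages run back to back in a single thread over a completed tree, timing each stage directly and differencing successive cumulative runs come to the same thing:

# Sketch only: illustrates why stage times can be separated by subtraction.
# Assumes lxml (libxslt) and hypothetical file names.
import time
from lxml import etree

def timed(label, fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

# Stage 1: parse the source document into an in-memory tree.
doc = timed("parse", etree.parse, "large-input.xml")

# Stage 2: run the transformation over the completed tree. A "parse only"
# run and a "parse + transform" run differ by exactly this figure, since
# nothing overlaps or runs concurrently.
transform = etree.XSLT(etree.parse("stylesheet.xsl"))
result = timed("transform", transform, doc)

# Stage 3: serialise the result; again the next cumulative test differs
# from the previous one by this amount.
output = timed("serialise", etree.tostring, result)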
One of my motivations was having seen a company switch away from XSLT to another technology for efficiency reasons, as a result of benchmarking XSLT against the vendor's ETL product. But I understand they used Xalan-J for the benchmarks, and I suspected the vendor had gamed them to produce bad results.
I guess it is a fallacy of composition: Xalan-J does worse in our benchmarks than our product; Xalan-J is an XSLT engine; therefore all XSLT engines do worse in our benchmarks than our product. That is only correct if all XSLT engines have the same order of performance, and my little benchmarks demonstrate that they do not: so much so that one cannot evaluate "XSLT" performance at all, only the performance of particular engines.
(The same evaluation process also claimed that it was impossible to write rules in XSLT for unexpected paths, as if wildcards did not exist! Maybe there was more to it than was relayed to me, but on the face of it, it is rubbish.)
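For example, a catch-all rule is trivial. Here is a hypothetical sketch (again with lxml, stylesheet inlined; not taken from that evaluation) in which the wildcard template picks up any element the specific rules do not cover:

# Hypothetical sketch: XSLT 1.0 wildcard rules handling "unexpected paths".
from lxml import etree

stylesheet = etree.XML(b"""
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Specific rule for an element we expect. -->
  <xsl:template match="order">
    <handled><xsl:value-of select="@id"/></handled>
  </xsl:template>
  <!-- Wildcard rule: any element not matched above still gets processed. -->
  <xsl:template match="*">
    <unexpected name="{name()}"><xsl:apply-templates/></unexpected>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(stylesheet)
print(transform(etree.XML(b"<orders><order id='1'/><surprise/></orders>")))

In a real stylesheet the wildcard rule could log, copy through, or route the unexpected element; the point is that it is one template, not an impossibility.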