• Daniela Florescu, Chris Hillery, Donald Kossmann, Paul Lucas, Fabio
Riccardi, Till Westmann, Michael J. Carey, Arvind Sundararajan: The BEA
streaming XQuery processor. VLDB Journal Volume 13, Number 3, September 2004
An interesting paper.
(Funny how everyone uses Xalan-J as their baseline for
performance comparisons - I wonder why?)
In the context of streaming, most of the techniques
described are not very different from those used in the better XSLT processors.
One big difference is that XSLT 1.0 is typically implemented using a dual
push/pull model: XSLT instructions use a push pipeline to write a tree, while
XPath expressions use a pull streaming model to read data from trees; whereas
this paper describes a model that uses pull iterators uniformly. If you extend
this all the way to using a pull parser to read the incoming XML data in the
first place (and a pull-based streaming validator), then you do indeed get a
system that avoids the need to construct the input document in memory, in the
special (and probably rather unusual) case where all operations in the query
have a fully streamed implementation.
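To make the uniform pull model concrete, here is a minimal sketch in Python. It is not taken from the paper or from any XSLT processor; the function names (items, prices, over) and the sample document are invented for illustration. Each stage is an iterator that asks the stage below it for the next value, and the parser at the bottom is itself driven by those requests, so nothing is read from the input until the consumer asks for a result.

```python
import io
import xml.etree.ElementTree as ET

def items(source):
    """Bottom stage: pull parse events and yield each <item> element."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            yield elem
            # Runs only when the consumer asks for the next item: drop the
            # attributes and children of the element we have finished with.
            elem.clear()

def prices(elems):
    """Middle stage: pull elements from below, yield their price as a float."""
    for elem in elems:
        yield float(elem.get("price"))

def over(values, limit):
    """Top stage: pull values from below, keep those above the limit."""
    return (v for v in values if v > limit)

# Hypothetical input document, for illustration only.
doc = io.BytesIO(
    b'<order><item price="5"/><item price="12.5"/><item price="20"/></order>'
)

# Nothing has been parsed yet; list() drives the whole pipeline by pulling.
result = list(over(prices(items(doc)), 10))
print(result)
```

Every operation here happens to have a streamed implementation. Add something like a sort or a reverse-axis step and that stage has to buffer its input, which is the caveat made above.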
(Note, however, that the push approach avoids the need to
build the *result* document in memory, and in classic stylesheet applications,
the result document is generally larger than the source
document.)
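A rough sketch of the push side, again in Python and again purely illustrative (write_rows and the sample rows are made up): the instructions push start/end events straight at a serializer, so each piece of the result is written out as soon as it is produced and no result tree is ever held in memory.

```python
import io
from xml.sax.saxutils import XMLGenerator

def write_rows(rows, out):
    """Push result events to a serializer instead of building a result tree."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("table", {})
    for name, qty in rows:
        # Each event is serialized to `out` immediately; nothing is retained.
        gen.startElement("row", {"name": name})
        gen.characters(str(qty))
        gen.endElement("row")
    gen.endElement("table")
    gen.endDocument()

# An in-memory sink for the demonstration; a file or socket would work the
# same way and is where the memory saving actually shows up.
out = io.StringIO()
write_rows([("a", 1), ("b", 2)], out)
print(out.getvalue())
```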
The conclusion of the paper is less than impressive: "The
running times can be improved... 3.8 MB is much larger that what the
implementation of the engine was tuned for..." I'm seeing users doing XSLT
transformations up to 200 MB, despite the limitation that the source document has
to fit in memory! But nevertheless, the architecture looks very solid, and
congratulations to BEA for publishing it, unlike vendors of "high-performance"
XSLT engines who make marketing claims but give us no technical information
to enable an informed assessment or comparison.
Michael Kay
http://www.saxonica.com/