David,
Thanks for your answer.
> Unfortunately, the ones who do not call in the
> consultants simply conclude that XML is too slow and abandon it
> completely.
I find this REALLY, REALLY unfortunate. Here we are, back in the early
days of SQL, when people didn't believe they could get decent
performance unless they hand-coded their file and index management, or
in the early days of Java, when you supposedly couldn't get decent
performance unless you hand-coded your memory management. That's sad.
How is it that some people cannot trust that XML/XSLT/XQuery
performance WILL come? Performance always comes when there is a need
for it.
Moreover, even if the performance were NOT comparable (which I doubt
anyway, but...), the difference in productivity is SO big. Do you know
how many servers can be bought for the cost of those XML consultants'
work? Not to mention that those hard-coded solutions will break with
every single change (e.g. think of schema evolution: what was
streamable yesterday is not streamable today, so here you go, call
back your XML consultants and rewrite your SAX application...).
> So far, no one has shown me a DOM, XSLT, or XQuery-based
> app that is not at least an order of magnitude or two slower than a
> hand-rolled streaming application, and that's not even considering the
> memory overhead.
This I don't believe so easily. I worked for two companies recently
(BEA and Oracle), and both have very, very decent XQuery
implementations. At BEA we put all our efforts into maximizing
streaming, exactly to solve the use cases you are talking about:
thousands of transformations per second per server. As long as I was
there I didn't hear too many complaints about the performance of the
XQuery engine. So I have a hard time believing that XQuery's
performance is the big problem, or that performance will remain a big
problem for a long time.
> because (as you suggest) SAX and STAX are low-level APIs. Coming up
> with commonly-accepted streaming subsets of XPath or XQuery might give
> the best of both worlds: fast prototyping, as with XSLT or XQuery,
> *and* decent performance in a real, production-grade system.
Here I am lost again. Why do we need subsetting? There is a sort of
myth that unless you consider a *subset* of XQuery/XSLT there cannot
be good performance/streaming.
This myth is strange. Of course, not all of XQuery can be executed
with zero memory consumption. But please explain: why is this an
issue?
Just use full XQuery, and leave the task of minimizing memory
consumption to the XQuery implementors. If they can execute your
queries with no memory consumption, they'll do it; otherwise, they'll
just use the minimum amount of memory they need for the given
computation.
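
To make the "minimum amount of memory" point concrete, here is a
minimal Python sketch of the kind of streaming plan an engine could
choose for a query such as sum(//item/@price). It is an illustration
only, not how any particular XQuery engine works; the element name
"item", the attribute "price", and the names SumPrices and stream_sum
are all hypothetical. The only state kept is a running total, no
matter how large the input is.

```python
import io
import xml.sax


class SumPrices(xml.sax.ContentHandler):
    """Hypothetical streaming plan for sum(//item/@price)."""

    def __init__(self):
        super().__init__()
        self.total = 0.0  # the only state carried across the whole stream

    def startElement(self, name, attrs):
        # Each matching element is folded into the total and then forgotten.
        if name == "item" and "price" in attrs:
            self.total += float(attrs["price"])


def stream_sum(xml_bytes):
    """Run the handler over the document without building a tree."""
    handler = SumPrices()
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return handler.total


if __name__ == "__main__":
    doc = b'<order><item price="2.5"/><item price="4"/><note/></order>'
    print(stream_sum(doc))
```

The user would still write only the one-line query; picking a plan
like this one is the implementor's job.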
If they need to automatically rewrite your query into an equivalent
one that enables more streaming, they'll do that too (e.g. rewriting a
backward navigation into a forward one). There are hundreds of
possible optimizations to enable streaming and increase performance.
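
As a sketch of the backward-to-forward rewrite mentioned above, take
the query //author/parent::book, which is equivalent to the
forward-only //book[author]. The Python below evaluates the backward
form on an in-memory tree and the rewritten form in a single forward
pass, and the two agree on the sample document. The document, the
element names, and the names backward_ids, BooksWithAuthor, and
forward_ids are made up for illustration; a real engine would do this
rewrite internally and far more generally.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = b"""<lib>
  <book id="b1"><author>A</author></book>
  <book id="b2"><title>No author</title></book>
  <book id="b3"><author>B</author><author>C</author></book>
</lib>"""


def backward_ids(doc):
    """Tree-based evaluation of the backward form //author/parent::book."""
    root = ET.fromstring(doc)
    parents = root.findall(".//author/..")  # needs the whole tree in memory
    return [p.get("id") for p in parents if p.tag == "book"]


class BooksWithAuthor(xml.sax.ContentHandler):
    """Forward-only evaluation of the rewritten form //book[author]."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one [tag, id, has_author_child] frame per open element
        self.ids = []

    def startElement(self, name, attrs):
        if name == "author" and self.stack:
            self.stack[-1][2] = True  # mark the enclosing element
        self.stack.append([name, attrs.get("id"), False])

    def endElement(self, name):
        tag, elem_id, has_author = self.stack.pop()
        if tag == "book" and has_author:
            self.ids.append(elem_id)


def forward_ids(doc):
    handler = BooksWithAuthor()
    xml.sax.parse(io.BytesIO(doc), handler)
    return handler.ids


if __name__ == "__main__":
    print(backward_ids(DOC), forward_ids(DOC))
```

The forward version holds one small frame per open element, so its
memory grows with document depth rather than document size. Note that
it reports each book when the book closes, so with nested books the
result order could differ from document order; the sample avoids that
case.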
Why do you think we need XQuery/XSLT subsetting?
Best regards,
Dana