OASIS Mailing List Archives

   Re: [xml-dev] Streaming XML (WAS: More on taming SAX (was Re: [xml-dev]


David,

Thanks for your answer.

>  Unfortunately, the ones who do not call in the
> consultants simply conclude that XML is too slow and abandon it
> completely.

I find this REALLY, REALLY unfortunate. Here we are, back in the early
days of SQL, when people didn't believe they could get decent
performance unless they hand-coded their file and index management, or
in the early days of Java, when you couldn't get decent performance
unless you hand-coded your memory management... that's sad...

How is it that some people cannot trust that XML/XSLT/XQuery
performance WILL come?
Performance always comes when there is a need for it.

Moreover, even if the performance were NOT comparable (which I doubt
anyway, but...), the difference in productivity is SO big... Do you know
how many servers can be bought for the cost of those XML consultants'
work? Not to mention that those hard-coded solutions will break with
every single change (e.g. think of schema evolution: what was streamable
yesterday is not streamable today, so here you go, call back your XML
consultants and rewrite your SAX application...).

> So far, no one has shown me a DOM, XSLT, or XQuery-based
> app that is not at least an order of magnitude or two slower than a
> hand-rolled streaming application, and that's not even considering the
> memory overhead.

This I don't believe so easily. I worked for two companies recently
(BEA and Oracle), and they both have very, very decent XQuery
implementations.

At BEA we put all our efforts into maximizing streaming, exactly to
solve the use cases you are talking about: thousands of transformations
per second per server. As long as I was there, I didn't hear too many
complaints about the performance of the XQuery engine.

So I have a hard time believing that XQuery's performance is the big
problem, or that performance will remain a big problem for long.

> because (as you suggest) SAX and STAX are low-level APIs.  Coming up
> with commonly-accepted streaming subsets of XPath or XQuery might give
> the best of both worlds: fast prototyping, as with XSLT or XQuery,
> *and* decent performance in a real, production-grade system.

Here I am lost again. Why do we need subsetting?

There is a sort of myth that unless you consider a *subset* of
XQuery/XSLT there cannot be good performance/streaming.

This myth is strange. Of course, not all of XQuery can be executed with
zero memory consumption.
But please explain: why is this an issue?
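
To illustrate the distinction, here is a minimal sketch in Python using
the standard xml.sax module. The handler names and the sample document
are made up for illustration and say nothing about how a real XQuery
engine is built. Something like count(//item) needs only a running
counter, while an "order by" over the same items has to buffer every
value before the first result can be emitted.

```python
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Roughly count(//item): the only state kept is one integer."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


class ItemCollector(xml.sax.ContentHandler):
    """Collects the text of every <item> so it can be sorted afterwards.

    Assumes <item> elements are not nested (illustration only).
    """

    def __init__(self):
        super().__init__()
        self.values = []   # grows with the number of items
        self._buf = None   # text pieces of the <item> currently open

    def startElement(self, name, attrs):
        if name == "item":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "item" and self._buf is not None:
            self.values.append("".join(self._buf))
            self._buf = None


# Hypothetical sample input.
doc = b"<list><item>pear</item><item>apple</item><item>fig</item></list>"

counter = ItemCounter()
xml.sax.parseString(doc, counter)

collector = ItemCollector()
xml.sax.parseString(doc, collector)
ordered = sorted(collector.values)  # sorting needs all values at once
```

Both are written by hand here only to show where the memory goes; the
point is that an engine can make this choice per query.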

Just use full XQuery, and leave the task of minimizing memory
consumption to the XQuery implementors. If they can execute your
queries with no memory consumption, they'll do it; otherwise, they'll
just use the minimum amount of memory they need for the given
computation. If they need to automatically rewrite your query into an
equivalent one that enables more streaming, they'll do that too (e.g.
rewriting a backward navigation into a forward one).
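
As one possible example of such a rewrite, the backward path
//price/parent::book/title selects the same nodes as the forward path
//book[price]/title. The Python sketch below evaluates the forward form
with the standard library's ElementTree.iterparse; the function name and
the sample catalog are assumptions for illustration, not taken from any
XQuery engine. It holds one <book> subtree at a time rather than the
whole document.

```python
import io
import xml.etree.ElementTree as ET


def titles_of_priced_books(source):
    """Yield the text of //book[price]/title while parsing incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            # The whole <book> has been read, so its children are available.
            if elem.find("price") is not None:
                title = elem.find("title")
                if title is not None:
                    yield title.text
            # Drop the finished subtree. ElementTree still keeps the
            # emptied <book> shell attached to its parent.
            elem.clear()


# Hypothetical sample input.
doc = (
    b"<catalog>"
    b"<book><title>A</title><price>10</price></book>"
    b"<book><title>B</title></book>"
    b"<book><title>C</title><price>7</price></book>"
    b"</catalog>"
)

result = list(titles_of_priced_books(io.BytesIO(doc)))
```

The memory needed is bounded by the largest <book>, which is the
"minimum amount of memory for the given computation" idea in miniature.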

There are hundreds of possible optimizations to enable streaming and
increase performance.

Why do you think we need XQuery/XSLT subsetting?

Best regards,
Dana







 
