XQuery: is FLWR a <xsl:foreach/> ?

At 06:38 AM 2/23/2001 -0800, Evan Lenz wrote:

>Jonathan Robie wrote:
> > To a database person, it is somewhat surprising that your
> > paper does not explicitly mention joins, which are one of
> > the biggest reasons for FLWR expressions in XQuery. Joins
> > are central to database functionality, and it is important
> > to express them in a way that allows optimization based on
> > patterns detected in the expressions. I also notice that the
> > examples in your paper do not include any examples from
> > Section 3 of the XQuery paper, which shows how conventional
> > SQL-like queries are done.
>That's because Section 3 does not introduce any new query functionality.
>Using joins over an XML view of a relational database is just another use
>of the FLWR expression. The XSLT mappings to these are just as determinate
>as the rest.

I think this is a central difference between our views. One of the reasons
for FLWR expressions is that there is an extensive literature on optimizing
these kinds of expressions in SQL, OQL, and various tree-structured
languages related to them. Although it may be possible to perform similar
optimizations on XSLT, this clearly falls in the "future research"
category. The fundamentally important issue is this: how does a query
optimizer recognize patterns in the query, correlate them with information
about the schema and the data, and rewrite the query in ways that can be
proven equivalent and that run much faster?

For instance, suppose I have the following XQuery:

FOR $i IN //invoice,
         $p IN distinct($i//product)
WHERE $i/customer = "ACME"
              AND $p/name = "screwdriver"
RETURN $p, $i/date

The query optimizer should be able to see that the WHERE conditions can be 
lifted up into the XPath:

FOR $i IN //invoice[customer="ACME"][.//product/name="screwdriver"],
         $p IN distinct($i//product[name="screwdriver"])
RETURN $p, $i/date
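
To make the rewrite concrete, here is a minimal Python sketch of both
evaluation strategies over a small in-memory list. The data, the dictionary
layout, and the function names (distinct, naive, pushed_down) are all
hypothetical stand-ins, not part of XQuery or of any real engine. The
rewritten form filters invoices before binding products and keeps the
product-name test on the inner loop, so both forms return the same pairs.

```python
# Hypothetical in-memory data standing in for //invoice (an assumption for illustration).
invoices = [
    {"customer": "ACME", "date": "2001-02-01",
     "products": [{"name": "screwdriver"}, {"name": "hammer"}]},
    {"customer": "ACME", "date": "2001-02-03",
     "products": [{"name": "wrench"}]},
    {"customer": "Initech", "date": "2001-02-05",
     "products": [{"name": "screwdriver"}]},
]


def distinct(products):
    """Drop duplicate products by name, keeping first-seen order."""
    seen = set()
    out = []
    for p in products:
        if p["name"] not in seen:
            seen.add(p["name"])
            out.append(p)
    return out


def naive(invoices):
    """Bind every (invoice, product) pair first, then test both conditions."""
    result = []
    for i in invoices:
        for p in distinct(i["products"]):
            if i["customer"] == "ACME" and p["name"] == "screwdriver":
                result.append((p["name"], i["date"]))
    return result


def pushed_down(invoices):
    """Filter invoices up front, then bind only the matching products."""
    candidates = [
        i for i in invoices
        if i["customer"] == "ACME"
        and any(p["name"] == "screwdriver" for p in i["products"])
    ]
    result = []
    for i in candidates:
        for p in distinct(i["products"]):
            if p["name"] == "screwdriver":
                result.append((p["name"], i["date"]))
    return result
```

The point of the second form is that the invoice filter is now a single
expression that an engine could answer from an index instead of a scan.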

Now the query optimizer can check whether the datastore has an index on
customer or on product name. Perhaps the indexes also have the quantities
of the items. If there are tens of thousands of invoices for ACME, but only
one invoice for a screwdriver, then the optimizer will choose a different
plan than it would if there were tens of thousands of invoices for
screwdrivers, but only one for ACME.
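
The choice described above can be sketched in Python as follows. This is
only an illustration under assumptions: the two dictionaries stand in for
indexes that map a value to invoice ids, the counts are made up, and
choose_access_path and matching_invoices are hypothetical names. A real
optimizer would work from statistics, not from the full lists.

```python
# Hypothetical indexes: value -> list of invoice ids (made-up numbers for illustration).
customer_index = {"ACME": list(range(20000)), "Initech": [20000]}
product_index = {"screwdriver": [17], "hammer": list(range(100, 5000))}


def choose_access_path(customer, product):
    """Pick the index whose entry for the requested value is smaller."""
    by_customer = customer_index.get(customer, [])
    by_product = product_index.get(product, [])
    if len(by_product) <= len(by_customer):
        return "product", by_product
    return "customer", by_customer


def matching_invoices(customer, product):
    """Walk the smaller candidate list and check the other condition by set lookup."""
    chosen, candidates = choose_access_path(customer, product)
    if chosen == "product":
        other = set(customer_index.get(customer, []))
    else:
        other = set(product_index.get(product, []))
    return [i for i in candidates if i in other]
```

With these numbers, a query for ACME and "screwdriver" starts from the
product index (one candidate), while a query for Initech and "hammer"
starts from the customer index.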

At any rate, my own knowledge of query optimization is not deep, so I don't 
want to play the expert here. Guido Moerkotte has written an excellent 
survey on query optimization techniques which you can access here:


If you want us to use XSLT syntax directly in place of our FLWR
expressions, I need to know the answers to questions like these:

1. What are the equivalences that can be exploited for query optimization?
2. What are the typing rules for the possible <xsl:for-each> constructs?
3. How are the various possible <xsl:for-each> constructs translated into
SQL? (fill in your favorite environment in place of SQL)

Do you know of any good work in these areas? Please don't ask me to do it
myself, or ask for proof that it cannot be done. If we want a solution in
a reasonable amount of time, we should build on work that exists.


There are also aspects of XQuery optimization that fall solidly into the 
"future research" category.

These are my opinions right now. They may be quite different from the 
opinions of Software AG, the W3C XML Query Working Group, or the opinions 
that I will have after reading and considering your response.