   QLRE: [xml-dev] Streaming XML (WAS: More on taming SAX (was Re: [

  • To: "Daniela Florescu" <dflorescu@mac.com>,"Karl Waclawek" <karl@waclawek.net>
  • Subject: QLRE: [xml-dev] Streaming XML (WAS: More on taming SAX (was Re: [xml-dev] ANN: Amara XML Toolkit 0.9.0))
  • From: "Arpan Desai" <arpande@microsoft.com>
  • Date: Mon, 27 Dec 2004 21:09:40 -0800
  • Cc: "XML Developers List" <xml-dev@lists.xml.org>
  • Thread-index: AcTshgOq3nicykctTeyn4YjjFdMtVQAAbo6g
  • Thread-topic: QLRE: [xml-dev] Streaming XML (WAS: More on taming SAX (was Re: [xml-dev] ANN: Amara XML Toolkit 0.9.0))

I think one feature a lot of people are looking for is processing
predictability; something SQL databases don't really offer, even today.

If I write a query, am I guaranteed the database is going to do the
right thing?  Nope.  If it does the right thing today, will it continue
to do the right thing tomorrow?  All I can do is cross my fingers and
hope.  If declarative languages/database statistics/intelligent
optimizers always do the right thing, why do the vast majority of
enterprise databases ship with some sort of query plan analyzer?  Why do
most of them enable the user to inject the explicit use of indices in
the query?  Because sometimes, maybe more often than most of us in the
database world would like to admit, they do the not-so-right thing.
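To make the two escape hatches above concrete (a query plan analyzer and an explicit index reference), here is a minimal sketch using SQLite through Python's sqlite3 module. The table, index name, and data are made up for illustration, and other databases spell these features differently.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("c%d" % (i % 50), float(i)) for i in range(1000)],
)


def plan(sql, params=()):
    """Return the human-readable detail text of SQLite's chosen plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail string


# Let the optimizer decide, then look at what it decided.
auto_plan = plan("SELECT total FROM orders WHERE customer = ?", ("c7",))

# Name the index explicitly; SQLite refuses to prepare the statement
# if it cannot use that index for this query.
forced_plan = plan(
    "SELECT total FROM orders INDEXED BY idx_orders_customer WHERE customer = ?",
    ("c7",),
)

print(auto_plan)
print(forced_plan)
```

The first plan is whatever the optimizer picks today; the second is pinned, at the cost of hard-coding a physical detail into the query text.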

When we were coming up with the subset of XPath 1.0 for the XPathReader,
previously referenced by Oleg Tkachenko on this thread, a core tenet we
put forth was the ability to _guarantee_ its performance
characteristics.  Intelligent decision-making factors like statistics
and histograms were simply unavailable to us, so we sought to ensure the
problem of doing the wrong thing simply couldn't arise.  By only
allowing a defined, streamable subset of XPath, we not only guaranteed
the execution characteristics; we were also able to inform the user of
them a priori, before actual execution.
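A rough sketch of the idea, not the XPathReader's actual grammar or implementation: the toy subset below accepts only absolute, child-axis paths with plain name tests (such as /feed/entry/title), rejects everything else before any input is read, and evaluates in a single forward pass. The names compile_path and stream_match are hypothetical, and namespaces are ignored.

```python
import io
import re
import xml.etree.ElementTree as ET

# Assumed toy subset: "/" followed by simple element names, nothing else.
NAME = re.compile(r"[A-Za-z_][\w.-]*")


def compile_path(path):
    """Validate the path up front; raise if it is outside the subset."""
    parts = path.split("/")
    if len(parts) < 2 or parts[0] != "" or not all(NAME.fullmatch(p) for p in parts[1:]):
        raise ValueError("outside the streamable subset: %r" % path)
    return parts[1:]


def stream_match(source, path):
    """Return an iterator over the leading text of each matching element."""
    steps = compile_path(path)  # rejected here, before any input is read
    return _matches(source, steps)


def _matches(source, steps):
    tags, elems = [], []  # the current root-to-node path
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            tags.append(elem.tag)
            elems.append(elem)
        else:
            if tags == steps:
                yield elem.text or ""
            tags.pop()
            elems.pop()
            if elems:
                elems[-1].remove(elem)  # drop the finished subtree


doc = (b"<feed><entry><title>A</title></entry>"
       b"<entry><title>B</title><note><title>skip</title></note></entry></feed>")
titles = list(stream_match(io.BytesIO(doc), "/feed/entry/title"))
print(titles)
```

Because finished subtrees are detached as soon as they close, the work per event depends on document depth rather than document size, and a query such as //title fails at compile_path instead of silently running a more expensive plan.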

Overall, I agree that powerful, declarative languages have solved a lot
of problems...and that the burden of performance should fall on the
implementers.  However, I also think there's a large segment of cases
that aren't adequately handled by current query processors.  Two broad
categories I can think of right now are:
1. The developer/query writer knows the bounds of their specific
use-case to the extent where a general-purpose optimizer simply can't do
as good a job as something they tweak themselves, sometimes by orders of
magnitude.
2. The developer/query writer wants some sort of guarantee.  Whether
this is through some sort of governor mechanism or by designed
restriction of the processor, they simply want some sort of
up-front assurance of performance.
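One way to picture the governor mechanism from the second category is a hard work budget on each statement. The sketch below is an assumption-laden illustration using SQLite's progress handler through Python's sqlite3 module; run_with_budget and its parameters are hypothetical names, and the budget is counted in handler callbacks rather than in any unit of time.

```python
import sqlite3


def run_with_budget(conn, sql, max_callbacks, every_n_ops=1000):
    """Run sql, aborting once the handler has fired more than max_callbacks times."""
    calls = 0

    def governor():
        nonlocal calls
        calls += 1
        # A non-zero return value tells SQLite to abort the running statement.
        return 1 if calls > max_callbacks else 0

    # SQLite invokes the handler roughly every `every_n_ops` VM instructions.
    conn.set_progress_handler(governor, every_n_ops)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.set_progress_handler(None, 0)  # uninstall the governor


conn = sqlite3.connect(":memory:")
cheap = run_with_budget(conn, "SELECT 1", max_callbacks=10)

# A deliberately expensive query: count to five million with a recursive CTE.
expensive_sql = (
    "WITH RECURSIVE c(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM c WHERE x < 5000000) "
    "SELECT count(*) FROM c"
)
try:
    run_with_budget(conn, expensive_sql, max_callbacks=10)
    aborted = False
except sqlite3.OperationalError:
    aborted = True
print(cheap, aborted)
```

This gives an upper bound on work after the fact; the designed-restriction approach instead rules out the expensive query before it ever starts.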

One potential solution I can think of to solve both of these problems is
opening up different portions of the processing pipeline so that users
aren't forced to implement "non-elegant" solutions, such as explicitly
referencing an index in their SQL statement.  Instead, allow them to
alter the heuristics of the optimizer or even inject their own custom
logic during logical/physical plan generation.  On the other hand, I'm
sure many people (even/especially? those working in my building)
wouldn't necessarily characterize this solution as "elegant".  Heck, I
might even be in that boat :)

At the end of the day, I don't think there's a one-size-fits-all
solution.  Of course, who knows what tomorrow will bring.

Happy holidays,
Arpan Desai

-----Original Message-----
From: Daniela Florescu [mailto:dflorescu@mac.com] 
Sent: Monday, December 27, 2004 6:31 PM
To: Karl Waclawek
Cc: XML Developers List
Subject: Re: [xml-dev] Streaming XML (WAS: More on taming SAX (was Re:
[xml-dev] ANN: Amara XML Toolkit 0.9.0))

>  I was quite surprised
> to learn that some of my queries were significantly
> easier to understand and write using xBase (index based
> iterators, linked with other iterators if necessary)
> than using SQL.

!? Com' on Karl.

If you add/remove an index you have to rewrite your xBase
application. If the volume of data or the frequency of queries
changes you have to rewrite your application. Etc. Etc. Etc.

Do I have to say anything more !?

We are 30 years after Ted Codd, and a many, many billion dollar
market later. I find it really strange that I need to argue in favor of
declarativity in 2004, almost 2005.

Best regards,

The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://www.oasis-open.org/mlmanage/index.php>

