RE: Against the Grain: Pascal commentary about XML and databases
- From: Joshua Allen <joshuaa@microsoft.com>
- To: Mike.Champion@SoftwareAG-USA.com
- Date: Thu, 28 Jun 2001 16:01:55 -0700
>I keep hoping that there is some middle ground where the rigorous mathematics of the
>relational model and the pragmatic usability of XML can meet and inform one another. In
>private correspondence, Mr. Pascal assured me that a truly mathematical model of XML is
>impossible, but I'm keeping an open mind.
Hehe, this is pretty good reading. The only reason that RDBMS software
dominates the market right now is that we are good at solving these
problems, and RDBMS design has evolved to keep users from asking
questions that the database isn't good at answering. The fact that we
ship databases that only permit things that we know how to answer
efficiently does NOT imply that we will never be able to answer other
questions more efficiently (in fact, RDBMS products have evolved and
gobbled up much of the research on data warehousing, folding those
techniques into the engines -- witness materialized views and bitmap
indexes). It is quite easy to see a trend in the industry that shows
steady progress at solving hard query problems. Of course
some problems will always be hard (distributed cost-based query
optimization is one), but I would point out that research on RDBMS
optimizations has tapered off quite a bit and we have seen major
increases in research geared towards semi-structured data in the past
decade. So we are simply easing off on some of the traditional RDBMS
constraints and beginning to allow things like recursive self-joins,
ragged hierarchies, etc. and we are optimizing these things. I mean, we
already solved the RDBMS optimization challenge (and remember that there
were people predicting that SQL would never fly back in 1980) and now it
is time to move to the next thing. XML seems like a very appropriate
evolutionary step.
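To make "recursive self-joins" and "ragged hierarchies" concrete, here is a minimal sketch, assuming a present-day SQLite driven from Python. The `node` table, its columns, and the helper names `build_demo` and `descendants` are all made up for illustration; nothing here is taken from any particular product. It stores a hierarchy whose leaves sit at different depths in a single self-referencing table and walks it with a recursive common table expression, which is one standard way to express a recursive self-join.

```python
import sqlite3


def build_demo():
    """Create an in-memory table holding a small ragged hierarchy (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT,"
        " parent_id INTEGER REFERENCES node(id))"
    )
    # Leaves end up at depths 1, 2 and 3, so the hierarchy is ragged.
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?)",
        [
            (1, "catalog", None),
            (2, "book", 1),
            (3, "title", 2),
            (4, "chapter", 2),
            (5, "section", 4),
            (6, "note", 1),
        ],
    )
    return conn


def descendants(conn, root_id):
    """Return (id, name, depth) for root_id and every node below it."""
    query = """
        WITH RECURSIVE subtree(id, name, depth) AS (
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            -- the self-join: each pass attaches the children of rows found so far
            SELECT n.id, n.name, s.depth + 1
            FROM node AS n JOIN subtree AS s ON n.parent_id = s.id
        )
        SELECT id, name, depth FROM subtree ORDER BY depth, id
    """
    return conn.execute(query, (root_id,)).fetchall()


if __name__ == "__main__":
    for row in descendants(build_demo(), 1):
        print(row)
```

The query never needs to know how deep the hierarchy goes; the engine keeps joining until no new rows appear.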
As for saying that a truly mathematical model of XML is impossible: XML
is simply a node-labeled graph. This is about as pure a discrete
mathematics concept as you can get. It is easy to find graph traversal
challenges that are NP-hard or need O(n^2) or worse. So? I think that
areas of discrete mathematics that deal with graphs are currently the
most vibrant area of research in the industry. The web itself is one
huge graph structure, and research on ways to index the web, optimize
routing, etc. all feed directly into techniques for optimizing XML
processing. And it seems that TSPs and NP-hard optimization problems are
all the rage these days. XML *is* math, and it's the *cool* math these days.
Data processing married with XML is about as real as it gets.
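As a rough illustration of the "node-labeled graph" view, here is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser. The function name `xml_to_graph` and the integer node ids are invented for this example. It keeps only elements (attributes and text are ignored), so the result is a tree; ID/IDREF-style references would add further edges and are not handled here.

```python
import xml.etree.ElementTree as ET


def xml_to_graph(xml_text):
    """Turn an XML document into (labels, edges).

    labels maps an integer node id to the element's tag;
    edges is a list of (parent_id, child_id) pairs.
    """
    root = ET.fromstring(xml_text)
    labels = {}
    edges = []
    next_id = 0
    stack = [(root, None)]
    while stack:
        elem, parent_id = stack.pop()
        node_id = next_id
        next_id += 1
        labels[node_id] = elem.tag
        if parent_id is not None:
            edges.append((parent_id, node_id))
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(elem)):
            stack.append((child, node_id))
    return labels, edges


if __name__ == "__main__":
    labels, edges = xml_to_graph("<a><b><c/></b><d/></a>")
    print(labels)
    print(edges)
```

Once the document is in this form, ordinary graph algorithms (reachability, shortest paths, pattern matching) apply to it directly.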
But I know this is all a twice-told tale for you, Mike.
Regards,
Joshua