   Re: [xml-dev] Future of Databases


Ken North wrote:
> Perhaps we'll hear from Ron Bourret. It was his question whether XQuery
> would displace SQL.

I've summarized my notes below.

DISCLAIMER: Everything is paraphrased, not quoted. Also, I'm not a
trained journalist -- that is, I can't think and take notes at the same
time -- so if something strikes you as particularly astonishing, it
probably wasn't and I'm to blame. Same goes with answers that are merely
incomprehensible. Hopefully Ken can correct me as necessary. (And my
apologies to Edd Dumbill for not writing this up as an xmlhack article.)

OVERVIEW:

Basically, my take on the panel was that it summarized what is current
wisdom in the XML/database world, especially as seen through the eyes of
people with strong relational backgrounds. In particular:

1) Relational databases aren't going away.
2) XML is/will become the dominant exchange format, with XML front ends
to everything.

As the InfoWorld article said, the one area of controversy was the role
of XML databases. Rick Cattell saw them as niche players (although
relational databases will support native XML) while Daniela Florescu saw
them in a much larger role.

QUESTIONS AND ANSWERS:

Q. SQL is a formal standard. Java and XML are not formal standards. Does
this matter?

Melton: Standards are important. Without them, you don't get
interoperability. What is less important is whether these are de jure or
de facto. Note that ANSI, ISO, and the W3C all work with each other now.
Also important is that de jure standards come in geologic time, not
Internet time, although this is changing with fast-track processes.

-------------------------------

Q. Is it possible to do transaction processing over the Internet and
scale to millions?

Gray: Google and Hotmail do this. They have 10,000 processors
[servers?]. Google is replicable and Hotmail is partitionable, so
scaling really depends on the application. Traditional applications are
1000 transactions/second.

<aside speaker="Gray">The cost of computer management is greater than
capital cost, so we need self-managing computers.</aside>

-------------------------------

Q. XQuery is document-centric. What can we expect?

Chamberlin: Database people are still trying to catch up with the 90s,
when all computers were connected together. A laptop can "hold" all of
the works of mankind. Most are semi-structured or un-structured. Many
come from streaming sources. For the last eight years, the database
industry has been trying to solve this. [RPB: I believe this refers to
the various attempts by relational databases to store/query text, as
well as to add metadata to BLOBs, etc.]

Predictions:
1) XML is/will become the dominant format for data interchange. It is
flexible and self-describing.
2) Applications will want to query data in the same format they exchange
it -- that is, they want to view all sources as XML.
3) RDBMSs and SQL won't go away. They are good for homogeneous data.
Instead, they will add XML front ends.
4) Other data sources will add XML front ends as well.
5) Large RDBMS vendors will make XML a first class citizen.

-------------------------------

Q. With document-centric XML [in the main?], will tools follow?

Chamberlin: Yes. It is still the early days. Updates, transactions,
indexing, etc. are not yet completely addressed.

-------------------------------

Q. Lots of other industries have consolidated. Will the software
industry consolidate into a single monolithic corporation?

Cattell: We need multiple vendors to keep innovation alive. There are
only 3 1/2 database vendors left. At this level, de jure processes are
appropriate. JDBC and similar technologies [which are newer?] can move
faster.

[In answer to Chamberlin?] Very few people store/query XML -- the
momentum of relational databases is simply insurmountable. XML databases
will be a niche only, for use in caching, etc. There is a bigger market
for XML-to-relational translators.

-------------------------------

Q. E-commerce is exchanging XML. RDBMSs can process 50K rows/second
while parsers can only parse 2-3K XML documents per second. Is XML
optimizable?

Florescu: I am optimistic about XML databases. XQuery should be equally
optimizable [to SQL?]. There are three impedance mismatches in Web
services: Web to XML, XML marshalled to Java, and Java marshalled to
RDBMS. There is therefore a market for a language that joins XML, Java,
and SQL.

-------------------------------

Q. What about peer-to-peer databases?

Chamberlin: If the data is easily replicated, doesn't belong to anyone,
and is read-only, then peer-to-peer databases are a good idea. Bank of
America is going to want more control over their data.

Gray: Napster is a good example, although they didn't own the data.
SETI@Home was one third of the bandwidth in the University of
California, which caused problems. The solution was to increase the size
of computational pieces, therefore reducing the overall bandwidth use.
Similarly, peer-to-peer sends lots of data around. The problem with this
is that sending data around is expensive, while local computing is
essentially free.
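
One way to read the point about larger computational pieces is as
simple arithmetic: if each client spends longer computing on the same
amount of downloaded data, the aggregate download rate falls in
proportion. The sketch below is only an illustration of that reading.
The function name and every number in it are hypothetical and do not
come from the panel or from SETI@Home.

```python
def aggregate_bandwidth(clients, bytes_per_unit, compute_seconds_per_unit):
    """Average bytes/second downloaded by all clients combined.

    Assumes each client fetches one work unit, computes on it for
    compute_seconds_per_unit, then fetches the next one.
    """
    return clients * bytes_per_unit / compute_seconds_per_unit


# Hypothetical figures, chosen only to show the proportionality.
before = aggregate_bandwidth(clients=1000, bytes_per_unit=350_000,
                             compute_seconds_per_unit=3600)

# Same data per unit, but four times as much computation per unit.
after = aggregate_bandwidth(clients=1000, bytes_per_unit=350_000,
                            compute_seconds_per_unit=4 * 3600)

print(f"before: {before:.0f} bytes/s, after: {after:.0f} bytes/s")
```

Under these assumptions, quadrupling the computation per work unit cuts
the aggregate bandwidth to a quarter.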

Cattell: There is a growth market for peer-to-peer databases. Knowledge
management is a growth area for non-relational databases.

-------------------------------

Q. What is the future of databases -- T-spaces, in-memory databases,
etc.?

Cattell: In-memory databases are a no-brainer. T-spaces are interesting.
They are a database, operating system, messaging system, etc. all rolled
into one, although they won't take over the world yet.

Gray: A variant of T-spaces is work flow. Flows can be described in XML
to dovetail together.

-------------------------------

Q. In the 90s it was fashionable to say that databases were dead, that
the Web exceeded databases. Will something replace databases?

Melton: Databases are increasingly needed. Storing traditional data is
solved. Storing text is solved. There are new problems, such as
searching video and audio. We need joins across different types [RPB:
e.g. email and video].

Chamberlin: There are lots of new challenges -- decades of Ph.D. work.
XML queries are structurally different from relational queries. XML data
is heterogeneous. For example, "Find all the red stuff" returns a cherry,
a stop sign, etc. Data is ordered, which causes optimization problems.
You can ask questions about both data and metadata such as, "What kinds
of things are red?" There are new ways to deal with sparse data, which
requires lots of nulls in a relational database. XML handles this data.
The logic around nulls is different. XML databases need a different way
to construct things due to using a hierarchy.
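
The "red stuff" example can be made concrete with a small sketch. It
uses Python's standard xml.etree.ElementTree module and its limited
XPath support rather than XQuery, and the sample document, element
names, and attribute names are all invented for illustration. It shows
a data query ("find all the red stuff") and a metadata query ("what
kinds of things are red?") running over heterogeneous elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: different kinds of elements with
# different attributes, all in one document.
DOC = """
<things>
  <fruit name="cherry" color="red"/>
  <sign name="stop sign" color="red" shape="octagon"/>
  <fruit name="banana" color="yellow"/>
  <vehicle name="fire engine" color="red" wheels="6"/>
</things>
"""

root = ET.fromstring(DOC)

# Data query: every element, of any kind, whose color is red.
red_things = root.findall(".//*[@color='red']")
red_names = [e.get("name") for e in red_things]

# Metadata query: which element names (kinds) occur among the red things.
red_kinds = sorted({e.tag for e in red_things})

print(red_names)
print(red_kinds)
```

Results come back in document order, which is the ordering property
mentioned above as a complication for optimizers.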

Cattell: Anybody who says that databases are dead means that relational
databases are mature and you can't find a thesis topic.

-------------------------------

Q. How do we query the Web? Is the goal to query data or to describe it
so it can be queried?

Cattell: Data needs to be described before it can be queried.

Florescu: You can query XML data without a schema, due to its
self-describing nature. Vertical applications need schemas.

Melton: There are no silver bullets. The great majority of the world's
data has no metadata and probably won't ever have any. Only commercially
valuable data will get metadata. We are trying to find ways to query all
types of data.

Gray: There are lots of data sources. For example, LDAP has 7 mandatory
fields and 1000 optional fields. This fits XML well. Similar sources are
email, schedules, etc. XML will be the standard interchange language,
which encourages data sources to expose themselves as XML.
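
A small sketch may help show why sparse sources of this kind fit XML.
A fixed-column row needs a NULL for every absent optional field, while
an XML entry simply omits the element. The field lists below are a
tiny stand-in for the 7 mandatory and 1000 optional fields mentioned,
and the entry itself is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: 2 mandatory and 5 optional fields
# (standing in for a much longer optional list).
MANDATORY = ["cn", "sn"]
OPTIONAL = ["mail", "telephoneNumber", "title", "roomNumber", "pager"]

# An invented entry that fills only one of the optional fields.
entry = {"cn": "Ann Example", "sn": "Example", "mail": "ann@example.org"}


def to_row(entry):
    """Relational-style row: one slot per column, None (NULL) if absent."""
    return tuple(entry.get(col) for col in MANDATORY + OPTIONAL)


def to_xml(entry):
    """XML form: only the fields actually present become child elements."""
    person = ET.Element("person")
    for name, value in entry.items():
        ET.SubElement(person, name).text = value
    return person


row = to_row(entry)
xml_entry = to_xml(entry)
null_count = sum(value is None for value in row)

print(row)
print(ET.tostring(xml_entry, encoding="unicode"))
print(f"{null_count} NULLs in the row, {len(xml_entry)} elements in the XML")
```

With 1000 optional columns the row would carry close to 1000 NULLs,
while the XML entry would stay the same size.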

-------------------------------

Q. Are there context sensitive searches in XML, such as in the context
of the previous query?

Chamberlin: This is not addressed in XQuery.

Melton: You can use successive refinement against temporary results.

[RPB: I think somebody pointed out that XQuery is composable?]

-------------------------------

Q. What do you think of AMDs (associative model databases)?

[RPB: The panel didn't really understand the question. Neither did I. As
near as I could tell, an AMD is where you store all your data in, for
example, two tables. One contains individual data values and the other
contains information about associations/links between data values.]

Gray: We already have an associative data model: SQL and XQuery are
associative.

-------------------------------

Q. How will XQuery be used? That is, will distributed vs. local queries
affect optimization strategies?

Florescu: Yes. Local and distributed queries are optimized differently.
This will mean different implementations for different markets.

-------------------------------

Q. Relational databases have an algebra. Does XQuery have an algebra?
Will XML databases replace relational databases?

Melton: Relational databases have already integrated object technology.
Similarly, they will integrate XML technology. Yes, there is a formal
model for XQuery, but not as formal as the relational model.

Florescu: We will formalize the model in the W3C. It is not as elegant
as the relational model.

-------------------------------

Q. Will XML influence screen scrapers, etc.?

Gray: It should reduce the need to screen scrape. Note that there are no
eyeballs for XML. The customers are programs.

-------------------------------

Q. Will XQuery replace SQL as the application query language of choice?

Chamberlin: No.

Florescu: Not in five years, but in ten years an extension of XQuery
might replace both.

Melton: Don't underestimate relational databases.

Cattell: No. Another language might replace them. For example, something
Google-like.

-- Ron




 


Copyright 2001 XML.org. This site is hosted by OASIS