xml-dev - RE: XML database

From: Evan Lenz <elenz@xyzfind.com>
To: xml-dev@lists.xml.org
Date: Mon, 09 Oct 2000 09:59:51 -0700

[similar post recently submitted to XML-L]

This is largely a "religious" issue.  Database die-hards vs. XML die-hards.
Some of the arguments for one over the other don't really stand up against
each other; they come down to personal preference or what one's familiar
with.

The Oracles of the world have figured out how to build robust, scalable,
efficient systems based on the relational data model.  XML repository
systems are only beginning to show up on the scene and a lot of issues
(transactions, speed, etc.) are far from being solved.  Instead of assuming
they can't be (or assuming they can), I'd like to focus on what I see as a
real advantage of an XML repository over an RDBMS.

Unlike relational databases, XML documents do not require a pre-defined
schema.  Thus an XML repository/information retrieval system could receive
new documents containing new elements without breaking the system.  Yes,
with a relational database, you could simply add a new table, but that's a
manual task of changing the *schema*.  I'm not suggesting that XML is
inherently more flexible when it comes to designing schemas.  What I'm
saying is that schemas are inherently inflexible.  This is what the
semi-structured data crowd has been shouting from the rooftops.  XML
describes its own structure.
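
To make the point above concrete, here is a minimal sketch in Python of a
store that accepts documents without any schema declared up front.  The
PathIndex class, the document ids, and the sample documents are all
hypothetical, made up for illustration; this is not how any particular
product works.  The second document introduces an <isbn> element that the
store has never seen, and nothing needs to be altered to accept it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class PathIndex:
    """Toy store (hypothetical): element path -> list of (doc_id, text).

    No schema is declared; paths are discovered as documents arrive.
    """

    def __init__(self):
        self.by_path = defaultdict(list)

    def add(self, doc_id, xml_text):
        root = ET.fromstring(xml_text)
        self._walk(root, "/" + root.tag, doc_id)

    def _walk(self, elem, path, doc_id):
        # Record the element's own text, if any, under its full path.
        text = (elem.text or "").strip()
        if text:
            self.by_path[path].append((doc_id, text))
        for child in elem:
            self._walk(child, path + "/" + child.tag, doc_id)


index = PathIndex()
index.add("d1", "<book><title>XML Basics</title></book>")
# A later document carries a new element; the index simply grows a new path.
index.add("d2", "<book><title>Data Models</title><isbn>12345</isbn></book>")

print(sorted(index.by_path))
```

With a relational table, the equivalent of that second add() would first
need a column or table to be added by hand.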

XML is clearly becoming a communication standard for B2B, e-commerce, and so
on.  If your backend system is based on an RDBMS, you will always have
to worry about mappings between two different data models.  The XML and
relational data models are distinct.  I know there are a number of efforts
out there, but there is certainly no consensus on how these two models
should be mapped.  And I highly doubt that there's a general way to do it
that makes sense for every case.
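
As a rough illustration of why no single mapping is obviously right, the
Python sketch below maps one small document to rows in two different ways.
Both functions and the sample <order> document are hypothetical, and the
two mappings are only examples; there are plenty of others.

```python
import xml.etree.ElementTree as ET

DOC = "<order><id>7</id><item>pen</item><item>ink</item></order>"


def to_edge_rows(xml_text):
    """Mapping A: one generic row per element: (node_id, parent_id, tag, text)."""
    rows = []

    def walk(elem, parent_id):
        node_id = len(rows)  # next unused id
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), None)
    return rows


def to_entity_rows(xml_text):
    """Mapping B: one 'orders' row, plus one 'items' row per repeated child."""
    root = ET.fromstring(xml_text)
    order_id = root.findtext("id")
    orders = [(order_id,)]
    items = [(order_id, item.text) for item in root.findall("item")]
    return orders, items


print(to_edge_rows(DOC))
print(to_entity_rows(DOC))
```

Mapping A accepts any document but pushes all of the structure into the
queries; mapping B is pleasant to query but has to be redesigned whenever
the documents change shape.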

You could, of course, flip this around and say that's why it makes no sense
to convert your database into an XML repository.  Unless, that is, you
believe that the advantages of semi-structured data will outweigh the
disadvantages.  And, if all of your data is already in XML, then a solution
that took advantage of that semi-structured data and did not require you to
map it to a database would be pretty compelling.

XML gives documents and data a common format.  In many cases, an information
retrieval system would be required for both data and documents (which might
contain metadata, e.g. RDF).  Having one format (XML) would allow you to
seamlessly integrate those knowledge stores into one information retrieval
system, as well as happily integrate new document types and "schemas"
without ever having to formally define them.

At my place of employment, XYZFind Corp., we've begun to leverage the
self-describing nature of XML in order to provide keyword search over
structured data across multiple schemas.  Unlike relational databases, where
users must know the structure of the data before being able to enter a
query, users of XYZFind need not know anything about the schemas or their
structure.  Yet they are still able to execute precise queries, given a
parametric form dynamically generated from the "schema"(s) matching their
original keyword entry.  This interactive search experience automatically
adapts to changes made in the structure of the underlying XML documents, as
well as to the addition of new document types, in a way that would be
impossible with an RDBMS.  XML is thus allowing us to bridge the gap between
open-ended, full-text keyword search and precise, parametric query.  See
these conference proceedings from SIGIR 2000:
http://www.xyzfind.com/sigir2000.htm
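
The general idea of going from a keyword to a parametric form can be
sketched in a few lines of Python.  To be clear, this is NOT XYZFind's
implementation; it is a toy under simplifying assumptions (a document's
"schema" is taken to be its root tag, and matching is a plain substring
test), and every name and sample document in it is hypothetical.

```python
import xml.etree.ElementTree as ET


def leaf_fields(xml_text):
    """Return (root_tag, {path: text}) for every text-bearing element."""
    root = ET.fromstring(xml_text)
    fields = {}

    def walk(elem, path):
        text = (elem.text or "").strip()
        if text:
            fields[path] = text
        for child in elem:
            walk(child, path + "/" + child.tag)

    walk(root, root.tag)
    return root.tag, fields


def parametric_form(docs, keyword):
    """Map each document type hit by the keyword to the field paths it uses."""
    parsed = [leaf_fields(d) for d in docs]
    kw = keyword.lower()
    # Assumption: the root tag stands in for the "schema" of a document.
    matching_types = {
        tag for tag, fields in parsed
        if any(kw in value.lower() for value in fields.values())
    }
    form = {}
    for tag, fields in parsed:
        if tag in matching_types:
            form.setdefault(tag, set()).update(fields)
    return {tag: sorted(paths) for tag, paths in form.items()}


DOCS = [
    "<car><make>Ford</make><year>1999</year></car>",
    "<dealer><name>Ford of Seattle</name><city>Seattle</city></dealer>",
    "<car><make>Saab</make><color>red</color></car>",
]

print(parametric_form(DOCS, "ford"))
```

The field lists are derived from whatever documents are present, so a new
element or a new document type shows up in the form without anyone
declaring it.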

If you've got XML lying around and XML coming in and out and all over the
place, which is how some view the future of the world, then it makes sense
to use this model natively for storage (not necessarily physical storage as
text documents, but logical storage as XML, with the data physically held
in, say, an index).  I suspect that as new solutions to these
problems are found and as the benefits of XML are realized, XML repository
proponents will start to sound less out-on-a-limb.

Evan Lenz
elenz@xyzfind.com
http://www.xyzfind.com
XYZFind, the search engine *designed* for XML
Download our free beta software: http://www.xyzfind.com/beta

References:
- XML database
  - From: Huaxin Zhang <hxzhang@cs.ualberta.ca>
