xml-dev - RE: Why XML Over the Relational Model?

From: RalfW@www.basicworld.com (Ralf Westphal / BasicPro)
To: "'Dan Holle'" <dan@holle.demon.co.uk>,"'Paul Butkiewicz'" <arabbit@earthlink.net>,<xml-dev@ic.ac.uk>,<xlxp-dev@fsc.fujitsu.com>,<simonstl@simonstl.com>,<bckman@ix.netcom.com>
Date: Mon, 4 Jan 1999 09:39:00 +0100
Very interesting topic!

But how about the following twist:

We keep our relational databases for speed etc. and make them accessible
thru the DOM. That way they'd appear as if they were huge XML files.

It would be a win-win solution:
-We'd retain all advantages of a dedicated API and a dedicated database
file format, e.g. indexes.
-Plus we'd gain a standard API/interface to the data.

Now a user wouldn't have to care whether the data she needs to query
resides in a true XML file or in an RDBMS. All she needs is an XQL
processor (take XQL just as an example of an XML query language) and a
list of filenames.

She would then feed each filename (and her query) to the XQL processor and
leave it to the XQL processor to figure out how to get to the data. The
XQL processor, for example, would look at the filename extension and
instantiate an appropriate XML parser or DOM object. E.g. for files
with the extension XML it would instantiate a regular XML parser (like
MSXML from Microsoft), pass it the XML filename and then execute the
query using the DOM. For MDB files (MS Access database files), on the
other hand, the XQL processor would not instantiate an XML parser
(wouldn't make much sense ;-), but instead a "DOM-component".

A DOM-component implements the XML DOM interfaces, so it looks just like
an XML parser that does the same. From the outside, one cannot tell
whether a DOM-component is backed by an XML file or by something else.

Since the DOM-component for MDB files behaves just like a true XML DOM
(no, even better, it _is_ a true XML DOM), the XQL processor can again
execute the query itself.
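
The dispatch step described above could be sketched roughly as follows.
This is only an illustration in Python: the registry DOM_FACTORIES and the
functions open_xml_file, open_mdb_file and open_as_dom are hypothetical
names, and the MDB branch is deliberately left unimplemented. Only
xml.dom.minidom from the Python standard library is a real parser here.

```python
import os
from xml.dom import minidom


def open_xml_file(path):
    """Parse a real XML file into a DOM document."""
    return minidom.parse(path)


def open_mdb_file(path):
    """Placeholder for a hypothetical MDB DOM-component (not implemented)."""
    raise NotImplementedError("an MDB DOM-component would be plugged in here")


# Hypothetical registry: filename extension -> factory returning a DOM.
DOM_FACTORIES = {
    ".xml": open_xml_file,
    ".mdb": open_mdb_file,
}


def open_as_dom(path):
    """Choose a DOM provider by looking at the filename extension."""
    ext = os.path.splitext(path)[1].lower()
    try:
        factory = DOM_FACTORIES[ext]
    except KeyError:
        raise ValueError(f"no DOM-component registered for {ext!r}") from None
    return factory(path)
```

The query processor would only ever call open_as_dom and then work against
the DOM interfaces, so adding a new file type means adding one registry
entry.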

Benefits:
-all files and data structures (e.g. file system, list of running processes)
could be made accessible thru a single API: the DOM
-a single query language could be used to query (manipulate?) any data
structure
-even transparent querying across data structures would be possible (e.g.
a query could start in a DOM for the file system but could easily continue
_into_ files pointed to). This kind of querying could be called "universal
querying".

Possible Objections:
-Why don't we just convert all files to XML? We wouldn't need
DOM-components.
Current database file formats are optimized for their purpose. XML is a good
format for read-mostly/read-only purposes (e.g. in data exchange scenarios),
but I don't see megabyte databases stored in XML. The POET ODBMS, for
example, converts XML files to its proprietary ODBMS file format before
publishing them thru a DOM.
While XML is a file format ideal for many purposes, but not for all, the XML
DOM is a truly universal interface to any data, be it the file system or
a text file. It's just another kind of API.

-But databases are not XML files, so how can we see them thru a DOM?
The existence of a DOM does not depend on an XML file. A DOM is just a
hierarchy of objects. It's easy to instantiate a hierarchy of DOM-node
objects for a hierarchy of directories and files. A DOM-node for a
directory, for example, could return "folder" as its tagName property; and
a DOM-node for a file could have an attribute node called "dateCreated".
Clearly, a DOM-component does not need an external XML DTD or XML schema.
Even the concepts of well-formedness and validity lose their value, since
there is no data to be parsed. Or the other way round: the data of a
DOM-component is always well-formed and valid (because the DOM-component
sits on top of a specific API which ensures the correctness, so to speak).
The hierarchy of node objects in a DOM-component thus does not depend on a
real DTD or real schema, but is predefined by a "virtual" schema. The
designer of a DOM-component simply maps a given data structure (e.g. a
hierarchy of directories and files) to a hierarchy of DOM-node objects with
certain tag names, attributes and values.
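
As a rough illustration of such a mapping, here is a small Python sketch
that builds DOM nodes for a directory tree with xml.dom.minidom. The
function name build_fs_dom and the "name" attribute are assumptions of this
sketch; "folder" and "dateCreated" come from the description above. Unlike
a real DOM-component, it builds the whole tree eagerly instead of creating
nodes on demand.

```python
import os
import tempfile
import time
from xml.dom import minidom


def build_fs_dom(root_dir):
    """Map a directory tree onto <folder> and <file> DOM elements."""
    doc = minidom.Document()

    def make_folder(path):
        node = doc.createElement("folder")
        node.setAttribute("name", os.path.basename(path) or path)
        for entry in sorted(os.listdir(path)):
            full = os.path.join(path, entry)
            if os.path.isdir(full):
                node.appendChild(make_folder(full))
            else:
                child = doc.createElement("file")
                child.setAttribute("name", entry)
                # getctime is the creation time on Windows, but the time of
                # the last metadata change on most Unix systems.
                created = time.localtime(os.path.getctime(full))
                child.setAttribute("dateCreated",
                                   time.strftime("%Y-%m-%d", created))
                node.appendChild(child)
        return node

    doc.appendChild(make_folder(root_dir))
    return doc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo:
        open(os.path.join(demo, "readme.txt"), "w").close()
        print(build_fs_dom(demo).documentElement.toprettyxml())
```

No XML file, DTD or schema is involved at any point; the "virtual" schema
lives entirely in the mapping code.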

-The same XQL query wouldn't work on different data structures?
True. But the same holds for queries on RDBMSs once they have different
table structures. If you want to query a couple of address databases you'd
probably need to formulate several SQL queries, since ZIP information in
one database is stored in a column called "ZIP" and in another it's called
"PostalCode".
On the other hand, universal querying would still work for files of the
same type, e.g. all MDB files. If you assume an MDB file is represented by
a hierarchy of nodes with tag names tables/table name=.../row/col name=...
you could ask questions like "return all rows in all tables containing a
column named either 'ZIP' or 'PostalCode' which contains the value
'20099'". Feed that to an XQL processor along with a couple of MDB
filenames, and you'd need only this one query to retrieve all addresses
with ZIP '20099'.
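
To make the tables/table/row/col idea concrete, here is a hedged Python
sketch. It uses in-memory SQLite databases as a stand-in for MDB files and
plain tree traversal as a stand-in for an XQL processor (neither MDB access
nor XQL is in the Python standard library). The function names
database_to_tree, rows_with_zip and make_demo_db, and the demo table names,
are all made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def database_to_tree(conn):
    """Expose every table of a SQLite connection as tables/table/row/col."""
    tables = ET.Element("tables")
    names = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    for name in names:
        table = ET.SubElement(tables, "table", name=name)
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        columns = [d[0] for d in cursor.description]
        for values in cursor:
            row = ET.SubElement(table, "row")
            for col_name, value in zip(columns, values):
                col = ET.SubElement(row, "col", name=col_name)
                col.text = "" if value is None else str(value)
    return tables


def rows_with_zip(tree, zip_code, column_names=("ZIP", "PostalCode")):
    """Return all rows, in any table, with a matching ZIP-like column."""
    hits = []
    for row in tree.iter("row"):
        for col in row.findall("col"):
            if col.get("name") in column_names and col.text == zip_code:
                hits.append(row)
                break
    return hits


def make_demo_db(table, zip_column, rows):
    """Build a tiny in-memory address database for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f'CREATE TABLE "{table}" (Name TEXT, "{zip_column}" TEXT)')
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)
    return conn
```

The same rows_with_zip call works on both demo databases even though one
calls its column "ZIP" and the other "PostalCode", which is the point of
the single universal query.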

-Universal querying could lead to bad performance!
True. If you started a query on a file system DOM-component and let it
recurse into files, performance could easily go down. But then: why not let
the user decide? Maybe it's more convenient for her to wait than not to be
able to get at the requested information at all. At least she has the
possibility to issue a very general query and not care about file types and
file boundaries. DOM-components and XQL processors are a very powerful
concept.
Also, XQL processors and DOM-components could be made more "intelligent".
For example, before querying a DOM the XQL processor could ask, "do you
contain a hierarchy of row/col nodes at all?" If not, the XQL processor
could immediately skip the whole data structure and continue with the next
one.
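
One way to picture that "do you contain row/col nodes at all?" question is
a DOM-component that advertises its virtual schema, so the processor can
decide without touching any data. The class DomComponent, its may_contain
method and the helper components_worth_querying are hypothetical names in
this Python sketch, not part of any DOM or XQL specification.

```python
class DomComponent:
    """Hypothetical DOM-component that advertises its virtual schema."""

    def __init__(self, name, schema_paths):
        self.name = name
        # Each entry is a slash-separated chain of tag names that the
        # component can produce, e.g. "tables/table/row/col".
        self.schema_paths = frozenset(schema_paths)

    def may_contain(self, path):
        """Answer from the schema alone, without reading the data."""
        return any(p == path or p.endswith("/" + path)
                   for p in self.schema_paths)


def components_worth_querying(components, required_path):
    """Skip every component whose schema cannot match the query."""
    return [c for c in components if c.may_contain(required_path)]
```

With a file system component declaring only folder/file paths and an MDB
component declaring tables/table/row/col, a query that needs row/col would
be run against the MDB component only.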

-But XQL doesn't allow things like spanning of files etc.
True, XQL can't do that - yet. But why not think of extending XQL?
Today
  folder/file/@name
results in a list of attribute nodes.
Today
  folder/file/@name/tables/table
does not work.
But tomorrow
  folder/file/@name/tables/table
could mean, "use the values of the name-attributes as file names, for each
file name instantiate the appropriate DOM-component, and continue the query
with the rest of the query string into the DOM-component".
Other ways of marking where a query spans XML structures/DOMs could be
thought of.
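
The proposed "tomorrow" semantics could be prototyped along these lines.
This Python sketch is not XQL: it handles only plain child steps and a
single @attribute step, it returns attribute values rather than attribute
nodes, and the function spanning_query and its open_dom callback are
hypothetical. The first path step is matched against the root element
itself.

```python
import xml.etree.ElementTree as ET


def spanning_query(root, path, open_dom):
    """Evaluate a simple path; after an @attribute step, jump into files.

    open_dom is a caller-supplied function that maps a file name to the
    root element of that file's DOM (it stands in for a DOM-component).
    """
    first, *steps = path.split("/")
    nodes = [root] if root.tag == first else []
    for i, step in enumerate(steps):
        if step.startswith("@"):
            attr = step[1:]
            values = [n.get(attr) for n in nodes if n.get(attr) is not None]
            rest = steps[i + 1:]
            if not rest:
                return values  # today's case: the query ends at the attribute
            results = []
            for file_name in values:
                # Tomorrow's case: treat each value as a file name and
                # continue the rest of the query inside that file's DOM.
                results.extend(
                    spanning_query(open_dom(file_name), "/".join(rest),
                                   open_dom))
            return results
        nodes = [child for n in nodes for child in n.findall(step)]
    return nodes
```

The open_dom callback is where a dispatcher that picks a DOM-component per
file type would be plugged in.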

Hope I didn't bore you guys too much ;-) But wouldn't a world with universal
querying capabilities be great?

Cheers,

Ralf


-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
Dan Holle
Sent: Sunday, January 3, 1999 20:25
To: Paul Butkiewicz; xml-dev@ic.ac.uk; xlxp-dev@fsc.fujitsu.com
Subject: Re: Why XML Over the Relational Model?


>>The primary answer I give this question is flexibility, though there is a
>>significant cost in efficiency.  XML documents can easily hold structures
>>that make relational databases choke. . . .
>
>I would love to see an example of this.
>


Me too, Paul.

Let's not think of XML as a representation for a complex multi-table
multi-user shared database.  DOM, as a database, is like a RAM-resident
single user IMS/DB.  (If you must barf, don't barf on your keyboard.)  There
is a reason why we fled from hierarchical linked databases to relational.

Yes, there are databases you can do in XML that you can't do in relational.
Just as  there are things you can do in assembler language you can't do in
Java.  But if you are trying to do something useful with large, complex
data, stick with relational.

XML seems a good match for small but flexible structures on web-connected
clients.  David's comment that XML is for information exchange and
relational is for storage/query rings true...


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

