OASIS Mailing List Archives



RE: XML versus Relational Database ; what is what


I think we need to be somewhat more precise in our interpretation, and thus
our definition, of RDBMS / XML concepts. [ Captain Shannon to shipmate :
crank up the precision! ] We also need to link to some database history.

When talking about relational databases I think we need to distinguish
between :
1. The relational data model
2. Implementations of the relational data model in RDBMSs
3. SQL (which is sometimes used synonymously with 2).

On the XML side, XML is sometimes seen as :
1. A data format
2. A logical data model

Let's start with XML. I think XML is neither a data format nor a logical
model per se. For me XML is 'data on the move'. XML is to data what M&Ms are
to peanuts. It is a data wrapper, a coating layer, allowing data, both
dynamic and static (both structured and semi-structured), to cross system
borders. To enter data worlds that were previously alien. To boldly go where
no data element has gone before.

When talking about persistent storage (databases) we're obviously talking
about static data.
In traditional, structured database approaches, types are always fixed prior
to populating the database. Once the database is populated, its binary
storage cannot be interpreted without knowing the schema.
Semi-structured data is often described as 'schemaless' or
'self-describing', terms that indicate that there is no separate description
of the type or structure of the data.
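
To make the contrast concrete, here is a minimal Python sketch. The record
layout, the element names and the values are all made up for illustration;
they are assumptions of mine, not taken from any real system. It packs one
record into bytes that only make sense if you know the layout, and then
expresses the same record as XML, where the names travel with the values.

```python
import struct
import xml.etree.ElementTree as ET

# Structured approach: the layout (the "schema") is fixed up front.
# Hypothetical layout: little-endian, 4-byte int id, 8-byte double price.
RECORD_LAYOUT = "<id"

packed = struct.pack(RECORD_LAYOUT, 42, 19.99)

# With the layout, the 12 bytes decode back to the original values.
record_id, price = struct.unpack(RECORD_LAYOUT, packed)

# With a wrong layout the same bytes still decode without error,
# but they no longer mean what was stored.
misread = struct.unpack("<dI", packed)

# Semi-structured approach: the element names are part of the data,
# so a reader can discover the fields without a separate schema.
doc = "<item><id>42</id><price>19.99</price></item>"
root = ET.fromstring(doc)
fields = {child.tag: child.text for child in root}
```

Note that the XML version still leaves the types open: both values come
back as text, and it is up to the reader to decide that 'price' is a number.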

Now let's briefly discuss some typical database components :
1. Physical data level. This level is concerned with how the data is
physically stored and which indexes exist.
2. Logical data level. You find the (logical) database schema here. This
level also dictates, for instance, which queries are valid.
3. The external level. This provides the interfaces that applications or
users (for instance via a standard query language) have to the data.
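
The three levels above can be sketched with Python's built-in sqlite3
module. The table, the index name and the rows are hypothetical, chosen only
for illustration. The point of the sketch is that a change at the physical
level (adding an index) leaves the logical schema and the external query
untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the schema says which tables and columns exist,
# and therefore which queries are valid.
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Ann", "Delft"), ("Bob", "Utrecht"), ("Cor", "Delft")],
)

# External level: what an application sees, here a plain SQL query.
QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"
before = conn.execute(QUERY, ("Delft",)).fetchall()

# Physical level: add an index. Neither the schema seen by the
# application nor the query text has to change.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
after = conn.execute(QUERY, ("Delft",)).fetchall()

same_answer = before == after
conn.close()
```

The index may change how the answer is found, but not what the answer is.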

We owe this distinction to the relational way of thinking. Remember that
in pre-relational databases 'application' and 'storage' were intertwined. The
principal idea of the relational model was to separate the application layer
from the storage layer. This is good, since it potentially allows you to
swap the data-storage component if something better comes along (for
instance faster, more efficient storage).
This is also where real-world implementations of RDBMSs and relational query
languages have been disappointing. (See for instance Date and Darwen's Third
Manifesto.)

When talking about XML databases I personally think the physical storage
area is the least interesting dimension. If better and faster ways to store
data allow me to conduct my business better, fine. If it really helps me
to differentiate my business, I'll buy it.

Personally I find these to be the more interesting dimensions with respect
to XML databases :
1. The database schema will become transportable. This is very cool and
could have a big impact in several areas.
2. The previously distant worlds of structured and semi-structured data
will come together (see the Alan Kay quote in one of my other replies).

Item 2 will also have an impact at the logical level. DOM reminds me of the
'data access via paths' of the pre-relational database world. SQL approaches
to XML feel to me like trying to look at the forest through rows and columns.
I think we need better ways. I think we are looking at a database paradigm


Disclaimer : my views are my own, not my employer's, unless proven