
   XML: logical and/or physical model?


I had a somewhat different reaction than most people to the logical
models thread (and Date's keynote speech
http://searchdatabase.techtarget.com/originalContent/0,289142,sid13_gci962948,00.html
that inspired it):

One major selling point of the relational model is to separate the  
logical model of data from the physical implementation of a DBMS.  Date  
alludes to this in his rant against the frequent OODBMS / XMLDBMS  
analogy about disassembling a car when you come home at night and  
reassembling it in the morning. 'Anyone who uses that analogy, Date  
said, displays a "lack of understanding of the difference between the  
logical and physical model." The use of the terms "flat tables" or "2D  
tables" to describe data stored in a relational database is wrong, he  
added.'

As I understand it, it is the job of the RDBMS implementation to  
perform whatever mapping from the logical to physical world is needed  
to do this efficiently. [1]  Perhaps one advantage of XML is that it
just blows off this distinction -- it gets a lot of its practical power  
by 'modeling' relationships as *physical* containment of a set of  
elements (which of course may be subtrees) inside other elements. As a  
logical model, this suffers from all the limitations that Codd exposed  
in the 1970s, but as a pragmatic way of handling text and data that
tends to be ordered and hierarchical, it has a lot going for it:
- It is generally going to be easier to implement efficiently in  
read-only or dataflow/pipeline processing applications where  
referential integrity is not an issue because the XML document itself  
defines the relevant context.
- It scales / parallelizes well if all the information needed to  
perform some business process is carried around in a discrete chunk  
that only requires access to a transactional DBMS at the beginning and  
end of a business process. (Likewise, it relatively easily supports  
optimistic or compensation-based transaction processing).
- It maps fairly directly to business-level documents (e.g. orders,  
invoices, etc. that can be directly represented as XML documents but  
generally normalize to a significant number of tables), thus  
facilitating communication between the developers and users of  
software.  (In the best case, the "business" view of the document *is*  
the XML with a stylesheet applied).
- It greatly reduces the need for DBAs, part of whose job is to  
maintain the logical-physical mapping.
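
To make the third point concrete, here is a minimal Python sketch, not anything from Date's talk or the thread. The order document, the table names, and the `position` column are all made up for illustration. It reads the line items of a small order once directly from XML containment, and once after normalizing the same order into two tables, where the order of the items has to be carried in an explicit column.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical business document: items are related to the order
# purely by being contained in it, in document order.
ORDER_XML = """
<order id="1001">
  <customer>Acme Corp</customer>
  <items>
    <item sku="A-1" qty="2"/>
    <item sku="B-7" qty="1"/>
  </items>
</order>
"""


def items_from_xml(xml_text):
    """Read line items straight from containment; document order is kept."""
    order = ET.fromstring(xml_text.strip())
    return [(item.get("sku"), int(item.get("qty")))
            for item in order.find("items")]


def items_from_tables(xml_text):
    """Normalize the same order into two tables, then query the items back."""
    order = ET.fromstring(xml_text.strip())
    db = sqlite3.connect(":memory:")
    # Assumed schema: the relationship becomes a foreign key, and the
    # item order becomes an explicit 'position' column.
    db.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
        CREATE TABLE order_items (
            order_id INTEGER REFERENCES orders(id),
            position INTEGER,
            sku TEXT,
            qty INTEGER
        );
    """)
    order_id = int(order.get("id"))
    db.execute("INSERT INTO orders VALUES (?, ?)",
               (order_id, order.findtext("customer")))
    for position, item in enumerate(order.find("items")):
        db.execute("INSERT INTO order_items VALUES (?, ?, ?, ?)",
                   (order_id, position, item.get("sku"), int(item.get("qty"))))
    rows = db.execute(
        "SELECT sku, qty FROM order_items WHERE order_id = ? ORDER BY position",
        (order_id,),
    ).fetchall()
    db.close()
    return rows


if __name__ == "__main__":
    print(items_from_xml(ORDER_XML))
    print(items_from_tables(ORDER_XML))
```

Both functions return the same items in the same order. The difference is only in where the relationship and the ordering live: in the shape of the document itself, or in keys and columns that someone has to design and maintain.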

In my not-so-objective opinion, XML's success mirrors the ongoing
success of post-relational DBMSs such as Adabas that adapt the storage
model to the physical data structure of the application rather than
asking the application to adapt to the logical model of the DBMS. [Yes,
I assert that Adabas, invented in 1969, is a POST-relational DBMS --
very visionary!]  Clearly there are downsides to this (as the
relational proponents have pointed out for decades), but there are also  
distinct advantages in terms of performance, robustness, etc. in a lot  
of situations where the relational model's intrinsic advantages are not  
relevant.

So, my question is whether this characterization of XML as an
essentially physical model rather than a logical one makes sense. Of
course, the Infoset and XQuery treat XML as a logical model that is  
independent of the "physical" serialization or DBMS implementation, so  
what I'm talking about is more of a design pattern for using XML than  
an intrinsic property of XML itself, and is independent of whether one  
thinks of XML as a labeled tree data structure or Unicode text with  
angle brackets.

At any rate, the relational model doesn't need to be defended from XML;
it lives in a different plane of reality.  Date and others really seem
to be making a *political* objection:  people have stopped putting  
pressure on the DBMS vendors to support the pure relational model in a  
way that is efficient, reliable, and easy to use for messy real-world  
data.  Customers and RDBMS vendors have started using XML to address
the set of problems that it handles relatively easily but that are still
at the bleeding edge of relational technology, e.g. where order and
hierarchy are critical and relationships are easily modeled via  
containment.


[1] Ken North mentioned D.L. Childs's STDS work that influenced the
relational model.  Childs is a neighbor of mine, and has a rather  
interesting metaphor for this:  The logical model is up in the world of  
sunshine and light where the Eloi dwell with little concern for ugly  
realities; the physical model is the one that lives down in the land of  
the Morlocks who do the dirty work.  It's nice to be oblivious to  
physical reality, but this sometimes leads to a really unpleasant  
realization when you find out where you really sit in the food chain  
:-)
BTW I have a bunch of Childs's recent stuff archived at
http://xsp.xegesis.org if anyone is interested in seeing where the STDS
thinking has gone in the 35 or so years since Codd cited it; I am
intrigued because he proposes a way to formally unite the relational  
model and XML's implicit data model using an extended set theory that  
makes order and hierarchy first class citizens.  In STDS (which I used  
via an early RDBMS called Micro at the University of Michigan back when  
dinosaurs roamed the earth), there is a formal relationship between the  
logical and physical models, which allows query optimization to be  
driven down almost to the hardware level.





 
