I had a somewhat different reaction than most people to the logical
models thread (and Date's keynote speech
http://searchdatabase.techtarget.com/originalContent/0,289142,sid13_gci962948,00.html
that inspired it):
One major selling point of the relational model is to separate the
logical model of data from the physical implementation of a DBMS. Date
alludes to this in his rant against the frequent OODBMS / XMLDBMS
analogy about disassembling a car when you come home at night and
reassembling it in the morning. 'Anyone who uses that analogy, Date
said, displays a "lack of understanding of the difference between the
logical and physical model." The use of the terms "flat tables" or "2D
tables" to describe data stored in a relational database is wrong, he
added.'
As I understand it, it is the job of the RDBMS implementation to
perform whatever mapping from the logical to the physical world is
needed to store and retrieve the data efficiently. [1] Perhaps one
advantage of XML is that it
just blows off this distinction -- it gets a lot of its practical power
by 'modeling' relationships as *physical* containment of a set of
elements (which of course may be subtrees) inside other elements. As a
logical model, this suffers from all the limitations that Codd exposed
in the 1970's, but as a pragmatic way of handling text and data that
tends to be ordered and hierarchical, it has a lot going for it:
- It is generally going to be easier to implement efficiently in
read-only or dataflow/pipeline processing applications where
referential integrity is not an issue because the XML document itself
defines the relevant context.
- It scales / parallelizes well if all the information needed to
perform some business process is carried around in a discrete chunk
that only requires access to a transactional DBMS at the beginning and
end of a business process. (Likewise, it relatively easily supports
optimistic or compensation-based transaction processing).
- It maps fairly directly to business-level documents (e.g. orders,
invoices, etc., which can be directly represented as XML documents but
generally normalize to a significant number of tables), thus
facilitating communication between the developers and users of
software. (In the best case, the "business" view of the document *is*
the XML with a stylesheet applied).
- It greatly reduces the need for DBAs, part of whose job is to
maintain the logical-physical mapping.
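To make the "containment vs. normalization" point concrete, here is a
minimal sketch in Python. The order document, the attribute names, and
the two target "tables" (an order row and a list of item rows) are all
made up for illustration; nothing here comes from a real schema. It
shows that what the XML says implicitly by nesting and position has to
be spelled out as a foreign key and an explicit line-number column once
the data is flattened.

```python
import xml.etree.ElementTree as ET

# Hypothetical business document: items are related to the order
# simply by being physically contained in it, in document order.
ORDER_XML = """<order id="1001" customer="ACME">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="1"/>
  <item sku="A-1" qty="5"/>
</order>"""

def shred(order_xml):
    """Flatten one order document into rows for two hypothetical tables."""
    root = ET.fromstring(order_xml)
    order_id = root.get("id")
    order_row = {"order_id": order_id, "customer": root.get("customer")}
    # Containment becomes a foreign key (order_id); document order
    # becomes an explicit column (line_no).
    item_rows = [
        {
            "order_id": order_id,
            "line_no": n,
            "sku": item.get("sku"),
            "qty": int(item.get("qty")),
        }
        for n, item in enumerate(root.findall("item"), start=1)
    ]
    return order_row, item_rows

order_row, item_rows = shred(ORDER_XML)
```

Going the other way (reassembling the document from the rows) needs a
join on order_id and a sort on line_no, which is roughly the work the
car-disassembly analogy is complaining about, whether or not the
complaint is fair at the logical level.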
In my not-so-objective opinion, XML's success mirrors the ongoing
success of post-relational DBMS such as Adabas that adapt the storage
model to the physical data structure of the application rather than
asking the application to adapt to the logical model of the DBMS. [Yes,
I assert that Adabas, invented in 1969, is a POST-relational DBMS --
very visionary!] Clearly there are downsides to this (as the
relational proponents have pointed out for decades), but there are also
distinct advantages in terms of performance, robustness, etc. in a lot
of situations where the relational model's intrinsic advantages are not
relevant.
So, my question is whether this characterization of XML as an
essentially physical model rather than a logical one makes sense. Of
course, the Infoset and XQuery treat XML as a logical model that is
independent of the "physical" serialization or DBMS implementation, so
what I'm talking about is more of a design pattern for using XML than
an intrinsic property of XML itself, and is independent of whether one
thinks of XML as a labeled tree data structure or Unicode text with
angle brackets.
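As a rough illustration of the "logical model independent of the
serialization" view, the sketch below parses differently written texts
and compares them as labeled, ordered trees. The same_tree helper and
the sample documents are my own inventions for this example, and the
comparison is only a crude approximation of the Infoset (it ignores
namespaces, comments, mixed content, and so on).

```python
import xml.etree.ElementTree as ET

def same_tree(a, b):
    """Compare two elements as labeled, ordered trees (hypothetical helper)."""
    # Attributes are compared as unordered name/value pairs.
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # Ignore whitespace-only differences in leading text; tails are ignored.
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    # Children are compared pairwise, so their order is significant.
    return all(same_tree(x, y) for x, y in zip(a, b))

TEXT_1 = '<order id="1001" customer="ACME"><item sku="A-1"/><item sku="B-7"/></order>'
# Same information: different quoting, attribute order, whitespace,
# and empty-element syntax.
TEXT_2 = """<order customer='ACME'
       id='1001'>
  <item sku='A-1'></item>
  <item sku='B-7'></item>
</order>"""
# Different information: the two items are swapped.
TEXT_3 = '<order id="1001" customer="ACME"><item sku="B-7"/><item sku="A-1"/></order>'

serialization_irrelevant = same_tree(ET.fromstring(TEXT_1), ET.fromstring(TEXT_2))
order_significant = not same_tree(ET.fromstring(TEXT_1), ET.fromstring(TEXT_3))
```

Both flags come out true here: the angle-bracket details wash out at
the tree level, but element order does not, which is the part the
relational model has no native place for.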
At any rate, the relational model doesn't need to be defended from XML;
it lives in a different plane of reality. Date and others really seem
to be making a *political* objection: people have stopped putting
pressure on the DBMS vendors to support the pure relational model in a
way that is efficient, reliable, and easy to use for messy real-world
data. Customers and RDBMS vendors have started using XML to address
the set of problems that it handles relatively easily but that are
still at the bleeding edge of relational technology, e.g. where order and
hierarchy are critical and relationships are easily modeled via
containment.
[1] Ken North mentioned D.L. Childs's STDS work that influenced the
relational model. Childs is a neighbor of mine, and has a rather
interesting metaphor for this: The logical model is up in the world of
sunshine and light where the Eloi dwell with little concern for ugly
realities; the physical model is the one that lives down in the land of
the Morlocks who do the dirty work. It's nice to be oblivious to
physical reality, but this sometimes leads to a really unpleasant
realization when you find out where you really sit in the food chain
:-)
BTW I have a bunch of Childs's recent stuff archived at
http://xsp.xegesis.org if anyone is interested in seeing where the STDS
thinking has gone in the 35 or so years since Codd cited it; I am
intrigued because he proposes a way to formally unite the relational
model and XML's implicit data model using an extended set theory that
makes order and hierarchy first class citizens. In STDS (which I used
via an early RDBMS called Micro at the University of Michigan back when
dinosaurs roamed the earth), there is a formal relationship between the
logical and physical models, which allows query optimization to be
driven down almost to the hardware level.