- To: xml-dev@lists.xml.org
- Subject: The relational model is to data as automata theory is to programs?
- From: Mike Champion <mc@xegesis.org>
- Date: Wed, 27 Aug 2003 08:47:08 -0700 (PDT)
The "XML is kinda OK as a data interchange format but
inappropriate as a data model for large-scale
applications" megathread has inspired the following
brain fart. How stinky is it? :-)
The relational model is to data as automata theory is
to programs. That is, there is a well-understood
formal theory showing that all data can be represented
and manipulated using the operands and operators of
the relational model, just as there is a well-defined
formal theory showing that all effective procedures
can be modelled as Turing machines or lambda calculus
programs, or whatever. That's all well and good, and a
staple of rigorous computer science education. What's
different between the data and process worlds is that
the process people seem to have a good grasp on the
engineering realities: There is certainly a large
domain in which formal automata are the "correct" way
to approach real problems (protocol definitions and
implementations come to mind, and parsers for certain
well-defined grammars), but there is a realization
that this is impractical as a universal software
engineering methodology. There are, however, a number
of people who seem to be asserting that any database
application or DBMS that doesn't fully and
exclusively support the relational model is somehow
broken -- in violation of well-understood "scientific"
truths.
(I'm somehow reminded of the Skeptic's Dictionary
definition of "scientism", which I think of as a
quasi-religious advocacy of the trappings of science
without an appreciation for its messy realities:
"Scientism, in the strong sense, is the
self-annihilating view that only scientific claims are
meaningful, which is not a scientific claim and hence,
if true, not meaningful. Thus, scientism is either
false or meaningless.")
Forcing oneself to use a pure relational model to
design a terabyte-scale semi-structured document
database, with associated metadata and a schema that
is sure to evolve, strikes me as like trying to design
a bridge using the mathematics of quantum physics;
it's no slur on Heisenberg's accomplishments to decline to
even think about it. Likewise, it's no slur on Turing
to build a "tag soup" web browser without explicit
reference to formal automata, and it's no slur on Codd
if one concludes that there is no direct, practical
way to apply his model to applications with complex,
evolving, partially-understood data requirements.
XML, on the other hand, seems to have a set of
practically useful tools to use in those situations.
When to use one approach and/or the other is something
we are learning about, not something we will prove by
deduction.
So, to echo what others have said, one could have a
sensible discussion on whether a specific database
application might be best approached via a pure
relational model, a native XML model, or some hybrid,
but arguing on the basis of pure theory or alleged
universal principles seems totally pointless.