On Aug 25, 2004, at 1:11 PM, Linda Grimaldi wrote:
> At least with the implementations I am familiar with, native XML dbs
> do their own version of shredding. Randomly accessing XML document
> contents efficiently requires this. Now, their implementations are
> optimized for XML, an important difference from relational approaches,
> but their storage image is far from document form. So the term
> "native XML" is a bit misleading. It ain't native when it is stored.
> It just looks native when it is accessed. Relational DBs can do that,
> too, but not without a lot of configuration and some inherent
I agree that "native XML," to the extent that it means anything at all,
refers to using "native" XML idioms to define schema, add and query
data, etc. For what it's worth, Software AG Tamino does NOT shred
internally, but stores XML instances as compressed, indexed text
documents. This is a tradeoff -- shredding optimizes for updating
individual elements/attributes within an instance; storing intact
optimizes for simple retrieval or replacement of the whole document,
especially when each instance is relatively small.
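
To make the tradeoff concrete, here is a minimal sketch in Python using the standard sqlite3 and xml.etree.ElementTree modules. The table layout, the column names, and the sample document are all invented for illustration; this is not how Tamino or any relational product actually stores XML, and it ignores attributes and mixed content.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample instance, invented for this sketch.
DOC = "<order id='42'><customer>Acme</customer><item>bolt</item><item>nut</item></order>"


def shred(conn, doc_id, xml_text):
    """Store one row per element: (doc, id, parent, pos, tag, text).

    Attributes and tail text are ignored to keep the sketch short.
    """
    root = ET.fromstring(xml_text)
    counter = 0

    def walk(elem, parent_id, position):
        nonlocal counter
        counter += 1
        node_id = counter
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?, ?)",
            (doc_id, node_id, parent_id, position, elem.tag, elem.text),
        )
        for pos, child in enumerate(elem):
            walk(child, node_id, pos)

    walk(root, None, 0)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (doc INTEGER, id INTEGER, parent INTEGER, "
    "pos INTEGER, tag TEXT, text TEXT)"
)
conn.execute("CREATE TABLE intact (doc INTEGER PRIMARY KEY, body TEXT)")

shred(conn, 1, DOC)
conn.execute("INSERT INTO intact VALUES (1, ?)", (DOC,))

# Shredded storage: updating one element touches exactly one row.
conn.execute("UPDATE node SET text = 'Acme Ltd' WHERE doc = 1 AND tag = 'customer'")

# Intact storage: fetching the whole document is a single lookup ...
body = conn.execute("SELECT body FROM intact WHERE doc = 1").fetchone()[0]

# ... but the same element update means parse, modify, reserialize, rewrite.
tree = ET.fromstring(body)
tree.find("customer").text = "Acme Ltd"
conn.execute(
    "UPDATE intact SET body = ? WHERE doc = 1",
    (ET.tostring(tree, encoding="unicode"),),
)
```

The point of the sketch is only the shape of the work: the shredded form pays at load and reassembly time, the intact form pays whenever a small piece changes.
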
> Can an underlying relational mechanism also efficiently perform this
> storage image transformation in a highly automated way (without
> requiring extensive table definitions), and in a schema-independent
> way? Oracle is presumably trying with their update to 10g.
As I understand it, Oracle (and the other O-X-RDBMS vendors) offer both
a shredding approach (when it is optimal) and an XML-aware CLOB
approach for when that is appropriate.
> Haven't looked at it- not even sure it is out yet- so I don't know
> whether they have succeeded or not. But relational models have shown
> themselves to be really flexible as compared to their hierarchical
> counterparts in the past.
Yes. XML DB's are optimized for the set of cases where the hierarchical
relationships are the most important ones. XQuery extends this with
its ability to do joins; we shall see if this works well in practice.
The relational model is completely general and can handle anything, but
can be quite unwieldy and inefficient in practice when order and
hierarchy [which are difficult to model in set theory!] hit a critical
mass. My favorite example would be an industrial-strength technical
manual. Codd proved that you CAN normalize all that ordered,
hierarchically structured, textual and data-oriented information and
pull it back together with the relational calculus, but I've never
heard of anyone actually pulling that feat off with real data and real
DBMS software. [Sure, just a simple 100-way join, no problem :-) ]
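
A toy version of the problem, sketched in Python with sqlite3, shows where the joins come from. Everything here is an assumption made for illustration: the single "node" table, its parent/pos columns, and the three-level manual are invented, and a real technical manual would be far deeper and messier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: hierarchy is carried by "parent", order by "pos".
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
    "pos INTEGER, tag TEXT, text TEXT)"
)
rows = [
    (1, None, 0, "manual", None),
    (2, 1, 0, "chapter", "Safety"),
    (3, 1, 1, "chapter", "Operation"),
    (4, 2, 0, "section", "Warnings"),
    (5, 3, 0, "section", "Startup"),
    (6, 3, 1, "section", "Shutdown"),
    (7, 4, 0, "para", "Wear goggles."),
    (8, 5, 0, "para", "Open valve A."),
    (9, 5, 1, "para", "Press start."),
    (10, 6, 0, "para", "Press stop."),
]
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?)", rows)

# One self-join per level of nesting, and one ORDER BY key per level,
# because neither hierarchy nor document order is implicit in a set of rows.
query = """
SELECT c.text, s.text, p.text
FROM node AS c
JOIN node AS s ON s.parent = c.id
JOIN node AS p ON p.parent = s.id
WHERE c.tag = 'chapter'
ORDER BY c.pos, s.pos, p.pos
"""
outline = conn.execute(query).fetchall()
for chapter, section, para in outline:
    print(f"{chapter} / {section} / {para}")
```

Three levels cost two joins and three sort keys. Add tables, lists, footnotes, cross-references and inline markup, and the join count climbs accordingly.
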
> And, of course, they don't have to be as good at it as a native XML
> DB. They just have to be good enough. The same "impedance mismatch"
> cry that the XML dbs are issuing now reminds the entire community an
> awful lot of the object-oriented dbs of the early 90s.
Yup. So why are people still gnashing their teeth about the
object-relational impedance mismatch rather than using OODBMS? I'd
argue that the failure of the OODBMS market has more to do with the
lack of standardization than the intrinsic superiority of the
relational approach. After all, the response by the RDBMS -- now
ORDBMS! -- vendors was to find the 80/20 point in the OODB approach
that fit easily with their technology and that SQL could be extended to
accommodate. They are doing the same thing with XML (OK, some are
adding XQuery rather than extending SQL to handle XML), and will
probably get the "80%" of the market that can be satisfied with a
"good enough" solution. That leaves the XML DB's which don't have the
installation/operational/complexity overhead of all that non-XML stuff
with a relatively small *percentage* of the market: after all, the data
that doesn't fit neatly in XML hierarchies is not really suitable for
an XML DB, and the potential customers who already have a glass room
full of Oracle or DB2 boxes and DBAs won't care about the overhead. On
the other hand, the sheer volume of XML being produced and the relative
standardization of XML interfaces may leave a pretty decent *quantity*
of opportunities for the XML-optimized database products.
So that, in my humble but biased view, is where XML databases are
needed -- when you have a lot of data to manage that is already in XML
format (especially when there are a variety of schemas that evolve
unpredictably), when reliable/scalable/queryable storage is needed at
the periphery of an organization or network where it is impractical to
deploy and manage full-fledged O-X-RDBMS products, and where leveraging
XML standards/expertise has more business benefit than leveraging a SQL