xml-dev - Re: [xml-dev] are native XML databases needed?


To: XML Developers List <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] are native XML databases needed?
From: Michael Champion <mc@xegesis.org>
Date: Wed, 25 Aug 2004 15:15:19 -0400
In-reply-to: <27651359.1093453900629.JavaMail.root@bigbird.psp.pas.earthlink.net>
References: <27651359.1093453900629.JavaMail.root@bigbird.psp.pas.earthlink.net>

On Aug 25, 2004, at 1:11 PM, Linda Grimaldi wrote:

>
> At least with the implementations I am familiar with, native XML dbs 
> do their own version of shredding.  Randomly accessing XML document 
> contents efficiently requires this.  Now, their implementations are 
> optimized for XML, an important difference from relational approaches, 
> but their storage image is far from document form.  So the term 
> "native XML" is a bit misleading.  It ain't native when it is stored.  
> It just looks native when it is accessed. Relational DBs can do that, 
> too, but not without a lot of configuration and some inherent 
> inefficiencies.

I agree that "native XML," to the extent that it means anything at all, 
refers to using "native" XML idioms to define schema, add and query 
data, etc.  For what it's worth, Software AG Tamino does NOT shred 
internally, but stores XML instances as compressed, indexed text 
documents.  This is a tradeoff -- shredding optimizes for updating 
individual elements/attributes within an instance, storing intact 
optimizes for simple retrieval or replacement of the whole document, 
especially when each instance is relatively small.
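To make the tradeoff concrete, here is a minimal Python sketch of the 
two storage styles.  It is only an illustration, not how Tamino or any 
other product is actually implemented: the sample document, the row 
layout and all function names are made up for the example.

```python
import zlib
import xml.etree.ElementTree as ET

# Hypothetical sample instance, invented for this illustration.
DOC = ("<manual><title>Pump</title>"
       "<step n='1'>Open valve</step><step n='2'>Start motor</step></manual>")

def store_intact(xml_text):
    """Keep the whole instance as a single compressed blob."""
    return zlib.compress(xml_text.encode("utf-8"))

def load_intact(blob):
    """Whole-document retrieval: one decompress, no reassembly."""
    return zlib.decompress(blob).decode("utf-8")

def shred(xml_text):
    """Break the instance into one row per element:
    (node_id, parent_id, position, tag, attributes, text)."""
    rows = []

    def walk(elem, parent_id, position):
        node_id = len(rows)
        rows.append((node_id, parent_id, position, elem.tag,
                     dict(elem.attrib), (elem.text or "").strip()))
        for i, child in enumerate(elem):
            walk(child, node_id, i)

    walk(ET.fromstring(xml_text), None, 0)
    return rows

def update_shredded(rows, node_id, new_text):
    """Element-level update: touch exactly one row."""
    rows[node_id] = rows[node_id][:5] + (new_text,)

def update_intact(blob, path, new_text):
    """Element-level update on the intact form: decompress, parse,
    modify, serialize and recompress the entire document."""
    root = ET.fromstring(load_intact(blob))
    root.find(path).text = new_text
    return store_intact(ET.tostring(root, encoding="unicode"))

blob = store_intact(DOC)
rows = shred(DOC)
update_shredded(rows, 3, "Start pump")
new_blob = update_intact(blob, "step[@n='2']", "Start pump")
```

The shredded update rewrites one row, while the intact update has to 
round-trip the whole instance; going the other way, load_intact() 
returns the document in one step, whereas the shredded rows would have 
to be stitched back together first.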

>
> Can an underlying relational mechanism also efficiently perform this 
> storage image transformation in a highly automated way (without 
> requiring extensive table definitions), and in a schema-independent 
> way?  Oracle is presumably trying with their update to 10g.

As I understand it, Oracle (and the other O-X-RDBMS vendors) offer both 
a shredding approach (when it is optimal) and an XML-aware CLOB 
approach for when that is appropriate.

> Haven't looked at it- not even sure it is out yet-  so I don't know 
> whether they have succeeded or not.  But relational models have shown 
> themselves to be really flexible as compared to their hierarchical 
> counterparts in the past.

Yes. XML DB's are optimized for the set of cases where the hierarchical 
relationships are the most important ones.  XQuery extends this with 
its ability to do joins; we shall see if this works well in practice.  
The relational model is completely general and can handle anything, but 
can be quite unwieldy and inefficient in practice when order and 
hierarchy [which are difficult to model in set theory!] hit a critical 
mass.  My favorite example would be an industrial-strength technical 
manual. Codd proved that you CAN normalize all that ordered, 
hierarchically structured, textual and data-oriented information and 
pull it back together with the relational calculus, but I've never 
heard of anyone actually pulling that feat off with real data and real 
DBMS software.  [Sure, just a simple 100-way join, no problem  :-) ]
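
A toy version of the problem, as a hedged sketch using Python's 
sqlite3 module: the single "node" table, its columns and the sample 
rows are all assumptions made up for illustration, not anyone's real 
schema.  Order and hierarchy are not part of the relational model, so 
they have to be carried in explicit position and parent_id columns, 
and rebuilding the document costs one lookup per level of nesting.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
    "position INTEGER, tag TEXT, text TEXT)"
)
# Rows are deliberately inserted out of document order: a table is a
# set, so only the position column records which paragraph comes first.
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, 0, "manual", ""),
        (2, 1, 0, "chapter", ""),
        (3, 2, 1, "para", "Then start the motor."),
        (4, 2, 0, "para", "Open the valve."),
    ],
)

def rebuild(node_id):
    """Reassemble the subtree rooted at node_id, one query per node."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    elem = ET.Element(tag)
    elem.text = text or None
    children = conn.execute(
        "SELECT id FROM node WHERE parent_id = ? ORDER BY position",
        (node_id,),
    ).fetchall()
    for (child_id,) in children:
        elem.append(rebuild(child_id))
    return elem

xml_text = ET.tostring(rebuild(1), encoding="unicode")
```

Here the paragraphs come back in the right order only because of the 
ORDER BY on position.  A real technical manual normalized into many 
tables instead of one generic table would need a join per element type 
and per level, which is where the 100-way join comes from.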
>
> And, of course, they don't have to be as good at it as a native XML 
> DB.  They just have to be good enough.  The same "impedance mismatch" 
> cry that the XML dbs are issuing now reminds the entire community an 
> awful lot of the object-oriented dbs of the early 90s.

Yup.  So why are people still gnashing their teeth about the 
object-relational impedance mismatch rather than using OODBMS?  I'd 
argue that the failure of the OODBMS market has more to do with the 
lack of standardization than the intrinsic superiority of the 
relational approach.  After all, the response by the RDBMS -- now 
ORDBMS! -- vendors was to find the 80/20 point in the OODB approach 
that fit easily with their technology and that SQL could be extended to 
accommodate.  They are doing the same thing with XML (OK, some are 
adding XQuery rather than extending SQL to handle XML), and will 
probably get the "80%"  of the market that can be satisfied with a 
"good enough" solution.  That leaves the XML DB's which don't have the 
installation/operational/complexity overhead of all that non-XML stuff 
with a relatively small *percentage* of the market: after all, the data 
that doesn't fit neatly in XML hierarchies is not really suitable for 
an XML DB, and the potential customers who already have a glass room 
full of Oracle or DB2 boxes and DBAs won't care about the overhead.  On 
the other hand, the sheer volume of XML being produced and the relative 
standardization of XML interfaces may leave a pretty decent *quantity* 
of opportunities for the XML-optimized database products.

So that, in my humble but biased view, is where XML databases are 
needed -- when you have a lot of data to manage that is already in XML 
format (especially when there are a variety of schemas that evolve 
unpredictably), when reliable/scalable/queryable storage is needed at 
the periphery of an organization or network where it is impractical to 
deploy and manage full-fledged O-X-RDBMS products, and where leveraging 
XML standards/expertise has more business benefit than leveraging a SQL 
installed base.

References:
- Re: [xml-dev] are native XML databases needed?
  - From: Linda Grimaldi <grimlinda@earthlink.net>
