xml-dev - RE: [xml-dev] are native XML databases needed?


To: "Owen Walcher" <xpriori@owenwalcher.com>, <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] are native XML databases needed?
From: "Hunsberger, Peter" <Peter.Hunsberger@STJUDE.ORG>
Date: Wed, 25 Aug 2004 09:09:13 -0500
Thread-index: AcSKOxJDnZOjvHbMT9Ogvka2TrRLDgAbDlxA
Thread-topic: [xml-dev] are native XML databases needed?

Owen Walcher <xpriori@owenwalcher.com> writes:

> > etc. (Possibly at this point you add in database constraints that 
> > didn't previously exist, turn certain fields non-null, etc.)
> 
> Sounds like you are using a relational database to store your 
> XML. 

(Apologies in advance if the tone of this post is a bit grumpy; not
enough sleep last night.)

Umm, who's talking about storing XML?  This was a discussion about how
to bootstrap system integrity constraints into place. At the base level
you're dealing with pure metadata, so you might have it in XML, but most
likely you have it in a variety of forms.  Personally, I want an
abstract tuple store, and for that a relational database meets the needs
just fine (and, yes, maybe in 5 years or so WinFS will also be an
option). I did consider adding a line to the effect of "or changing
object relationships from optional to required" but felt that most
readers of this list would be able to figure out that the basic
principles remain the same across any data store: when closing the loop
on a self-validating system you've got to have a mechanism for going
from nothing required to something required.
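
To make "going from nothing required to something required" concrete,
here is a minimal sketch in Python using sqlite3. It is not our actual
system: the table names (field_meta, tuples) and the helper names are
hypothetical, and it assumes a generic entity/field/value store. The idea
shown is only that a field gets flipped to required after checking that
the data already stored complies.

```python
import sqlite3

def make_store():
    """Create an in-memory generic tuple store plus a metadata table.
    Table and column names are hypothetical, chosen for illustration."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE field_meta (
            field    TEXT PRIMARY KEY,
            required INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE tuples (
            entity TEXT,
            field  TEXT,
            value  TEXT,
            PRIMARY KEY (entity, field)
        );
    """)
    return db

def missing_entities(db, field):
    """Return the entities that have no non-null value for `field`."""
    rows = db.execute("""
        SELECT DISTINCT t.entity
        FROM tuples AS t
        WHERE NOT EXISTS (
            SELECT 1 FROM tuples AS v
            WHERE v.entity = t.entity
              AND v.field = ?
              AND v.value IS NOT NULL)
        ORDER BY t.entity
    """, (field,)).fetchall()
    return [r[0] for r in rows]

def make_required(db, field):
    """Flip `field` from optional to required, but only if every
    existing entity already satisfies the new constraint.
    Returns the list of offending entities (empty on success)."""
    missing = missing_entities(db, field)
    if missing:
        return missing  # caller has to repair these first
    db.execute(
        "INSERT OR REPLACE INTO field_meta (field, required) VALUES (?, 1)",
        (field,))
    return []
```

The constraint lives in a metadata row rather than in a NOT NULL column,
so tightening it is a data change and not a schema change.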

> Talk about an impedance mismatch.  Why even bother with 
> XML in the first place if you are shredding it or CLOBbing 
> it? (I know, because that is what the design says, or that is 
> what is delivered to you -- but does that mean you have to 
> live in the XML world? And if you are, then move up to better 
> data management technology than something invented 20 years 
> before XML)
> 
> In order to realize the real-time XML document delivery as 
> actionable item [taken from previous post], you really need 
> to have the XML in its native form, and be able to not only 
> query within a document, but between and across documents as 
> well.  

Umm, how do I put this delicately? That's crap.  Tell me, what is the
"native form" for XML?  We've got 1000s of data entry screens that are
all variations on each other and none of the metadata is static.  For
us, XML and its related technologies (which is where the real bang for
the buck lies) are a convenient mechanism for implementing graph
traversal mechanisms and for converting the results for presentation
purposes.  At no point in that process is there a single best
representation of "the XML" that would be a candidate to store in an
"XML database".  IOW, in order to realize real-time XML data delivery
you need to match the back end technology to the requirements at hand
and not assume a priori that a native XML database has any place in
your architecture (the document/data distinction is intentional, but for
most purposes you can likely substitute "document" and the sentence
remains valid).

> Being able to do inserts, updates and deletes within 
> and across documents with a single command (server side) 
> without round tripping the XML documents (to the client) is 
> the only way this will scale. I don't know any way to do this 
> without a self-constructing XML database.

I've said it many times before on this list: good relational to XML
mappings are possible. If you need some help implementing them, Joe
Celko's "Trees and Hierarchies in SQL for Smarties" might be a good
place to start (don't know, I haven't had a chance to read it yet, but
his original hierarchy (set/subset) traversal algorithms are used in our
system). 
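
For readers who haven't seen the set/subset (nested set) approach, a
rough sketch in Python with sqlite3 follows. It illustrates the general
technique as it is commonly described, not our system's implementation;
the node table, its sample rows, and the function names are made up for
the example. Each node carries lft/rgt bounds, a subtree is one range
query, and the ordered rows are turned into XML with a stack.

```python
import sqlite3
import xml.etree.ElementTree as ET

def build_tree_db():
    """Hypothetical sample hierarchy stored as nested sets:
    form > (section > (field_a, field_b), footer)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE node (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")
    db.executemany("INSERT INTO node VALUES (?, ?, ?)", [
        ("form", 1, 10),
        ("section", 2, 7),
        ("field_a", 3, 4),
        ("field_b", 5, 6),
        ("footer", 8, 9),
    ])
    return db

def subtree_to_xml(db, root_name):
    """Fetch a whole subtree with one range query and serialize it."""
    rows = db.execute("""
        SELECT c.name, c.lft, c.rgt
        FROM node AS p
        JOIN node AS c ON c.lft BETWEEN p.lft AND p.rgt
        WHERE p.name = ?
        ORDER BY c.lft
    """, (root_name,)).fetchall()
    if not rows:
        raise KeyError(root_name)

    root = None
    stack = []  # (element, rgt) for each currently open ancestor
    for name, lft, rgt in rows:
        # Close every ancestor whose interval ends before this node starts.
        while stack and stack[-1][1] < lft:
            stack.pop()
        if stack:
            elem = ET.SubElement(stack[-1][0], name)
        else:
            elem = ET.Element(name)
            root = elem
        stack.append((elem, rgt))
    return ET.tostring(root, encoding="unicode")
```

Calling subtree_to_xml(build_tree_db(), "form") yields a single nested
XML document; no recursion or per-level query is needed because the
lft ordering is already a depth-first walk.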

Don't get me wrong, I agree that not everyone should be implementing
their own XML store from scratch (no matter what the underlying
technology).  However, in many cases there are real business reasons to
do so.  Whether you do or don't, at some level, you don't want to know
the details of the implementation, but don't assume the underlying
technology isn't relational. Indexes are indexes and relational
databases have a lot of good tricks up their sleeves for efficient index
management.

> I say self-constructing, because in real life situations, you 
> don't necessarily know the structure/values in an XML 
> document (as has been pointed out numerous times by many 
> people in this forum) that you are receiving, but may need to 
> store it (and raise an exception) for later processing.
> 
> This is traditionally the breakdown when constraints are 
> violated (like hiring a 14-year-old when the rules say 16) 
> in an RDBMS, because you cannot simply store the "almost 
> good" data due to field constraints, and although you may 
> have checked it against multiple schemas, I have never met a 
> DBA who wouldn't also implement the rule in the DB, "to make 
> sure the data is always correct".

If you're doing a dynamic system, then it's pretty clear that the
business rules don't belong in the back end.  If you're mapping XML to
relational, you'd better be doing it at the metadata level and not at the
element level.  In such a system there is no place for a DBA to
implement such a constraint; the field storing (for example) age is
generic and can't be the subject of business constraints in the manner
you suggest.
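
A small Python sketch of what that looks like, under the assumption
that rules are held as metadata outside the store. The Rule class, the
RULES table, and the store function are hypothetical names for
illustration only. The value column is generic, so the "at least 16"
rule is evaluated above the back end, and the "almost good" value is
still stored while the violation is reported.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Rule:
    """A business rule kept as metadata, not as a column constraint."""
    description: str
    check: Callable[[str], bool]

# Hypothetical rule metadata, keyed by field name.
RULES: Dict[str, List[Rule]] = {
    "age": [Rule("age must be at least 16",
                 lambda v: v.isdigit() and int(v) >= 16)],
}

def store(tuples: List[Tuple[str, str, str]],
          entity: str, field: str, value: str,
          rules: Dict[str, List[Rule]] = RULES) -> List[str]:
    """Always store the value in the generic tuple list, then return
    the descriptions of any rules it violates so the caller can raise
    an exception for later processing."""
    tuples.append((entity, field, value))
    return [r.description for r in rules.get(field, []) if not r.check(value)]
```

Because the rule is data, changing 16 to 18 touches the metadata only;
the generic store never rejects the row.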

> "Data integrity" is an oxymoron.

"Native XML" is an oxymoron....
