RE: [xml-dev] XML Database Decision Tree?
- From: Dan Weinreb <dlw@exceloncorp.com>
- To: Mike.Champion@SoftwareAG-USA.com, xml-dev@lists.xml.org
- Date: Fri, 19 Oct 2001 17:02:12 -0400 (EDT)
Regarding the anti-XML-database paper entitled "Native XML databases;
a bad idea for data?", I agree with what Mike Champion says. I have a
few points to add.
This whole article is tacitly based on the assumption that what you
are *really* trying to do is to store and query structured data. The
author's attitude is, well, yeah, if what you're doing is documents,
well, OK, but if what you're doing is structured data, then these XML
databases are really no good.
His point is that if you're trying to do those things that relational
databases are specifically designed to do, and maybe want to do a few
XML-related things on the side, you should use a relational database.
To which I say, sure.
Mike Champion cuts directly to the heart of the issue when he says:
   "Do you really want your information's structure to vary?" Uhh, no ... but
   do I usually get a vote?
Right, exactly. One of the main virtues of (data-centric) XML's data
model is its handling of semi-structured data. Relational DBMS's do
not handle semi-structured data very well (although the details vary
depending on the individual RDBMS). The article claims:
However, decomposing the XML document to persist it to a
relational database is not all that difficult;
If you know in advance the complete details of the structure of the
information that you're going to get, and you've created a set of
relational tables that can represent every bit of the information from
the XML document in appropriate normalized form, *then* it's not all
that difficult.
But, as Mike Champion himself said in a talk he gave a few days ago
(paraphrasing slightly), what if you receive a one million dollar
purchase order, and your B2B software says to you "nyahh, nyahh, this
XML purchase order has a child element that doesn't map directly into
my tables and columns, so I'm going to reject it!". (Mike, correct
me if I'm not conveying your points properly.)
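To make the rejection scenario concrete, here is a minimal sketch of a
rigid XML-to-relational mapping. Everything in it is hypothetical and
chosen only for illustration: the order_item table, the sku/qty columns,
the giftWrap element, and the shred_strict helper. It is not taken from
the article or from any product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical fixed schema: the only child elements the tables know about.
KNOWN_COLUMNS = ("sku", "qty")


def shred_strict(conn, xml_text):
    """Decompose <item> elements into rows; reject anything unmapped."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("item"):
        unknown = [child.tag for child in item if child.tag not in KNOWN_COLUMNS]
        if unknown:
            # This is the "nyahh, nyahh" case: no column for the element.
            raise ValueError(f"no column for element(s): {unknown}")
        rows.append((item.findtext("sku"), int(item.findtext("qty"))))
    conn.executemany("INSERT INTO order_item (sku, qty) VALUES (?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_item (sku TEXT, qty INTEGER)")
```

A document that matches the schema loads without trouble, while one whose
item carries an extra child such as <giftWrap> raises ValueError, even
though every field the tables care about is present.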
In the real world, one tries one's best to establish standards, but
inevitably requirements and needs change over time as new users enter
the arena, as the software is expanded to take on a wider scope of
jobs, as business practices change slightly over time, and so on. The
result is that it's hard to keep everything inside totally tight and
restrictive schemas that are created once and expected to remain the
same forever. XML provides a framework in which formats can evolve,
carefully and with due deliberation and consideration for backward
compatibility. If someone adds a new child element somewhere, my
existing XPath expressions keep right on working, as long as the
addition hasn't disrupted the semantics too harshly.
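As a small illustration of that last point, the sketch below runs the
same path expression against two versions of a purchase order. The
element names and the quantities helper are made up for the example, and
it uses the limited XPath subset in Python's standard library rather
than any particular XML database.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase orders: V2 adds a <giftWrap> child that V1 lacks.
ORDER_V1 = "<order><item><sku>A1</sku><qty>2</qty></item></order>"
ORDER_V2 = (
    "<order><item><sku>A1</sku><giftWrap>yes</giftWrap>"
    "<qty>2</qty></item></order>"
)


def quantities(xml_text):
    """Return every item quantity, selected by path rather than by position."""
    root = ET.fromstring(xml_text)
    return [int(q.text) for q in root.findall("./item/qty")]
```

Because the expression names the elements it wants, quantities() gives
the same answer for both versions; the new child is simply not selected.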
Mike said:
   2) Retrieval: "some of the native XML platforms require that the entire
   document be returned from the database" Uhh, which ones?
I read a product review of XYZFind in one of the trade magazines (I
can't remember which one) that suggested that XYZFind always returns
entire XML documents from the database. The author of the article
cited XYZFind as one example of a native XML database system. So if
what I read in the magazine was right, then the author of the article
might be right that *some* have this property. (If I misread or
misinterpreted the magazine article, or the article was wrong, I
apologize in advance.) This use of the word "some" is a well-known
propaganda technique, designed to leave the reader with the impression
that *all* have this property.
The article goes on to say:
Other native XML platforms
decompose XML documents before persisting them to the repository, but
this tends to be a little clunky if you have a complicated document
structure (as many structured data XML documents tend to have).
I have no idea what he's talking about here. If you are going to
store XML data as relational rows, you have to decompose it into those
rows. What on earth does he mean by "tends to be a little clunky"?
You give the XML document to the XML database system and it just plain
does what it's supposed to do; no muss, no fuss.
In the section on "Searching", the article says:
Because of the document-centric nature of these databases,
searching will return only a set of XML documents;
Why does he persist in claiming that queries return "only a set of XML
documents"? This is entirely untrue in the case of eXcelon XIS and
surely of any XML DBMS that can execute an XPath expression.
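For readers who want to see the difference, here is a rough sketch of a
path query that returns matching fragments rather than whole documents.
It is an assumption-laden toy: the DOCS list stands in for a stored
collection, the query helper is hypothetical, and it uses Python's
standard library, not eXcelon XIS or any other product's API.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored collection of two small documents.
DOCS = [
    "<order id='1'><customer>Acme</customer><total>100</total></order>",
    "<order id='2'><customer>Globex</customer><total>250</total></order>",
]


def query(docs, path):
    """Evaluate a path against each document and collect the matching nodes."""
    hits = []
    for text in docs:
        root = ET.fromstring(text)
        hits.extend(root.findall(path))
    return hits


# Serialize only the matched elements, not the documents that contain them.
customers = [ET.tostring(e, encoding="unicode") for e in query(DOCS, "./customer")]
```

The result is a list of <customer> elements; the surrounding <order>
documents are never returned.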
When he says "Native XML databases also do not deal well with pointing
relationships", I'm not exactly sure what he means, but I think his
point is that full SQL is a more powerful query language than what
most (maybe all) XML DBMS's provide, which I think is a fair claim.
In the "Aggregation" section, I dispute the exact things he's saying,
but the larger claim that SQL is more powerful when it comes to
aggregation is true. The question is, is that what you're asking
your XML database system to do? He thinks so:
If your structured data application requires any sort
of analytical processing -- and I'll bet it does -- a native XML
database is going to disappoint you.
His point is that if you're trying to do those things that relational
database systems were explicitly designed to do, you're usually better
off with a relational database system. Sure, I am happy to concede
wholeheartedly.
-- Dan
------- End of forwarded message -------