RE: storing xml files into database

Haresh:

Since I work for a native XML database company, you might not consider my advice to be valid… but I’m going to give it anyways ;->

I have customer story after customer story where the first thing they tried was Oracle or DB2 or SQL Server for storing their XML. It was a natural thing for them to try as they already had those products installed. They came to eXcelon primarily because they failed at making an RDBMS handle XML in a way that was useful. One of our customers, for example, had Oracle consultants create a benchmark on Oracle 9i against our consultants using eXcelon. Oracle was unable to complete the benchmark and we won the deal… and these were top-notch Oracle consultants.

A couple of questions that will quickly let you know that an RDBMS is not going to work:

1) Are you dealing with XML data whose schema is not necessarily consistent from document to document or the schemas change often? The key advantage of using XML as a data model is extensibility: you lose that advantage when you store it in an RDBMS.

2) Do you need to update this data often? All current RDBMS implementations (except for a beta utility available for Microsoft for SQL Server) do not allow incremental document updates. You must replace entire documents when you need to make a change as all of the RDBMS systems store the data as BLOBs. Most of our competitors in the XML DB world also do not allow incremental updates. This is very important should data need to change fairly often.

3) Is performance and scalability important to you? We’ve tried both Oracle and DB2 with over 100,000 documents, which is a relatively small amount. Oracle fails the benchmark, and DB2 goes very slow. Why? They have to parse the data on the fly to do anything with it: query, index, anything. eXcelon XIS stores the data in parsed form so we eliminate this overhead PLUS our caching technology is distributed and works like an in-memory database to give you really fast access.

4) Do you have data that comes from business partners or from systems where you really do not have much control over the schema? Does that data need to be integrated into your common XML data model? If so, keeping that all coordinated in an RDBMS is very difficult. This is the problem that has afflicted most of our customers and a primary reason why they failed in using an RDBMS.

5) How many nodes wide and how many nodes deep is your XML data? If you decide to go the mapping route (which is the only way to avoid having it stored as a BLOB), the mappings will fail or become immensely complex the wider and deeper the XML data gets

6) The RDBMS systems are absolutely atrocious at XML indexing. Their approach to XPath indexing is to index every possible XPath that matches criteria rather than the one you’re really going to be querying on the most. This makes queries quite inefficient compared to native XML DB implementations. For example, eXcelon XIS is up to 20x faster on indexed queries than Oracle 9i. XIS does indexing at the node level and is very efficient. Very few native XML DB implementations can even claim that.

You’re welcome to try to make an RDBMS work if your data does not change often, all of your data is your own and you have complete control over the schemas, and you don’t have that many documents (although it sounds like you do). Our customers have tried and failed, and these are Global 2000 companies with all of the consultants in the world to throw at the problem. Maybe you’ll succeed where they failed ;->

Cheers,

Chris

---------------------------------------

Chris Parkerson

Product Manager

eXcelon Corporation

Burlington, MA

(781) 674-5393

http://www.exceloncorp.com

---------------------------------------

-----Original Message-----
From: Haresh Gujarathi [mailto:gharesh@vsnl.com]
Sent: Monday, September 10, 2001 8:56 AM
To: Chuck White
Cc: Xml-Dev
Subject: RE: storing xml files into database

Hi Chuck,

thx for ur reply.

I have already looked into the mapping tables mechanism. That is too painful and will not work for me.

Now this question keeps coming into my mind; (if you read all the messages in this thread) There are companies like excelon, Ipedo, Software Ag(tamino), IXIASFOT which offer a native xml server (where xml is stored in native format as against tabular format in typical RDBMSs)

I wonder what does it take to MS-SQL server, oracle, db2 guys to do just the same. Is that too difficult?

What would it take to them to do what tamino, ipedo, excelon does?

regards

haresh

-----Original Message-----
From: Chuck White [mailto:chuck@tumeric.net]
Sent: Sunday, September 09, 2001 10:28 PM
To: gharesh@vsnl.com
Subject: RE: storing xml files into database

Hi Haresh:

I'm responding to your inquiry off list because I don't think it's quite the forum for this.

You might want to look at the XML database from Ipedo (www.ipedo.com). I've talked to the CEO there and he is pretty convincing about it being considerably faster than Oracle for handling large XML docs. It also will have failover, concurrency and other rdms features you would expect to find in any modern rdms. It isn't cheap, you'll have to negotiate for a decent price on the licensing but i think you'll find them reasonable. I am sure they have a free download, too.

Alternatively, you can map sql tables to xml tables using XSQL in Oracle or XPath in SQLServer 2000. This mapping is laborious and labor intensive.

The cheapest solution is to store it as a blob in MySQL but if you have a lot of site traffic you're going to run into some problems, especially if you try to manipulate the XML document using the DOM. It all really depends on what you are trying to accomplish, does't it?

Chuck White
CEO
The Tumeric Partnership
Advertising for the Next Generation.
http://www.tumeric.net
chuck@tumeric.net
________________________________________
Co-Author, Mastering XML, Premium Edition
Sybex Books, May, 2001
http://www.javertising.com

-----Original Message-----
From: Haresh Gujarathi [mailto:gharesh@vsnl.com]
Sent: Sunday, September 09, 2001 5:20 AM
To: Xml-Dev
Subject: storing xml files into database

I am developing an internet based application where the the data of the users of that application is stored in xml format.

I expect about 1000 to 20000 total users of the system (and say about 100 users simultaneously accessing the system)

Naturally there is inconvenience in storing the xmls on the file system. I want to explore storing these xmls in the database. I need some inputs as to how the xmls can be stored in the rdbms database. Please provide me any insight regarding this.

I am looking for 2 solutions ; a zero cost solution and reasonable cost solutions (may be involving the xml aware databases like Ixiasoft xml server)

1. In considering a database as storage mechanism, what advantages I would get

2. In such scenario what other people do?

3. If I want to store the xmls in say MySQL, how do I do it and does mysql have any special features?

4. What about MS-SQL server;

5. Should xmls be stored as blobs in the rdbms databases?

6. Optionally, I am also looking for 'search' ability where the database should be able to index the elements of xml file and perform search.

regards

Haresh Gujarathi