


   RE: [xml-dev] XML-enabled databases, XQuery APIs


> I'm neutral here, but for an opposing view see "XML Parsing: A Threat to
> Database Performance" [1] by Matthias Nicola of IBM. From the abstract:
> "XML parsing is generally known to have poor performance characteristics
> relative to transactional database processing. Yet, its potentially fatal
> impact on overall database performance is being underestimated. We report
> real-world database applications where XML parsing performance is a key
> obstacle to a successful XML deployment..."
> -- Ron

Thanks for the reference. The paper puzzles me, though. In section 4.1 they
say that:

* parsing a 100K document typically takes 175K instructions

* inserting a row into a relational table requires 30K to 200K instructions.

I would have expected that in a system using shredding, a 100K XML document
would result in, say, 200 rows being added to the database. At 30K to 200K
instructions per row, that gives an insert cost of 6M to 40M instructions
compared with a parsing cost of 175K instructions, broadly in line with the
"two orders of magnitude" that I guessed in my original post. Clearly I'm
thinking of a different kind of application from the one that they studied.
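
The arithmetic behind that estimate can be sketched as follows. This is only
an illustration: the parsing and per-row insert costs are the figures quoted
above from section 4.1 of the paper, the 200 rows per document is the guess
made in this message, and the variable and function names are made up for the
example.

```python
# Figures quoted above from section 4.1 of the paper (instruction counts).
PARSE_COST = 175_000         # parsing one 100K XML document
INSERT_COST_LOW = 30_000     # cheapest single-row insert
INSERT_COST_HIGH = 200_000   # most expensive single-row insert

# Assumption from this message, not from the paper: shredding one
# 100K document produces about 200 relational rows.
ROWS_PER_DOCUMENT = 200


def load_cost(rows: int, cost_per_row: int) -> int:
    """Total instructions spent inserting `rows` rows."""
    return rows * cost_per_row


low = load_cost(ROWS_PER_DOCUMENT, INSERT_COST_LOW)    # 6,000,000
high = load_cost(ROWS_PER_DOCUMENT, INSERT_COST_HIGH)  # 40,000,000

# How many times more expensive the inserts are than the parse.
ratio_low = low / PARSE_COST
ratio_high = high / PARSE_COST

print(f"insert cost: {low:,} to {high:,} instructions")
print(f"insert/parse ratio: {ratio_low:.0f}x to {ratio_high:.0f}x")
```

Under these assumptions the inserts cost roughly 34 to 229 times as much as
the parse, which is where the "two orders of magnitude" comes from. A document
that shreds into far fewer rows would narrow that gap considerably.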

Michael Kay

> [1] http://lists.w3.org/Archives/Public/www-ws/2004Oct/att-0032/MNicola_CIKM_2003_1_.pdf
> Michael Kay wrote:
> > I would be very surprised if XML parsing contributes anything noticeable to
> > the cost of a database load (in shredding mode). Except possibly for a
> > pathological XML document containing 3 nodes and 3 billion bytes.
> >
> > I haven't looked at the latest products from MS or Oracle, but my experience
> > of database loading with complex data and a realistic level of indexing is
> > that it's a couple of orders of magnitude slower than XML parsing. You can
> > improve that with a custom loader that bypasses SQL and does a lot of
> > heavy-duty sorting and merging to minimize head movement on the disk (does
> > anyone still do that?), but I think it's still true that the parsing cost is
> > immaterial.

