
   Re: [xml-dev] XML-enabled databases, XQuery APIs


> Michael Kay wrote:
> > I would be very surprised if XML parsing contributes anything noticeable to
> > the cost of a database load (in shredding mode).

The benchmarks below were run on different testbeds, so this isn't an
apples-to-apples comparison.

This parser performance test of Expat (C, SAX) reported a best time of 0.05 sec
to parse an 884K document with 32K nodes. For 2500 such documents, that would be
approx. 125 seconds (2500 x 0.05 sec).
http://okmij.org/ftp/Scheme/SSAX-benchmark-1.html

This benchmark compared two SQL APIs. It was written in C and executed in a
client-server mode, so there was network latency. It used an SQL INSERT, not a
bulk load.
http://www.datadirect.com/techres/odbc/docs/wp_odbcvsoci.pdf

The average time to INSERT 2500 rows with ODBC was 23.53 seconds. The minimum
execution time for an SQL SELECT query to return 2500 rows was 0.05 sec.

This single-CPU Java benchmark parsed simpler documents than the Expat test, and
the data was closer to the tables in the SQL API test. It took about 2 seconds
to parse 10,000 records using SAX, or about 0.5 seconds to parse 2500 records.
http://www.devsphere.com/xml/benchmark/method.html

This Python benchmark took between 2.32 and 3.97 seconds for a 3.3 MB document.
http://www.oreillynet.com/pub/wlg/6291
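
To get numbers that are actually comparable, the parse step could be timed in
isolation on one machine. Here is a minimal sketch in Python using the standard
xml.sax module. The record layout, the CountingHandler class, and the
make_document and time_parse helpers are made up for illustration and are not
taken from any of the benchmarks above.

```python
import time
import xml.sax


class CountingHandler(xml.sax.ContentHandler):
    """Counts start tags; a stand-in for the shredding logic of a real loader."""

    def __init__(self):
        super().__init__()
        self.elements = 0

    def startElement(self, name, attrs):
        self.elements += 1


def make_document(records):
    # Hypothetical record layout, invented for this sketch only.
    rows = "".join(
        f"<row><id>{i}</id><name>item{i}</name></row>" for i in range(records)
    )
    return f"<rows>{rows}</rows>".encode("utf-8")


def time_parse(records):
    """Return (element count, elapsed seconds) for one SAX pass."""
    data = make_document(records)
    handler = CountingHandler()
    start = time.perf_counter()
    xml.sax.parseString(data, handler)
    elapsed = time.perf_counter() - start
    return handler.elements, elapsed
```

Calling time_parse(2500) and time_parse(10000) on the same machine that runs
the INSERTs would give parse times that can be set directly against the load
times, which the figures quoted here cannot do.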

My guess is that if these benchmarks were all run on the same testbed under the
same conditions, we'd see a repeatable pattern:

The parsing overhead for loading a database is negligible if we're processing
simple documents, but becomes more significant as the documents increase in
size.
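
The arithmetic behind that guess can be written out as a short Python sketch.
It assumes parse time scales linearly with the number of documents or records,
and that parse and INSERT times simply add up. Both are simplifications, and
the inputs come from different testbeds, so the result is only indicative. The
scale and parse_share helpers are hypothetical names.

```python
# Figures quoted from the benchmarks above (measured on different testbeds).
INSERT_SECONDS = 23.53  # ODBC average for 2500 INSERTs


def scale(seconds, measured_units, target_units):
    """Linear extrapolation of a timing to a different workload size."""
    return seconds * target_units / measured_units


def parse_share(parse_seconds, insert_seconds):
    """Fraction of total load time spent parsing, assuming the costs add."""
    return parse_seconds / (parse_seconds + insert_seconds)


# Expat: 0.05 sec per 884K document, extrapolated to 2500 documents.
expat_2500 = scale(0.05, 1, 2500)

# Java SAX: about 2 sec per 10,000 simple records, scaled to 2500 records.
java_2500 = scale(2.0, 10_000, 2500)

large_doc_share = parse_share(expat_2500, INSERT_SECONDS)
simple_record_share = parse_share(java_2500, INSERT_SECONDS)
```

With these inputs, parsing comes to roughly 84% of the combined time for the
large Expat documents and about 2% for the simple Java records.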








 
