xml-dev - Re: [xml-dev] XML-enabled databases, XQuery APIs


To: <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] XML-enabled databases, XQuery APIs
From: "Ken North" <kennorth@sbcglobal.net>
Date: Tue, 19 Apr 2005 00:44:25 -0700
References: <200504181843.j3IIhFvI097554@www9.cruzio.com> <42649218.5010404@rpbourret.com>

> Michael Kay wrote:
> > I would be very surprised if XML parsing contributes anything noticeable to
> > the cost of a database load (in shredding mode).

The benchmarks below were run on different testbeds, so this isn't an
apples-to-apples comparison.

This parser performance test of Expat (C, SAX) reported a best time of 0.05 sec
to parse an 884K document with 32K nodes. For 2500 documents, that would be
approx. 125 seconds.
http://okmij.org/ftp/Scheme/SSAX-benchmark-1.html

This benchmark compared two SQL APIs. It was written in C and executed in a
client-server mode, so there was network latency. It used an SQL INSERT, not a
bulk load.
http://www.datadirect.com/techres/odbc/docs/wp_odbcvsoci.pdf

The average time to INSERT 2500 rows with ODBC was 23.53 seconds. The minimum
execution time for an SQL SELECT query to return 2500 rows was 0.05 sec.

This single-CPU Java benchmark parsed simpler documents than the Expat test, and
the data was closer to the tables in the SQL API test. It took about 2 seconds
to parse 10,000 records using SAX, or about 0.5 seconds to parse 2500 records.
http://www.devsphere.com/xml/benchmark/method.html
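
The extrapolations above are simple scaling, and can be laid out as a rough
sketch. The figures are the ones quoted from the three reports; the variable
names and the parse_share helper are only illustrative, and because each number
comes from a different testbed the ratios are indicative at best.

```python
# Figures quoted from the benchmarks above. Each was measured on a
# different testbed, so the comparison is only a rough indication.
DOCS = 2500

expat_sec_per_doc = 0.05      # Expat best time for one 884K document
odbc_insert_sec = 23.53       # average time for 2500 INSERTs over ODBC
java_sax_sec_per_10k = 2.0    # about 2 seconds per 10,000 simple records

# Scale each parser figure to 2500 documents/records.
expat_total = expat_sec_per_doc * DOCS
java_total = java_sax_sec_per_10k * DOCS / 10_000


def parse_share(parse_sec, insert_sec):
    """Fraction of (parse + insert) time spent parsing (hypothetical helper)."""
    return parse_sec / (parse_sec + insert_sec)


expat_share = parse_share(expat_total, odbc_insert_sec)
java_share = parse_share(java_total, odbc_insert_sec)

print(f"Expat, large documents: {expat_total:.1f} s parsing, share {expat_share:.0%}")
print(f"Java SAX, simple records: {java_total:.1f} s parsing, share {java_share:.0%}")
```

On these numbers, parsing the large documents would dominate the INSERT time,
while parsing the simple records would be a small fraction of it.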

This Python benchmark took between 2.32 and 3.97 seconds for a 3.3 MB document.
http://www.oreillynet.com/pub/wlg/6291

My guess is that if these benchmarks were all run on the same testbed under the
same conditions, we'd see a repeatable pattern:

The parsing overhead for loading a database is negligible if we're processing
simple documents, but becomes more significant as the documents increase in
size.
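
One way to check that guess on a single testbed is to time the parse and the
INSERT separately in the same process. The following is a minimal sketch, not
one of the benchmarks cited above: it assumes Python's standard xml.sax and
sqlite3 modules, an in-memory SQLite database (so there is no network latency,
unlike the ODBC test), and a made-up document shape. RowCollector,
make_document and time_load are hypothetical names.

```python
import sqlite3
import time
import xml.sax


class RowCollector(xml.sax.ContentHandler):
    """Collects the text of each <name> element as a one-column row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._buf = None

    def startElement(self, name, attrs):
        if name == "name":
            self._buf = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "name":
            self.rows.append(("".join(self._buf),))
            self._buf = None


def make_document(records):
    """Build a synthetic document with the given number of records."""
    body = "".join(f"<rec><name>item{i}</name></rec>" for i in range(records))
    return f"<recs>{body}</recs>".encode("utf-8")


def time_load(doc_count, records_per_doc):
    """Return (parse seconds, insert seconds, rows stored) for one run."""
    doc = make_document(records_per_doc)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (name TEXT)")
    parse_sec = insert_sec = 0.0
    for _ in range(doc_count):
        handler = RowCollector()
        start = time.perf_counter()
        xml.sax.parseString(doc, handler)
        parse_sec += time.perf_counter() - start

        start = time.perf_counter()
        conn.executemany("INSERT INTO t (name) VALUES (?)", handler.rows)
        conn.commit()
        insert_sec += time.perf_counter() - start
    rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return parse_sec, insert_sec, rows


if __name__ == "__main__":
    # Compare small and larger documents under identical conditions.
    for size in (1, 100, 1000):
        parse_sec, insert_sec, rows = time_load(100, size)
        print(f"{size:5d} records/doc: parse {parse_sec:.3f} s, "
              f"insert {insert_sec:.3f} s, {rows} rows")
```

The absolute times will depend on the machine and the database, so only the
way the parse and insert columns change with document size is of interest.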
