xml-dev - Re: [xml-dev] Validation vs performance - was Re: [xml-dev] Fast text ou


To: XML Developers List <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] Validation vs performance - was Re: [xml-dev] Fast text output from SAX?
From: David Megginson <dmeggin@attglobal.net>
Date: Mon, 19 Apr 2004 19:28:01 -0400
In-reply-to: <40844422.5010801@lig.net>
References: <15725CF6AFE2F34DB8A5B4770B7334EE03F9F659@hq1.pcmail.ingr.com> <5F1BB722-920D-11D8-A3E3-000A95CCC59E@xegesis.org> <4083E7A0.90807@attglobal.net> <40844422.5010801@lig.net>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5

Stephen D. Williams wrote:

> Processing overhead, including the major components of parsing / object 
> creation / data copies / serialization, is not a 'future problem'.  It 
> has always been a problem.

We don't know how much and what kind of a problem XML will be until we've
had time to gain experience -- if we try to optimize too early, we'll end up
optimizing the wrong thing.

For example, I set up a test for a customer a while back to see how fast
Expat could parse documents.  On my 900 MHz Dell notebook, with 256MB RAM
and Gnome, Mozilla, and XEmacs competing for memory and CPU, Expat could
parse about 3,000 1K XML documents per second (if memory does not fail me).
If I had tried to, say, build DOM trees from that, I expect that the number
would have fallen into the double digits (in C++) or worse.  In this case,
obviously, there would be far more to be gained from optimizing the code on
the other side of the parser (say, by implementing a reusable object pool or
lazy tree building) than there would be from replacing XML with something
that parsed faster.
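
A rough way to reproduce this kind of measurement is sketched below, using
Python's standard Expat binding. It is not the original test: the synthetic
document, the helper names (make_document, parse_once, documents_per_second)
and the repetition count are all assumptions made for illustration, and the
number printed will depend entirely on the machine and interpreter.

```python
import time
import xml.parsers.expat


def make_document(target_bytes=1024):
    """Build a synthetic XML document of roughly target_bytes bytes (assumed shape)."""
    items = []
    size = len("<doc></doc>")
    i = 0
    while size < target_bytes:
        item = f'<item id="{i}">value {i}</item>'
        items.append(item)
        size += len(item)
        i += 1
    return ("<doc>" + "".join(items) + "</doc>").encode("utf-8")


def parse_once(data):
    """Parse one document with Expat and return the number of start tags seen."""
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(data, True)  # True marks this as the final chunk
    return count


def documents_per_second(data, repetitions=2000):
    """Time repeated parses of the same document and return the throughput."""
    start = time.perf_counter()
    for _ in range(repetitions):
        parse_once(data)
    elapsed = time.perf_counter() - start
    return repetitions / elapsed


if __name__ == "__main__":
    doc = make_document()
    rate = documents_per_second(doc)
    print(f"{len(doc)} bytes per document, {rate:.0f} documents/second")
```

Timing the same loop with a tree-building step added on top of the callbacks
would show how much of the total cost sits on the application side of the
parser rather than in the parse itself.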

I have never benchmarked SOAP implementations, so I have no idea how well
they perform, but my Expat datapoint suggests that XML parsing is unlikely
to be the bottleneck.  In fact, you might be able to gain more by writing an
optimized HTTP library that fed content as a stream rather than doing an
extra buffer copy.
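
The streaming idea can be sketched with the same Expat binding: hand each
chunk to the parser as it arrives instead of first joining everything into
one buffer. The chunk generator below stands in for a network read loop and
is purely hypothetical; no real HTTP library is involved, and the payload is
made up.

```python
import xml.parsers.expat


def parse_stream(chunks):
    """Feed byte chunks to Expat as they arrive; return the element names seen."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    for chunk in chunks:
        parser.Parse(chunk, False)  # more data may follow
    parser.Parse(b"", True)  # signal end of input
    return names


def fake_network_chunks(payload, chunk_size=16):
    """Hypothetical stand-in for reading a response body in small pieces."""
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset:offset + chunk_size]


if __name__ == "__main__":
    body = b"<envelope><body><value>42</value></body></envelope>"
    print(parse_stream(fake_network_chunks(body, chunk_size=7)))
```

Expat copes with tags that are split across chunk boundaries, so the caller
never needs to assemble the full message before parsing starts.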

> The scarce resource is time.  Anything that eats time is bad.  This could
> be bandwidth usage, CPU, memory, or suboptimal communication and semantic
> models.

I have some experience with high-volume, high-speed systems as well.  They
tend to be so finely hand-tuned that they couldn't use *any* off-the-shelf
format or protocol, much less XML or SOAP -- even HTTP (or in some cases,
TCP) is out of the question.  These are the kinds of people who will use
deltas to avoid wasting four bytes on every number.
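
To make the delta trick concrete, here is a minimal sketch of one possible
scheme: send the first number as a full 4-byte integer, then each later
number as a one-byte signed difference from its predecessor. The layout and
the function names are assumptions chosen for illustration, not a
description of any particular system.

```python
import struct


def encode_deltas(values):
    """Pack the first value as a 4-byte int, then each change as a signed byte."""
    if not values:
        return b""
    out = bytearray(struct.pack("<i", values[0]))
    for previous, current in zip(values, values[1:]):
        delta = current - previous
        if not -128 <= delta <= 127:
            raise ValueError("delta does not fit in one byte")
        out += struct.pack("<b", delta)
    return bytes(out)


def decode_deltas(data):
    """Reverse encode_deltas: rebuild the values from the first int and the deltas."""
    if not data:
        return []
    (first,) = struct.unpack_from("<i", data, 0)
    values = [first]
    for (delta,) in struct.iter_unpack("<b", data[4:]):
        values.append(values[-1] + delta)
    return values


if __name__ == "__main__":
    readings = [100000, 100003, 100001, 100001, 100010]
    packed = encode_deltas(readings)
    # Five 4-byte integers would take 20 bytes; this layout takes 4 + 4 = 8.
    print(len(packed), decode_deltas(packed) == readings)
```

The saving only holds while consecutive values stay close together; a real
system would need an escape mechanism for larger jumps, which this sketch
simply rejects.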

All the best,

David

