   Re: [xml-dev] Use cases for parsing efficiency (was Re: [xml-dev]Parsing


Mike Champion wrote:
> As a matter of fact, until a few 
> months ago I was as much a scoffer at the arguments that Al and Robin 
> raise as any of you.

As was I, until I stopped thinking that people who used XML in situations
where binary infosets are needed were doing something stupid or evil, and started
looking at some real-life use cases. If XML works great overall in a system and
fails on only one or two points, it's better to address those points than to throw
out the baby with the angle brackets.

> My day job colleagues changed my mind by pointing out that in 
> industrial-strength, native XML processing environments, nothing much 
> is happening besides XML being parsed, processed (stored, queried, 
> transformed) and serialized again. (...) I've heard the same 
> thing from industrial-strength SOAP developers -- as the volume of 
> messages goes up and processing resources get dedicated to XML (i.e., no 
> application logic or DB access happening on the machine parsing, 
> processing, serializing the XML), then the bottlenecks in XML parsing 
> become increasingly apparent.

If you have any more or less detailed stories/numbers/examples, I'd be happy to
have them (offlist) to see whether they bring up points we haven't covered yet and
corroborate our feedback and experience with binary SOAP.

> So why should you all care about standardization of processing pipelines 
> that are generally *internal* to products?

Because they're not necessarily internal :) What happens if you want to plug two
high-performance SOAP implementations together when each uses a different binary
infoset? What should standards bodies use as their format when they include SOAP
in their specs and want binfosets because they are targeting a variety of
platforms, some of them constrained? An audio-video MPEG-7 stream contains
literally tons of metadata (originally XML); how does my SemWeb agent use that to
order pizza when the finale starts, so that I have it right when the film is over?

Binfosets are being considered for MMS. That's not very internal :) Etc., etc.

> I'm not completely sure you 
> should.  One might argue that you as customers of / developers for 
> enterprise-class XML processing software may wish to tap into the 
> pipelines at a lower level, e.g. grab the rawest Infoset data out of a 
> DBMS before it gets sanitized and standardized by the API level

If what you really want is high-speed processing, then it's likely you'll want to
do that. We have a low-level API (SAXt) and high-level APIs for transparency
(typically SAX), and the speed difference between them is very noticeable.
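
To make the low-level versus high-level distinction concrete, here is a minimal sketch. It is not SAXt, whose API is not shown in this message. It assumes Python's standard-library xml.sax and xml.dom.minidom modules, and the sample document and function names are invented for illustration. The first function reads the raw event stream; the second goes through a tree built on top of the parser.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document, made up for this illustration.
XML_DOC = b"<order><item qty='2'>pizza</item><item qty='1'>soda</item></order>"


class ItemCounter(xml.sax.ContentHandler):
    """Counts <item> elements directly from the parser's event stream."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # Called once per start tag; no tree is ever built.
        if name == "item":
            self.count += 1


def count_items_events(data: bytes) -> int:
    """Lower-level route: consume parse events as they arrive."""
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_items_tree(data: bytes) -> int:
    """Higher-level route: build a full DOM tree, then query it."""
    doc = xml.dom.minidom.parseString(data)
    return len(doc.getElementsByTagName("item"))


if __name__ == "__main__":
    print(count_items_events(XML_DOC), count_items_tree(XML_DOC))
```

Both routes give the same answer; the event-based one avoids allocating a tree, which is the kind of overhead that tends to matter once parsing dominates the workload. Any actual speed difference would have to be measured on real data.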

Robin Berjon <robin.berjon@expway.fr>
Research Engineer, Expway        http://expway.fr/
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488

