Seems to me that the bottleneck is not I/O or the network, for then people
would be talking about compression. The bottleneck seems to be parsing the
XML back into a set of usable objects. Apart from reasonably standard
forms such as SAX processing and DOM representations, there are various
other forms of storing deserialized XML. Many of these use SAX underneath,
so if this is your bottleneck, then persist a binary form of the events
given - say as a vector of event objects - and reuse this. Perhaps
persisting a binary form of a DOM tree is what you want.
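As a rough illustration of the "vector of event objects" idea, here is a minimal sketch in Python (the Recorder class, the event-tuple layout, and the replay() helper are illustrative assumptions of mine, not a standard API): parse once with SAX, persist the event list in binary form, and later replay it without re-parsing the XML text.

```python
import pickle
import xml.sax
from io import BytesIO

class Recorder(xml.sax.ContentHandler):
    """Capture SAX callbacks as simple (kind, name, payload) tuples."""
    def __init__(self):
        self.events = []
    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs)))
    def endElement(self, name):
        self.events.append(("end", name, None))
    def characters(self, content):
        self.events.append(("text", None, content))

def record(xml_bytes):
    """Parse the document once, returning the recorded event vector."""
    handler = Recorder()
    xml.sax.parse(BytesIO(xml_bytes), handler)
    return handler.events

def replay(events, handler):
    """Feed stored events back into any handler, skipping the parser entirely."""
    for kind, name, payload in events:
        if kind == "start":
            handler.startElement(name, payload)
        elif kind == "end":
            handler.endElement(name)
        else:
            handler.characters(payload)

events = record(b"<order><item>pizza</item></order>")
blob = pickle.dumps(events)       # the persisted binary form of the events
restored = pickle.loads(blob)     # later: deserialize instead of re-parsing
texts = [p for k, _, p in restored if k == "text"]
print(texts)   # ['pizza']
```

The point is that the expensive step (tokenizing angle brackets) happens once; every subsequent consumer works from the deserialized event vector via replay().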
In the end, I don't think the argument is about a binary form of XML; it is
about standard binary forms of deserialized XML. The problem with that is
that you aren't talking about XML anymore :-)
----- Original Message -----
From: "Robin Berjon" <firstname.lastname@example.org>
To: "Mike Champion" <email@example.com>
Sent: Thursday, February 27, 2003 7:12 AM
Subject: Re: [xml-dev] Use cases for parsing efficiency (was Re: [xml-dev]
Parsing efficiency? - why not 'compile'????)
> Mike Champion wrote:
> > As a matter of fact, until a few
> > months ago I was as much a scoffer at the arguments that Al and Robin
> > raise as any of you.
> As was I until I stopped thinking that people that used XML in the areas
> where binary infosets are needed were doing something stupid or evil, and
> started looking at some real life use cases. If in a system XML works great but
> fails on one or two points, it's better to address those points than to throw
> out the baby with the angle brackets.
> > My day job colleagues changed my mind by pointing out that in
> > industrial-strength, native XML processing environments, nothing much
> > is happening besides XML being parsed, processed (stored, queried,
> > transformed) and serialized again. (...) I've heard the same
> > thing from industrial-strength SOAP developers -- as the volume of
> > messages goes up and processing resources get dedicated to XML (i.e., no
> > application logic or DB access happening on the machine parsing,
> > processing, serializing the XML), then the bottlenecks in XML parsing
> > become increasingly apparent.
> If you have any more or less detailed stories/numbers/examples I'd be glad to
> have them (offlist) to see if they bring up points we haven't covered yet or
> corroborate our feedback and experience with binary SOAP.
> > So why should you all care about standardization of processing pipelines
> > that are generally *internal* to products?
> Because they're not necessarily internal :) What happens if you want to plug
> high-performance SOAP implementations together that both use different binary
> infosets? What do standard bodies that include SOAP in their specs and want to
> use binfosets because they are targeting a variety of platforms, some of them
> constrained, use as their format? An audio-video MPEG-7 stream contains
> tons of metadata (originally XML); how does my SemWeb agent use that to order
> pizza when the finale starts so that I have it right when the film is over?
> Binfosets are considered for MMS. That's not very internal :) etc., etc.
> > I'm not completely sure you
> > should. One might argue that you as customers of / developers for
> > enterprise-class XML processing software may wish to tap into the
> > pipelines at a lower level, e.g. grab the rawest Infoset data out of a
> > DBMS before it gets sanitized and standardized by the API level
> If what you want is really high speed processing then it's likely you'll want to
> do that. We have a low-level API (SAXt) and high-level APIs
> (typically SAX), and the speed difference is very much noticeable.
> Robin Berjon <firstname.lastname@example.org>
> Research Engineer, Expway http://expway.fr/
> 7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> The list archives are at http://lists.xml.org/archives/xml-dev/
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>