RE: "Binary XML" proposals

On Tue, 10 Apr 2001, Danny Ayers wrote:

> Put this way XML parsing does sound complex - but all you're really doing is
> matching a series of single characters and changing the state of some
> variables accordingly, storing some of the characters as needed. When a
> certain state is reached you call a handler. In processing terms it doesn't
> really have to do that much - certainly in the same league as Oleg's parser.

The main bottleneck my XML-parsing friend here (who writes XML handling
code for a Gnomeish project) complains about in practice is that the
parser has to malloc a lot of small strings to hold element and attribute
names, on which he then has to do lots of string compares. He talks of
the parser using a string table: whenever it encounters a string, it
reuses the copy already in the table if there is one. From his feedback,
I designed the symbols system in my current semi-proposal so that
element, attribute, and namespace names are declared only once and
referenced by integers thereafter. Not only is that a simple form of
compression, it also simplifies string handling in the parser, so a fast
parser is easier to write because the names arrive already compacted.
This moves work from the reader to whatever creates the files, but in
most cases things will be written once and read many times, and in the
1:1 case (e.g., a SOAP message) it makes no real difference.
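
To make the declare-once idea concrete, here is a rough sketch in Python.
Everything in it is an assumption made for illustration: the record
tuples ("declare", "start", "end"), the SymbolWriter class, and the read
function are hypothetical and are not the encoding from the semi-proposal.
It only shows the writer doing the name bookkeeping so that the reader can
resolve names by integer index.

```python
# Hypothetical sketch: record layout and names are illustrative only.

class SymbolWriter:
    """Declares each distinct name once, then refers to it by integer."""

    def __init__(self):
        self._ids = {}      # name -> integer symbol id
        self.records = []   # output stream, here just a list of tuples

    def _symbol(self, name):
        sym = self._ids.get(name)
        if sym is None:
            # First sighting: assign the next id and emit a declaration.
            sym = len(self._ids)
            self._ids[name] = sym
            self.records.append(("declare", sym, name))
        return sym

    def start_element(self, name, attrs=None):
        elem = self._symbol(name)
        pairs = [(self._symbol(k), v) for k, v in (attrs or {}).items()]
        self.records.append(("start", elem, pairs))

    def end_element(self):
        self.records.append(("end",))


def read(records):
    """Turn records back into events; names are looked up by index."""
    table = []    # symbol id -> name, filled in by "declare" records
    events = []
    for rec in records:
        if rec[0] == "declare":
            _, sym, name = rec
            if sym != len(table):
                raise ValueError("symbols must be declared in order")
            table.append(name)
        elif rec[0] == "start":
            _, elem, pairs = rec
            attrs = {table[k]: v for k, v in pairs}
            events.append(("start", table[elem], attrs))
        else:
            events.append(("end",))
    return events
```

With two "item" elements that each carry an "id" attribute, the writer
emits only two declarations, and the reader hands back the same string
object for both element names, so a consumer could compare integers (or
identities) instead of doing string compares.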

> So don't be so hard on your /dev/hands, Oleg!

I suspect that XML::Parser may have been quite finely tuned over the past
year or two :-)


                               Alaric B. Snell
 http://www.alaric-snell.com/  http://RFC.net/  http://www.warhead.org.uk/
   Any sufficiently advanced technology can be emulated in software