RE: "Binary XML" proposals
- From: Danny Ayers <danny@panlanka.net>
- To: Al Snell <alaric@alaric-snell.com>
- Date: Tue, 10 Apr 2001 22:30:32 +0600
<- The main bottleneck my XML-parsing friend here (who writes XML handling
<- code for a Gnomeish project) complains about in practice is that the
<- parser has to malloc a lot of small strings to contain element and
<- attribute names, which he then has to do lots of string compares on; he
<- talks of the parser using a string table, and whenever it encounters a
<- string, reusing the copy from the string table if possible.
That's not a bad idea at all - I'm just about to revise a parser I wrote a
while ago, and I think I'll borrow that ;-)
I'm strictly Java (I'd rather not have a load of mallocs on my mind), but
there *might* be a neat approach to the string cache/compare using
String.intern(), thank you!
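A rough sketch of what I have in mind, not taken from any existing parser: the class name NameTable and its methods are made up for illustration. It keeps one shared String per distinct name, so later comparisons can be done by reference instead of character by character. String.intern() gives the same guarantee using the JVM's own pool.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical name cache for a parser: one shared String per distinct name. */
class NameTable {
    private final Map<String, String> table = new HashMap<>();

    /** Returns the canonical copy of name, storing it on first sight. */
    String canonical(String name) {
        String existing = table.putIfAbsent(name, name);
        return existing != null ? existing : name;
    }

    /** Number of distinct names seen so far. */
    int size() {
        return table.size();
    }

    /** Same idea using the JVM-wide pool instead of a private table. */
    static String pooled(String name) {
        return name.intern();
    }
}
```

Once every element and attribute name has gone through canonical() (or intern()), two names are equal exactly when they are the same object, so `a == b` is enough. A private table has the advantage that it can be thrown away with the parser, whereas interned strings live in a pool shared by the whole JVM.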
<- From his feedback, I designed the symbols system in my current
<- semi-proposal, such that element / attribute / namespace names are
<- declared once only and then
<- referenced by integers thereafter. Not only is that a simple form of
<- compression, it'll make it easier in the parser in string handling,
<- meaning it's easier to write a faster parser because the strings are
<- already compacted. This puts the work into the thing that creates the
<- files and away from the reader, but in most cases things will be written
<- once and then read many times, and in the 1:1 case (eg, a SOAP message)
<- it makes no real difference.
Sounds like a good avenue. Though you really ought to do some profiling -
unless amusement keeps you happy.
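For what it's worth, here is how I picture the declare-once, reference-by-integer idea quoted above. This is only a sketch under my own assumptions: the class SymbolTable and its methods are hypothetical and say nothing about the actual layout of your semi-proposal. The writer hands out the next free integer the first time it meets a name, and the reader resolves an integer with a plain list lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical symbol table: each name is declared once, then referred to by id. */
class SymbolTable {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    /** Writer side: returns the id for name, declaring it on first use. */
    int idFor(String name) {
        Integer id = ids.get(name);
        if (id == null) {
            id = names.size();      // next free id
            ids.put(name, id);
            names.add(name);        // position in the list equals the id
        }
        return id;
    }

    /** Reader side: resolves an id back to the declared name. */
    String nameFor(int id) {
        return names.get(id);
    }

    /** Number of names declared so far. */
    int size() {
        return names.size();
    }
}
```

The hash lookup happens only on the writing side. The reader never compares strings at all, which fits the point that the work moves to whatever creates the file.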
<- > So don't be so hard on your /dev/hands, Oleg!
<-
<- I suspect that XML::Parser may have been quite finely tuned over the past
<- year or two :-)
Rather a good point.
<- Any sufficiently advanced technology can be emulated in software
BTW, I like this one ;-) I wasted a lot of time trying to twist it myself
(Mr. Clarke doesn't live all that far away) - best I came up with was "Any
sufficiently advanced technology is indistinguishable from butter."