
 



RE: "Binary XML" proposals




<- The main bottleneck my XML-parsing friend here (who writes XML handling
<- code for a Gnomeish project) complains about in practice is that the
<- parser has to malloc a lot of small strings to contain element and
<- attribute names, which he then has to do lots of string compares on; he
<- talks of the parser using a string table, and whenever it encounters a
<- string, reusing the copy from the string table if possible.

That's not a bad idea at all - I'm just about to revise a parser I wrote a
while ago, and I think I'll borrow that ;-)
I'm strictly Java (I'd rather not have a load of mallocs on my mind), but
there *might* be a neat approach to the string cache/compare using
String.intern(), thank you!
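
A rough sketch of what I have in mind, untested against any real parser. The
class and method names (NameCache, canonical, sameName) are made up for
illustration; only String.intern() itself is a real JDK call. The idea is that
every element or attribute name goes through intern() once, so later
comparisons can be reference checks instead of character-by-character ones.

```java
// Hypothetical helper for illustration; not part of any existing parser API.
final class NameCache {
    private NameCache() {}

    // Build a name from the parser's character buffer and return the
    // canonical pooled copy, so equal names share a single String instance.
    static String canonical(char[] buf, int start, int len) {
        return new String(buf, start, len).intern();
    }

    // Reference comparison. Only valid when both arguments are interned
    // (returned by canonical(), or compile-time string literals).
    static boolean sameName(String a, String b) {
        return a == b;
    }
}
```

Whether this actually beats equals() on short names is something only
measurement will tell, and the temporary String is still allocated before it
is interned, so it addresses the compares more than the allocations.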


<- From his feedback, I designed the symbols system in my current
<- semi-proposal, such that element / attribute / namespace names are
<- declared once only and then referenced by integers thereafter. Not only
<- is that a simple form of compression, it'll make it easier in the parser
<- in string handling, meaning it's easier to write a faster parser because
<- the strings are already compacted. This puts the work into the thing
<- that creates the files and away from the reader, but in most cases things
<- will be written once and then read many times, and in the 1:1 case (eg, a
<- SOAP message) it makes no real difference.

Sounds like a good avenue. Though you really ought to do some profiling -
unless amusement keeps you happy.
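
For what it's worth, here is how I picture the declare-once, reference-by-integer
scheme. This is my own guess at the shape of it, not the format in your
semi-proposal; SymbolTable, idFor and nameFor are names I invented. The writer
pays for a hash lookup per name, while the reader only does an indexed fetch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical symbol table; an assumption about how such a scheme might look.
final class SymbolTable {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    // Writer side: return the id already assigned to this name, or declare
    // the name and hand out the next free integer.
    int idFor(String name) {
        Integer id = ids.get(name);
        if (id == null) {
            id = names.size();
            ids.put(name, id);
            names.add(name);
        }
        return id;
    }

    // Reader side: resolve an integer reference with a plain indexed lookup,
    // with no string comparison involved.
    String nameFor(int id) {
        return names.get(id);
    }

    int size() {
        return names.size();
    }
}
```

A reader would fill the list from the declarations as they arrive in the file
and then call nameFor() for each reference, so every distinct name is
allocated exactly once.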

<- > So don't be so hard on your /dev/hands, Oleg!
<-
<- I suspect that XML::Parser may have been quite finely tuned over the past
<- year or two :-)

Rather a good point.



   Any sufficiently advanced technology can be emulated in software

BTW, I like this one ;-) I wasted a lot of time trying to twist it myself
(Mr. Clarke doesn't live all that far away) - best I came up with was "Any
sufficiently advanced technology is indistinguishable from butter."