OASIS Mailing List Archives

 


Re: [xml-dev] XML parsing @ 100MB-1000MB/sec/GHz with Parallel Bit Streams

Thanks for that report, John.

For research purposes, my present work with parabix is focused
entirely on CPU time for parsing once the data is available. This is
where the parallel bit stream methods make a difference. Of course,
the I/O will have to be optimized at some point.
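
The distinction between CPU time and elapsed time can be measured directly. The following is only a rough sketch, not code from parabix: `measure` and `count_open_angles` are hypothetical names, and counting `<` bytes merely stands in for real parsing work. Note that Python's `time.process_time` reports user plus system CPU time for the process, which is close to, but not the same as, the "user time" figure quoted below.

```python
import time


def measure(func, *args):
    """Call func(*args) once and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def count_open_angles(data: bytes) -> int:
    """Hypothetical stand-in for parsing: count '<' bytes already in memory."""
    return data.count(b"<")


if __name__ == "__main__":
    sample = b"<a><b>text</b></a>" * 100_000
    count, wall, cpu = measure(count_open_angles, sample)
    print(f"count={count} wall={wall:.4f}s cpu={cpu:.4f}s")
```

When the input has to come from disk, the wall-clock figure also includes time spent waiting on I/O, while the CPU figure largely does not.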

On the memory usage front, we are presently using a big slurp
(reading the whole file into memory at once) to ease the research
work. However, much of the design is organized around a streaming
model.
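
To illustrate the trade-off in general terms, here is a minimal sketch contrasting a slurp with a chunked streaming read. It is an assumption-laden illustration, not how parabix is implemented: `count_slurp`, `count_streaming`, and the 64 KiB chunk size are all hypothetical, and counting `<` bytes again stands in for parsing.

```python
import io


def count_slurp(stream) -> int:
    """Read the entire input at once; memory use grows with document size."""
    data = stream.read()
    return data.count(b"<")


def count_streaming(stream, chunk_size: int = 64 * 1024) -> int:
    """Read fixed-size chunks; memory use is bounded by chunk_size."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object signals end of input
            break
        total += chunk.count(b"<")
    return total


if __name__ == "__main__":
    doc = b"<a><b>text</b></a>" * 1000
    print(count_slurp(io.BytesIO(doc)), count_streaming(io.BytesIO(doc), 7))
```

Counting single bytes is insensitive to where chunk boundaries fall. A real streaming parser would also need to carry state across boundaries, since a tag or a multi-byte character can be split between two chunks.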

> Hi Rob,
> 
> Those are very impressive figures! I downloaded your parser and did a
> quick test to compare it to Expat parsing a 1.1Gb XML file:
> 
> Expat: 21.0s (wallclock), 18.2s (user time)
> Parabix: 21.7s (wallclock), 4.0s (user time)
> 
> I used the "markup_stats" program that came with Parabix. Clearly
> Parabix is spending less time with heavy CPU load (user time), but it
> still takes longer to parse when disk IO is included (wallclock time).
> Parabix also seems to take far more memory than Expat - proportional to
> the size of the document?
> 
> Can the IO and memory usage in "markup_stats" be improved, or is this an
> intrinsic problem with your approach to XML parsing?
> 
> John
> 
-- 
Robert D. Cameron, Ph.D.
Professor of Computing Science
Simon Fraser University



