Re: [xml-dev] No XML Binaries? Buy Hardware
- From: noah_mendelsohn@us.ibm.com
- To: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Fri, 23 Feb 2007 16:31:39 -0500
Elliotte Harold writes:
> I don't think we've hit the limits of parser performance yet,
I think that asking "what are the theoretical limits" is very important,
and it's not a question I've seen discussed often enough. From our paper
[1]:
> No parser can process input faster than its supporting hardware
> accesses data, but the additional cost of parsing and
> validation should be minimized. On a 1 GHz Pentium processor a
> simple character-scanning loop runs at about 100 Mbytes/second,
> which is 10 cycles/byte.
> [..]
> On the tests reported in this paper, using the business object
> API typical of Web Services applications, XML Screamer parses
> and schema-validates XML at between 23 and 46 Mbytes/sec/GHz;
> XML Screamer can thus process XML at speeds of roughly 100–200
> Mbytes/sec on the 4 GHz processors now becoming available.
> [...]
> Using its business object APIs, XML Screamer scans, parses,
> validates and deserializes at between 22% and 44% of the tested
> processor's raw character scanning speed. Except insofar as
> ways can be found to use such processors more efficiently, e.g.
> by exploiting hardware string test instructions or on chip SIMD
> accelerators, gains from further tuning or alternative
> approaches are likely to be modest. XML Screamer's performance
> is probably not far from the maximum achievable.
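The figures in the excerpt can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are made up here, and it assumes "Mbytes" means 10^6 bytes and that throughput scales linearly with clock rate, as the excerpt's own projection does.

```python
# Illustrative arithmetic only; function names are hypothetical.
# Assumes 1 Mbyte = 10**6 bytes and linear scaling with clock rate.

def cycles_per_byte(clock_hz, bytes_per_sec):
    """Processor cycles spent per input byte at a given throughput."""
    return clock_hz / bytes_per_sec

def projected_mbytes_per_sec(mbytes_per_sec_per_ghz, clock_ghz):
    """Throughput at a given clock rate, assuming linear scaling."""
    return mbytes_per_sec_per_ghz * clock_ghz

# Raw character scanning: 100 Mbytes/sec on a 1 GHz processor.
scan_cost = cycles_per_byte(1e9, 100e6)      # 10 cycles/byte

# Parsing and validating at 23 and 46 Mbytes/sec/GHz.
slow_cost = cycles_per_byte(1e9, 23e6)       # about 43.5 cycles/byte
fast_cost = cycles_per_byte(1e9, 46e6)       # about 21.7 cycles/byte

# Projection to a 4 GHz processor.
low = projected_mbytes_per_sec(23, 4)        # 92 Mbytes/sec
high = projected_mbytes_per_sec(46, 4)       # 184 Mbytes/sec

print(scan_cost, round(slow_cost, 1), round(fast_cost, 1), low, high)
```

Read this way, the quoted rates correspond to roughly 22 to 44 cycles per byte for parsing plus validation, against 10 cycles per byte for the bare scanning loop.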
In short, we observed that to check well-formedness, a parser must at
least touch each input character. You can benchmark various
processor/memory combinations using their most optimized forms of
character and string comparison and find out how fast they can inspect
each byte of an input buffer, doing the sorts of character comparisons
necessary for well-formedness checking. There may be ways to do better
than we did on particular processors, but I think it's interesting that
one can set a pretty good bound on how fast XML processing can go.
Furthermore, I think our work shows that it is possible to get not too far
from that bound, for some definition of "not too far" :-).
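For anyone who wants to try the kind of measurement described above, here is a minimal sketch of the method, not the benchmark from the paper. The function names and the choice of '<', '>' and '&' as the characters to test are assumptions made for illustration. An interpreted Python loop runs far slower than an optimized native one, so the number it reports says something about the interpreter rather than the hardware bound; the shape of the measurement is the point.

```python
import time

def scan_markup_bytes(buf: bytes) -> int:
    """Touch every byte once, counting '<', '>' and '&'."""
    count = 0
    for b in buf:
        if b == 0x3C or b == 0x3E or b == 0x26:
            count += 1
    return count

def measure_mbytes_per_sec(buf: bytes, repeats: int = 3) -> float:
    """Best-of-N scan throughput in Mbytes/sec (1 Mbyte = 10**6 bytes)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        scan_markup_bytes(buf)
        best = min(best, time.perf_counter() - start)
    return len(buf) / best / 1e6

if __name__ == "__main__":
    sample = b"<item id='1'>text &amp; more text</item>" * 50_000
    print(f"{measure_mbytes_per_sec(sample):.1f} Mbytes/sec")
```

Dividing the processor's clock rate by the measured bytes per second gives cycles per byte, which is the figure to compare across processor/memory combinations.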
Noah
[1] http://www2006.org/programme/item.php?id=5011
--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------