   Re: [xml-dev] XML Performance in a Transaction


On Mar 23, 2006, at 7:41 PM, Rick Jelliffe wrote:

> Michael Kay wrote:
>
>>> My expectation is that XML parsing can be significantly sped up
>>> with ...
>>
>> I think that UTF-8 decoding is often the bottleneck and the obvious
>> way to speed that up is to write the whole thing in assembler. I
>> suspect the only way of getting a significant improvement (i.e. more
>> than a doubling) in parser speed is to get closer to the hardware.
>> I'm surprised no-one has done it. Perhaps no-one knows how to write
>> assembler any more (or perhaps, like me, they just don't enjoy it).
>>
> Yes. The technique using C++ intrinsics (which is assembler in
> disguise) I gave in my blog (URL in previous post) gives a *four to
> five* times speed increase compared to fairly tight C++ code, for the
> libxml UTF-8 to UTF-16 transcoder, for ASCII valued data.

In bnux binary XML, UTF-8 transcoding to Java strings typically
accounts for about 20-50% of parsing time at an overall throughput of
50-400 MB/s [1]. This is despite the conversion routines being highly
optimized, taking full advantage of pure or partial ASCII valued data,
similar in spirit to the technique your blog mentions (except that
it's in Java). I do have some hope, though, that future VMs with
better dynamic optimization logic for memory prefetching, bulk
operations, etc. could make more of a difference here. Care to explain
why a dynamic optimizer couldn't get close to what those handcoded
assembler routines do, particularly considering modern memory
latencies?
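
Roughly, the fast path amounts to something like the following (a
simplified Java sketch, not the actual bnux code; class and method
names are made up): runs of 7-bit ASCII are widened to chars in a
tight loop, and only the non-ASCII remainder is handed off to a
general-purpose decoder.

    import java.nio.charset.StandardCharsets;

    // Simplified illustration of an ASCII fast path for UTF-8 -> UTF-16
    // decoding; not the actual bnux/Nux code.
    public final class Utf8FastPath {

        public static char[] decode(byte[] src, int off, int len) {
            // UTF-16 code unit count never exceeds the UTF-8 byte count.
            char[] dst = new char[len];
            int i = off;
            int end = off + len;
            int j = 0;

            // Fast path: 7-bit ASCII bytes have the high bit clear, so a
            // simple sign check suffices and each byte widens 1:1 to a char.
            while (i < end && src[i] >= 0) {
                dst[j++] = (char) src[i++];
            }

            // Slow path: let the JDK handle multi-byte sequences for the rest.
            if (i < end) {
                String tail = new String(src, i, end - i, StandardCharsets.UTF_8);
                tail.getChars(0, tail.length(), dst, j);
                j += tail.length();
            }
            return j == dst.length ? dst : java.util.Arrays.copyOf(dst, j);
        }
    }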

On the standard textual XML front: as has been noted, Xerces and
Woodstox can be made to run quite fast, but in practice few people
know how to configure them accordingly, reliably, and without
conformance compromises.
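
For illustration only, this is the kind of knob involved (a minimal
sketch using standard JAXP/SAX calls plus one well-known Xerces
feature URI; whether any given setting is acceptable depends on the
application's conformance requirements):

    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.XMLReader;

    public final class ParserConfig {

        public static XMLReader newConfiguredReader() throws Exception {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false); // no DTD validation

            XMLReader reader = factory.newSAXParser().getXMLReader();
            // Xerces-specific: skip fetching external DTDs in non-validating
            // mode. Faster, but changes behavior for documents that rely on
            // DTD-defined entities or defaulted attributes.
            reader.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd",
                false);
            return reader;
        }
    }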

Overall, configuring textual XML toolkits to reach high levels of
performance often requires substantial time and expertise. For
example, to minimize startup and initialization time for each test,
the same parser/serializer object instance should be reused (pooled),
at least in the presence of many small messages. In our experience,
most textual XML toolkits perform poorly out-of-the-box. Thus, we
expect most real-world applications to perform significantly worse
than shown in our experiments, perhaps dramatically so. Real observed
performance is not only a function of capability, but also of
accessibility. Most users would be better off if XML parsers performed
well out-of-the-box, or were self-tuning. Most users can't afford to
study the complex reliability vs. performance interactions of myriad
more or less static tuning knobs.
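
As a minimal sketch of the reuse idea (plain SAX, not our benchmark
harness; the pool size and handler wiring are illustrative choices):
build the factory and readers once, keep them in a small pool, and
reparse many small messages with the same instances instead of paying
factory lookup and parser construction per message.

    import java.io.StringReader;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;

    public final class PooledParser {

        private final BlockingQueue<XMLReader> pool;

        public PooledParser(int size) throws Exception {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            pool = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                pool.add(factory.newSAXParser().getXMLReader());
            }
        }

        /** Parses one message, reusing a pooled reader. */
        public void parse(String xml, DefaultHandler handler) throws Exception {
            XMLReader reader = pool.take();          // borrow
            try {
                reader.setContentHandler(handler);
                reader.setErrorHandler(handler);
                reader.parse(new InputSource(new StringReader(xml)));
            } finally {
                pool.put(reader);                    // return
            }
        }
    }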

[1] http://www.gridforum.org/GGF15/presentations/wsPerform_hoschek.pdf

Wolfgang.
