OASIS Mailing List Archives



   Re: [xml-dev] Fast text output from SAX?

Elliotte Rusty Harold wrote:

>At 2:48 PM -0700 4/13/04, Dennis Sosnoski wrote:
>>Obviously I can't test every possible internal form that might be 
>>used by an application. However, the vast majority of XML document 
>>processing in Java is currently built on the event streams produced 
>>by SAX parsers. Any general XML format should be convertible to and 
>>from an event stream of this type, and in practice that's the way 
>>any alternative general formats are likely to be used (at least in 
>>the near term).
>That doesn't sound at all plausible to me. If I have a specific 
>internal data structure that I wish to convert to XML, I would never 
>go through SAX. If I really didn't care about performance I might go 
>through XOM or JDOM, but if I cared about performance I'd just dump 
>out the strings or bytes as seemed appropriate. While certainly a few 
>people are using the SAX API to drive output, it's hardly a common 
>thing to do, nor is it at all necessary. I just can't see how the 
>task you want to benchmark corresponds to how XML is used.

It's true that going through a SAX2 event stream is more likely on input 
than on output, and certainly in some cases developers write output text 
directly (though this causes all the obvious problems with the need 
for converting characters to entities, and in my experience sooner or 
later tends to be replaced by an XML-specific API). I'd say, though, 
that at this point most developers are using either some form of 
document model or data binding to work with their XML data. I know 
from past performance tests that text output generation from both 
document models and data binding is relatively slow. In earlier 
comparisons of output from JDOM and dom4j, writing the XMLS format 
was about twice as fast as using the text output generation built 
into the document models, and text output from JAXB is even slower 
(averaging about half the speed of dom4j in my tests).
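
To make the entity-conversion problem with direct text output concrete, here is a minimal sketch of the escaping a hand-rolled writer has to do. The class and method names (DirectTextOutput, escapeText, escapeAttribute, element) are hypothetical, made up for illustration, and the sketch deliberately ignores character encoding and characters that are illegal in XML:

```java
public class DirectTextOutput {

    // Escape the characters that cannot appear literally in element content.
    public static String escapeText(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Attribute values (double-quoted here) also need the quote escaped.
    public static String escapeAttribute(String s) {
        return escapeText(s).replace("\"", "&quot;");
    }

    // Hypothetical helper: write one element with one attribute as text.
    public static String element(String name, String attName,
                                 String attValue, String text) {
        return "<" + name + " " + attName + "=\""
            + escapeAttribute(attValue) + "\">"
            + escapeText(text) + "</" + name + ">";
    }
}
```

Forgetting any one of these calls gives output that is not well-formed, which is the usual reason this kind of code ends up replaced by an XML-specific API.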

The approach of using the identity XSLT transform through JAXP looks 
good for performance (considerably better than XMLWriter), so that's 
what I'll be using in my comparisons. I suspect it'll turn out that the 
performance penalty for text using this approach is roughly on a par 
with what I saw with the document models.
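
For reference, a rough sketch of the identity-transform approach, assuming the JAXP implementation in use supports the SAX feature (the cast below fails otherwise). The class name SaxTextOutput and the sample element are made up for illustration, and this is not the actual benchmark code:

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

public class SaxTextOutput {

    public static String serialize() throws Exception {
        // Assumption: the default factory is a SAXTransformerFactory.
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();

        // With no stylesheet, the handler performs an identity transform.
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer()
               .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // Feed SAX2 events by hand; the serializer does the escaping.
        handler.startDocument();
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "id", "id", "CDATA", "42");
        handler.startElement("", "item", "item", atts);
        char[] text = "a < b & c".toCharArray();
        handler.characters(text, 0, text.length);
        handler.endElement("", "item", "item");
        handler.endDocument();

        return out.toString();
    }
}
```

Since TransformerHandler is a ContentHandler, the same handler can be set directly on an XMLReader to serialize a parsed event stream rather than hand-generated events.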

  - Dennis

Dennis M. Sosnoski
Enterprise Java, XML, and Web Services
Training and Consulting
Redmond, WA  425.885.7197



Copyright 2001 XML.org. This site is hosted by OASIS