OASIS Mailing List Archives

   Re: [xml-dev] Compiled XML

Glad to see such an, err, "enthusiastic" response. As the web page says, 
I'd intended to update this long ago. I've been sidetracked but will try 
to get back to it later this month, when I want to compare document size 
and processing speed for collections of documents using a common schema. 
I'll also try to find the fastest available SAX2 parser to use as an 
input-only comparison.

I'd suggest you don't waste time trying Java serialized versions of DOM 
- the results are horrible. You can see some at the bottom of my 
document models benchmarks page, at 
http://www.sosnoski.com/opensrc/xmlbench/results.html. The main problem 
is that all the document representations (DOM, JDOM, dom4j, etc.) are 
tree structures of generally small objects, while Java serialization is 
optimized for graph structures. It uses (fairly large) handles for each 
object, and actually includes the handles in the encoding (as opposed to 
just making the values sequential and implicit). This adds a lot of 
bloat - Java serialized Xerces DOM ran about twice the size of the text 
documents in the tests I've run.
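
To make the size overhead concrete, here is a minimal sketch in Java. It 
is not the benchmark code behind the results page: the Node class, 
buildTree, and the sample element names are hypothetical stand-ins for a 
real document model, and the numbers it prints will differ from those for 
Xerces DOM. It only shows how one might compare the text size of a small 
tree with the size of its Java serialized form.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SerializedTreeSize {

    // Hypothetical minimal element node; real DOM nodes carry many more fields.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String text; // null for elements without text content
        final List<Node> children = new ArrayList<>();

        Node(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    // Builds a shallow tree of many small objects, as document models tend to be.
    static Node buildTree(int items) {
        Node root = new Node("list", null);
        for (int i = 0; i < items; i++) {
            Node item = new Node("item", null);
            item.children.add(new Node("id", Integer.toString(i)));
            item.children.add(new Node("name", "entry" + i));
            root.children.add(item);
        }
        return root;
    }

    // Writes the tree as plain XML text (no escaping; the sample data needs none).
    static void writeXml(Node n, StringBuilder out) {
        out.append('<').append(n.name).append('>');
        if (n.text != null) {
            out.append(n.text);
        }
        for (Node c : n.children) {
            writeXml(c, out);
        }
        out.append("</").append(n.name).append('>');
    }

    // Size in bytes of the UTF-8 text form.
    static int textSize(Node root) {
        StringBuilder sb = new StringBuilder();
        writeXml(root, sb);
        return sb.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    // Size in bytes of the standard Java serialized form.
    static int serializedSize(Node root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(root);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Node root = buildTree(1000);
        System.out.println("text bytes:       " + textSize(root));
        System.out.println("serialized bytes: " + serializedSize(root));
    }
}
```

Each small object in the tree is written with its own per-object markers 
and references, which is where the extra bytes come from in this sketch.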

  - Dennis

Alaric Snell wrote:

> - uuugghh, I just ejaculated (sorry, ladies)!
>That's the kind of experiment I was planning to perform this weekend, and the 
>kinds of results I imagined getting.
>The only difference is that I'd introduce gzipped versions of the text, 
>serialised DOM tree, and XMLS data, including the time taken to deflate and 
>inflate the data. Just since people keep raising gzipped text.
>I'll try and do that this weekend...
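
For the gzip comparison described above, a rough sketch of measuring 
compressed size along with deflate and inflate time might look like the 
following. The sampleXml helper and its element names are made up for 
illustration, and a single System.nanoTime measurement is only a crude 
estimate; a real benchmark would warm up the JVM and average many runs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipRoundTrip {

    // Compresses a byte array with gzip (deflate).
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    // Decompresses a gzip byte array (inflate).
    static byte[] gunzip(byte[] packed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Hypothetical sample document: repetitive markup, as in many XML files.
    static byte[] sampleXml(int items) {
        StringBuilder sb = new StringBuilder("<list>");
        for (int i = 0; i < items; i++) {
            sb.append("<item><id>").append(i)
              .append("</id><name>entry").append(i).append("</name></item>");
        }
        sb.append("</list>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] text = sampleXml(10000);

        long t0 = System.nanoTime();
        byte[] packed = gzip(text);
        long t1 = System.nanoTime();
        byte[] unpacked = gunzip(packed);
        long t2 = System.nanoTime();

        System.out.println("text bytes:    " + text.length);
        System.out.println("gzipped bytes: " + packed.length);
        System.out.println("deflate ms:    " + (t1 - t0) / 1_000_000.0);
        System.out.println("inflate ms:    " + (t2 - t1) / 1_000_000.0);
        System.out.println("round trip ok: " + (unpacked.length == text.length));
    }
}
```

The same gzip and gunzip helpers can be applied to any other byte 
encoding of the document, such as a serialized tree, so that all the 
formats are measured the same way.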


Copyright 2001 XML.org. This site is hosted by OASIS