Rick Marshall wrote:
> actually modern compressors don't know about the representation and
> don't much mind. they actually work on the entropy of the message and
> the message as a bit stream - ie they don't know there are tags, ascii
> data, binary data, schema etc. there's not room to go into it here but
> they will compress a message fairly consistently based on the entropy of
> the message, not the representation. different algorithms are marginally
> better than others (bzip2 vs gzip eg), but seem to give proportionally
> similar results.
There have been a number of compressors designed specifically for XML,
though, that take advantage of the knowledge of the structure of XML
documents to gain some benefits in compression compared to generic
algorithms. AT&T Labs' XMill is one example:
http://www.research.att.com/sw/tools/xmill/
Personally, though, I doubt even the best of these justify the added
cost and complexity compared to a generic algorithm like deflate.
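
For anyone who wants to check the generic-compressor side of this on their
own data, here is a rough sketch in Python using only the standard library.
The sample document, the helper names (build_sample_xml, compressed_sizes)
and the record count are all made up for illustration, and zlib.compress
emits a zlib-wrapped deflate stream rather than raw deflate. Ratios on real
documents will differ, so treat this as a starting point and not a benchmark.

```python
import bz2
import zlib


def build_sample_xml(records: int = 500) -> bytes:
    """Build a repetitive, made-up XML document (hypothetical test data)."""
    rows = "".join(
        f'<item id="{i}"><name>widget-{i % 7}</name><qty>{i % 13}</qty></item>'
        for i in range(records)
    )
    return f'<?xml version="1.0"?><items>{rows}</items>'.encode("utf-8")


def compressed_sizes(data: bytes) -> dict:
    """Return byte counts for the raw input and two generic compressors."""
    return {
        "raw": len(data),
        # zlib-wrapped deflate at the highest compression level
        "deflate": len(zlib.compress(data, 9)),
        # bzip2 with the largest block size
        "bzip2": len(bz2.compress(data, 9)),
    }


if __name__ == "__main__":
    sizes = compressed_sizes(build_sample_xml())
    for name, size in sizes.items():
        print(f"{name:8s} {size:7d} bytes  ({size / sizes['raw']:.1%} of raw)")
```

Swapping build_sample_xml for the bytes of a real document is the more
interesting test, since synthetic data this regular flatters every algorithm.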
--
Elliotte Rusty Harold elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN%3D0596007647/cafeaulaitA/ref%3Dnosim