Alaric Snell wrote:
> On Thursday 06 February 2003 11:58 am, you wrote:
>>It doesn't necessarily (or even generally) work that way - compact
>>binary formats don't generally compress down as well as text, so you end
>>up with size(text) > size(binary) > size(compressed-binary) >
> I've always found that compressed binary is smaller than compressed text, as
> Tahir found. That makes sense logically too; both the binary and text formats
> have the same CDATA in but the binary format has more compact representations
> of the elements and so on.
That's not necessarily the case; it very much depends on the binarisation
process. The two formats need not carry the same CDATA, especially if said
CDATA is information available from a schema.
> Of course, one could design binary formats which compress badly, but I've
> never found that they do by default.
I certainly hope that future improvements to our binary format will in fact make
it compress badly :) That should happen by making it more compact than it
currently is (while keeping similar speed, which is why compression is not
always an option).
It's true, however, that binary infosets do tend to compress further. In yet
another benchmark I read yesterday, the smallest results were bin-xml+gz and
bin-xml+bz2 (well, excluding the same ones with SVG quantize codecs; lossy
compression of XML documents still scares me ;).
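
For anyone who wants to check this sort of thing on their own data, here is a
minimal sketch of such a size comparison. It is not our format or the benchmark
mentioned above: the `TAGS` token table, the toy fixed-width encoding and the
random sample records are all assumptions made up for illustration. It only
measures raw, gzip and bz2 sizes for a text serialisation and a tokenised one.

```python
import bz2
import gzip
import random
import struct

# Hypothetical tag-to-token table; a real binary XML format would typically
# derive something like this from a schema or build it on the fly.
TAGS = {"point": 1, "x": 2, "y": 3}


def make_records(n, seed=0):
    """Generate n (x, y) pairs of small integers as sample data."""
    rng = random.Random(seed)
    return [(rng.randint(0, 9999), rng.randint(0, 9999)) for _ in range(n)]


def to_text_xml(records):
    """Serialise the records as plain textual XML."""
    parts = ["<points>"]
    for x, y in records:
        parts.append(f"<point><x>{x}</x><y>{y}</y></point>")
    parts.append("</points>")
    return "".join(parts).encode("utf-8")


def to_toy_binary(records):
    """Toy encoding (an assumption, not a real format): one-byte tag tokens
    and 16-bit big-endian integers instead of decimal text."""
    out = bytearray()
    for x, y in records:
        out += struct.pack(">BBHBH", TAGS["point"], TAGS["x"], x, TAGS["y"], y)
    return bytes(out)


def sizes(data):
    """Return raw, gzip and bz2 sizes in bytes for one serialisation."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data, compresslevel=9)),
        "bz2": len(bz2.compress(data)),
    }


if __name__ == "__main__":
    records = make_records(1000)
    for name, blob in (("text", to_text_xml(records)),
                       ("binary", to_toy_binary(records))):
        print(name, sizes(blob))
```

How the compressed sizes rank depends heavily on the data and on the
binarisation, which is rather the point of this thread, so run it against
documents that look like yours before drawing conclusions.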
Robin Berjon <firstname.lastname@example.org>
Research Engineer, Expway http://expway.fr/
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488