On Friday 07 February 2003 09:24, Robin Berjon wrote:
> > I've always found that compressed binary is smaller than compressed
> > text, as Tahir found. That makes sense logically too; both the binary
> > and text formats contain the same CDATA, but the binary format has more
> > compact representations of the elements and so on.
>
> That's not necessarily the case; it very much depends on the binarisation
> process. It is not necessary that both have the same CDATA, especially if
> said CDATA is information available from a schema.
Yep, but in the latter case the benefit above will still occur, plus the
extra magic of not obfuscating the patterns and skewed distributions in the
underlying data, so the compressor can get to work on them!
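(Quick illustration in Python, with made-up data and a deliberately naive
binary encoding rather than any real binarisation format: serialise the
same records as tagged XML text and as schema-implied fixed-width binary,
gzip both, and compare the eventual sizes as well as the ratios.)

    import gzip, struct

    # Made-up sample data: 1000 integer coordinate pairs.
    points = [(i, (i * 37) % 1000) for i in range(1000)]

    # Textual XML serialisation: the structure is spelled out as tags.
    text = ("<points>" +
            "".join("<p x='%d' y='%d'/>" % p for p in points) +
            "</points>").encode("ascii")

    # Naive binary serialisation: two 4-byte little-endian ints per
    # pair; the structure is implied by a (hypothetical) schema.
    binary = b"".join(struct.pack("<ii", x, y) for x, y in points)

    for label, blob in (("text", text), ("binary", binary)):
        gz = gzip.compress(blob)
        print("%-6s %6d -> %5d bytes (%.0f%% saved)" %
              (label, len(blob), len(gz), 100.0 * (1 - len(gz) / len(blob))))

The text side will usually show the better *ratio*, because tags are so
redundant; the number that matters is the one after the arrow.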
> > Of course, one could design binary formats which compress badly, but I've
> > never found that they do by default.
>
> I certainly hope that future improvements on our binary format will in fact
> make it compress badly :) That should happen by making it more compact than
> it currently is (while keeping similar speed, which is why compression is
> not always an option).
Nooo! It's not the compression *ratio* that matters here. It's the eventual
size.
If a binary encoding of 10k gzips to 9k, saving 10%, that's better than a
textual encoding of the same data at 20k gzipping to 15k, saving 25%!
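The same arithmetic in Python, using those numbers:

    for fmt, raw, gz in (("binary", 10000, 9000), ("text", 20000, 15000)):
        print("%s: %d bytes on the wire (%.0f%% saved)" %
              (fmt, gz, 100.0 * (1 - gz / raw)))
    # binary: 9000 bytes on the wire (10% saved)
    # text: 15000 bytes on the wire (25% saved)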
> It's true, however, that binary infosets do tend to compress further. In
> yet another benchmark I read yesterday, the smallest results were
> bin-xml+gz and bin-xml+bz2 (well, excluding the same ones with SVG
> quantize codecs; lossy compression of XML documents still scares me ;).
That's not compressing XML any more... it's compressing a higher-level data
model, I think! :-)
ABS
--
A city is like a large, complex, rabbit
- ARP