OASIS Mailing List Archives

   Re: [xml-dev] Compiled XML


On Wednesday 27 March 2002 12:35, you wrote:

> Sorry, I was using "gzip" in a vague way to represent modern compression
> libraries.  My assertion -- which is just my sense of
> the previous discussions, not a competent professional opinion --
> was that it's unlikely that a "binary XML" scheme would compress
> data significantly better than an off the shelf text compression algorithm.

They're orthogonal... binary storage of XML is about using fewer bits for the 
structure and for storing integers. Compression is something that can be 
applied to both textual and binary XML files! Compression is about efficient 
storage of redundant strings of bits; it's not about 'text' at all. 
Executable code compresses quite well, in fact.
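
To illustrate that a general-purpose compressor works on redundant bytes 
rather than on 'text' as such, here is a minimal sketch using Python's zlib 
module. The record layout and sample data are made up for the example; only 
the zlib and struct calls are real library API.

```python
import struct
import zlib

# Hypothetical binary data: 1000 fixed-layout records (4-byte tag,
# 32-bit integer, 16-bit integer), 10 bytes each. Not a real file format.
records = b"".join(
    struct.pack("<4sIH", b"REC\x00", i % 16, 7) for i in range(1000)
)

# Hypothetical textual data of a similar repetitive nature.
text = b"<rec><id>7</id></rec>\n" * 1000


def ratio(data: bytes) -> float:
    """Return compressed size as a fraction of the original size."""
    return len(zlib.compress(data, 9)) / len(data)


binary_ratio = ratio(records)
text_ratio = ratio(text)
print(f"binary: {binary_ratio:.3f}  text: {text_ratio:.3f}")
```

Both inputs shrink because both repeat themselves; the compressor neither 
knows nor cares which one a human would call text.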

Pedants will be eager to point out that in executable files, the executable 
code is often referred to as 'text' anyway. Yes, I know! Shut up!

Gzipping textual XML will not get you down to as small a result as gzipping 
binary XML, since the textual form carries redundancy that gzip cannot fully 
remove: gzip doesn't know that the element name in a closing tag is 
redundant, so it will faithfully record (for each and every one) that it 
should contain that string, even if it refers to the string by a sliding 
window reference. The difference between gzipped TXML and gzipped BXML will 
not be large if the underlying XML is mainly text anyway, like XHTML or 
Docbook, but it will make more of a difference if the underlying XML is 
something like XSLT or XSD or SOAP that's mainly elements and attributes and 
numbers.

So comparing gzipped TXML to plain BXML isn't particularly fair! Not least 
because the gzipping involves considerable CPU and memory costs which the 
BXML does not. A BXML parser also involves less CPU/memory cost than a TXML 
parser.

Compare plain TXML with plain BXML. Compare gzipped TXML with gzipped BXML. 
Please stop comparing gzipped TXML with plain BXML, everyone!!!!
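
A like-for-like comparison of that kind could be sketched roughly as follows. 
This is only an illustration: the token codes, the to_toy_bxml function and 
the sample order document are all invented here and do not correspond to any 
real binary XML format. The toy encoding interns each element or attribute 
name once, refers to it afterwards by a one-byte index (so at most 256 
distinct names), and writes a closing tag as a single byte with no name.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical token codes for the toy encoding.
OPEN, CLOSE, TEXT, ATTR, DEFINE = 1, 2, 3, 4, 5


def _lp(s: str) -> bytes:
    """Encode a string as UTF-8 with a 2-byte big-endian length prefix."""
    raw = s.encode("utf-8")
    return len(raw).to_bytes(2, "big") + raw


def to_toy_bxml(xml_text: str) -> bytes:
    """Transcode textual XML into the toy binary form described above."""
    names = {}          # name -> one-byte index, assigned on first use
    out = bytearray()

    def name_id(name: str) -> int:
        if name not in names:
            names[name] = len(names)
            out.append(DEFINE)      # define the name once, inline
            out.extend(_lp(name))
        return names[name]

    def walk(elem) -> None:
        idx = name_id(elem.tag)
        out.extend((OPEN, idx))
        for key, value in elem.attrib.items():
            key_idx = name_id(key)
            out.extend((ATTR, key_idx))
            out.extend(_lp(value))
        if elem.text:
            out.append(TEXT)
            out.extend(_lp(elem.text))
        for child in elem:
            walk(child)
            if child.tail:
                out.append(TEXT)
                out.extend(_lp(child.tail))
        out.append(CLOSE)           # closing tag carries no name

    walk(ET.fromstring(xml_text))
    return bytes(out)


# Hypothetical element-heavy sample document.
rows = "".join(
    f'<item id="{i}"><qty>{i % 7}</qty><price>{i * 3}</price></item>'
    for i in range(500)
)
txml = f"<order>{rows}</order>"
bxml = to_toy_bxml(txml)

sizes = {
    "plain TXML": len(txml.encode("utf-8")),
    "plain BXML": len(bxml),
    "gzipped TXML": len(gzip.compress(txml.encode("utf-8"))),
    "gzipped BXML": len(gzip.compress(bxml)),
}
for label, size in sizes.items():
    print(f"{label:>13}: {size} bytes")
```

The numbers that matter are the two pairs: plain against plain, and gzipped 
against gzipped. How big the gap is will depend heavily on the document, and 
this sketch measures size only, not the CPU or memory cost of either path.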

Grrr... I'll write a simple binary XML transcoder this weekend and run some 
tests, both with and without gzipping the result, OK?

ABS

-- 
                               Alaric B. Snell
 http://www.alaric-snell.com/  http://RFC.net/  http://www.warhead.org.uk/
   Any sufficiently advanced technology can be emulated in software  




 


Copyright 2001 XML.org. This site is hosted by OASIS