Re: Request: Techniques for reducing the size of XML instances
- From: Olivier DUBUISSON <Olivier.Dubuisson@francetelecom.com>
- To: "HUGHES,MARK (Non-HP-FtCollins,ex1)" <firstname.lastname@example.org>
- Date: Fri, 27 Jul 2001 09:36:35 +0200
"HUGHES,MARK (Non-HP-FtCollins,ex1)" wrote:
> >From: Michael Brennan [mailto:Michael_Brennan@allegis.com]
> >Plain old vanilla gzip compression works great. If transmitting XML over
> >HTTP (a very common use case), the HTTP spec explicitly permits compression of
> >content. You can include "Content-Encoding: gzip" as an HTTP header, and
> >achieve a high level of compression (80%-90% in my experience) while still
> >fully conforming to the HTTP spec. The only downside is that many XML
> >messaging toolkits may not properly support this.
> Seconded. gzip is simple, fast, ubiquitous, standard, and gives you far
> better compression than any binary substitution scheme ever will.
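For readers who want to try the gzip approach quoted above, here is a minimal Python sketch. The helper name gzip_body and the sample order document are made up for illustration; only the "Content-Encoding: gzip" header comes from the message above. The saving you see depends entirely on the input, so no figure is claimed here.

```python
import gzip


def gzip_body(xml_text: str) -> tuple[bytes, dict[str, str]]:
    """Compress an XML payload and build matching HTTP headers.

    Hypothetical helper: it only prepares the body and headers,
    it does not send anything over the network.
    """
    raw = xml_text.encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "Content-Encoding": "gzip",  # tells the receiver to gunzip the body
        "Content-Length": str(len(body)),
    }
    return body, headers


# Invented, highly repetitive sample document (not from the thread).
sample_xml = (
    "<orders>"
    + "".join(
        f"<order><id>{i}</id><status>shipped</status></order>"
        for i in range(500)
    )
    + "</orders>"
)

gz_body, gz_headers = gzip_body(sample_xml)
raw_len = len(sample_xml.encode("utf-8"))
print(f"raw: {raw_len} bytes, gzipped: {len(gz_body)} bytes")
print(f"saving: {1 - len(gz_body) / raw_len:.0%}")
```

The receiver has to undo the encoding before parsing, which is where toolkit support matters: in Python that is gzip.decompress(gz_body) before handing the bytes to an XML parser.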
Tests that we have performed show that the PER encoding rules of ASN.1
<http://asn1.elibel.tm.fr/xml/#schema-mapping> give better compression in
many cases, particularly on small instances such as WML.
I promised to put benchmarks and comparative measurements on the website but
have had no time to do it :-(
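One reason small instances are hard for gzip is its fixed framing: every gzip stream carries a 10-byte header and an 8-byte trailer, and a short document gives the compressor little repetition to exploit. The Python sketch below only illustrates that effect; it is not the benchmark mentioned above, and the WML-like decks are invented for the example.

```python
import gzip


def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)


# Invented WML-like samples: one tiny deck, one deck with 200 similar cards.
small_deck = b'<wml><card id="c1" title="Hi"><p>Hello</p></card></wml>'
large_deck = (
    b"<wml>"
    + b"".join(
        b'<card id="c%d" title="Hi"><p>Hello</p></card>' % i
        for i in range(200)
    )
    + b"</wml>"
)

# Even an empty payload pays for the gzip header and trailer.
framing_only = len(gzip.compress(b""))

print(f"empty payload gzips to {framing_only} bytes")
print(f"small deck: {len(small_deck)} bytes, ratio {gzip_ratio(small_deck):.2f}")
print(f"large deck: {len(large_deck)} bytes, ratio {gzip_ratio(large_deck):.2f}")
```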
> After all, gzip compresses both the tags *and* the content, and can
> identify repeated sequences of <tag>content</tag>. Binary encodings
> can only compress the tags...
It seems you're equating binary encodings with Binary XML.
There are binary encodings (such as PER) which produce a compact encoding
for the data (and which do not really encode the tags).
When the types are constrained (or when the constraints defined in an XML
Schema are reused on the ASN.1 side), the encoding is even more compact.
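To make the point about constraints concrete, here is a toy Python sketch of the underlying idea: when both sides know each field's allowed range from the schema, a value can be sent in just enough bits to cover that range, and no tag is sent at all. This is NOT an implementation of PER and does not follow its actual rules; the function names and the hour/minute/flag record are invented for the illustration.

```python
def bits_needed(lo: int, hi: int) -> int:
    """Bits required to distinguish every value in the range lo..hi."""
    return (hi - lo).bit_length()


def pack(values: list[int], ranges: list[tuple[int, int]]) -> bytes:
    """Pack each value as an offset from its lower bound, with no tags."""
    acc = 0
    nbits = 0
    for value, (lo, hi) in zip(values, ranges):
        if not lo <= value <= hi:
            raise ValueError(f"{value} outside {lo}..{hi}")
        width = bits_needed(lo, hi)
        acc = (acc << width) | (value - lo)
        nbits += width
    pad = (-nbits) % 8  # zero-pad the final byte
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack(data: bytes, ranges: list[tuple[int, int]]) -> list[int]:
    """Inverse of pack(); the decoder must know the same ranges."""
    total = sum(bits_needed(lo, hi) for lo, hi in ranges)
    acc = int.from_bytes(data, "big") >> (len(data) * 8 - total)
    values = []
    for lo, hi in reversed(ranges):
        width = bits_needed(lo, hi)
        values.append((acc & ((1 << width) - 1)) + lo)
        acc >>= width
    return values[::-1]


# Hypothetical record: hour 0..23, minute 0..59, a 0/1 flag.
field_ranges = [(0, 23), (0, 59), (0, 1)]
record = [9, 36, 1]
as_xml = "<time><hour>9</hour><minute>36</minute><dst>1</dst></time>"

packed = pack(record, field_ranges)
print(f"XML text: {len(as_xml)} bytes, packed: {len(packed)} bytes")
```

The three fields need 5 + 6 + 1 = 12 bits, so the whole record fits in two bytes. The price is that the encoding is meaningless without the schema, which is the trade-off against self-describing XML.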
france telecom R&D
_ DTL/MSV - 22307 Lannion Cedex - France
( ) tel: +33 2 96 05 38 50 - fax: +33 2 96 05 39 45
\_/\ Site ASN.1 : http://asn1.elibel.tm.fr/