xml-dev - Re: [xml-dev] Microsoft FUD on binary XML...

To: bob@wyman.us
Subject: Re: [xml-dev] Microsoft FUD on binary XML...
From: Dennis Sosnoski <dms@sosnoski.com>
Date: Thu, 20 Nov 2003 15:01:54 -0800
Cc: "'Jeff Lowery'" <Jeff.Lowery@creo.com>, xml-dev@lists.xml.org
In-reply-to: <004201c3af99$e9944010$650aa8c0@BOBDEV>
References: <004201c3af99$e9944010$650aa8c0@BOBDEV>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312

Bob Wyman wrote:

>Jeff Lowery wrote:
>
>>What can be achieved by binary XML that can't 
>>be similarly achieved using well-known text 
>>compression algorithms?
>
>...
>
>But, be careful when considering examples like the one above. It is
>very easy to provide examples where binary formats compress amazingly
>well. In real life, a binary format *will not always* compress data
>well enough to be worth the trouble and you can't really make the
>claim that a binary format will be faster to encode and decode. In
>general cases, you should probably prefer the text encoding and only
>move to binary if you *know* that it will be useful for your specific
>datasets and processing requirements.
>
I don't know why you say "you can't really make the claim that a binary 
format will be faster to encode and decode". Certainly some binary 
formats are going to be faster to encode and decode than the equivalent 
text representations for any reasonable documents. Just using an 
efficient handle-based approach to element and attribute names will give 
you this, while also reducing the document sizes.
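
To make the handle idea concrete, here is a minimal sketch in Python. It 
is an illustration only and not the actual XBIS wire format: the function 
names, the marker byte 0 for "new name follows", and the one-byte handle 
(which limits it to 255 distinct names) are all assumptions chosen for 
brevity. Each name is written in full the first time it appears and as a 
single handle byte every time after that.

```python
import io
import struct

MAX_HANDLES = 255  # assumption: handles fit in one byte, 0 is reserved


def encode_names(names):
    """Encode a sequence of element/attribute names using handles."""
    handles = {}  # name -> small integer handle (1..255)
    out = io.BytesIO()
    for name in names:
        handle = handles.get(name)
        if handle is None:
            if len(handles) >= MAX_HANDLES:
                raise ValueError("too many distinct names for this sketch")
            # First occurrence: marker 0, 2-byte length, UTF-8 bytes.
            handles[name] = len(handles) + 1
            raw = name.encode("utf-8")
            out.write(struct.pack(">BH", 0, len(raw)))
            out.write(raw)
        else:
            # Repeat occurrence: one handle byte replaces the whole name.
            out.write(struct.pack(">B", handle))
    return out.getvalue()


def decode_names(data):
    """Inverse of encode_names: rebuild the name sequence from bytes."""
    table = []  # table[i] is the name assigned handle i + 1
    result = []
    pos = 0
    while pos < len(data):
        marker = data[pos]
        pos += 1
        if marker == 0:
            (length,) = struct.unpack_from(">H", data, pos)
            pos += 2
            name = data[pos:pos + length].decode("utf-8")
            pos += length
            table.append(name)
            result.append(name)
        else:
            # Decoding a repeat is a table lookup, with no string parsing.
            result.append(table[marker - 1])
    return result
```

The decoder never has to scan for delimiters or compare strings on a 
repeated name, which is where the speed gain would come from; the size 
reduction only shows up once names actually repeat.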

As for the larger issue, I see three different clusters of interest in 
using non-text representations of XML Infosets:

   1. Reducing transport size for general documents - this is handled
      very well by gzip and friends, though at the cost of added
      processing overhead and latency
   2. Reducing processing overhead for general documents - this is
      something that my XBIS format (http://www.xbis.org) addresses in
      particular, along with similar schemes; XBIS also gives some
      reduction in document size (which can be substantial, especially
      for something like SOAP and especially when messages are streamed
      over a single logical connection), but the focus is on speed
      rather than size
   3. Schema-enhanced Infoset transfer - typed data exchange, where the
      types are known in advance and the XML representation is in many
      ways incidental; this is where I'd think ASN.1 really hits a sweet
      spot. Because this is based on typed (i.e., binary) data it
      potentially allows better solutions for a particular application
      than general solutions of types (1) and (2).
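
As a rough illustration of case (1), the following Python sketch gzips a 
document and reports the sizes before and after. The helper name 
gzip_report and the sample document are made up for this example; the 
point is only that the size reduction comes from an extra compress step 
on the sender and an extra decompress step on the receiver, on top of 
the normal text parse.

```python
import gzip


def gzip_report(xml_text):
    """Return (raw size, gzipped size, round-trip ok) for a document."""
    raw = xml_text.encode("utf-8")
    packed = gzip.compress(raw)  # extra work on the sending side
    restored = gzip.decompress(packed)  # extra work on the receiving side
    return len(raw), len(packed), restored == raw


# Hypothetical, highly repetitive sample document.
doc = "<items>" + "<item id='1'>value</item>" * 200 + "</items>"
raw_size, packed_size, ok = gzip_report(doc)
print(raw_size, packed_size, ok)
```

A document this repetitive flatters gzip; real documents will compress 
less dramatically, and the receiver still has to run a full text parse 
after decompressing.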

I think these are all important concerns. I do find it a little baffling 
that so many people recognize (1) as a valid concern and willingly endorse 
using gzip transformations of XML documents to address it, while 
refusing to recognize (2) and (3) as valid concerns or accept other 
types of transformations of XML documents.

  - Dennis
