   Re: standard compressed XML format?

  • From: Sebastien Sahuc <ssahuc@imediation.com>
  • To: "Simon St.Laurent" <simonstl@simonstl.com>
  • Date: Thu, 16 Mar 2000 20:13:41 GMT

Have you looked at XMill?

URL:
----
http://www.seas.upenn.edu/~liefke/xmill/xmill.html

Abstract from the page:
-----------------------
An Efficient Compressor for XML
(Mirror from AT&T Labs-Research)



XMill is a new tool for compressing XML data efficiently. It is based on a
regrouping strategy that leverages the effect of highly-efficient
compression techniques in compressors such as gzip. XMill groups XML text
strings with respect to their meaning and exploits similarities between
those text strings for compression. Hence, XMill typically achieves much
better compression rates than conventional compressors such as gzip.

XML files are typically much larger than the same data represented in some
reasonably efficient domain-specific data format. One of the most intriguing
results of XMill is that the conversion of proprietary data formats into XML
will in fact improve the compression - i.e. the compressed XML file is
(up to twice) smaller than the compressed original file! And this
astonishing compression improvement is achieved at about the same
compression speed.
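
To make the regrouping idea in the abstract concrete, here is a rough
sketch in Python. It is not XMill's code or file format. It only assumes
that "grouping by meaning" can be approximated by collecting the text of
each element name into its own container and handing each container to
zlib separately. The sample data and the names build_sample, regroup and
compressed_sizes are made up for illustration, and the output is not
decodable back into XML; it only lets you compare sizes.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_sample(n=500):
    """Build a small synthetic XML log (hypothetical data, not from XMill)."""
    rows = []
    for i in range(n):
        rows.append(
            f"<entry><id>{1000 + i}</id>"
            f"<date>2000-03-{(i % 28) + 1:02d}</date>"
            f"<host>node{i % 7}.example.org</host></entry>"
        )
    return "<log>" + "".join(rows) + "</log>"


def regroup(xml_text):
    """Return (list of tag names in document order, text containers by tag)."""
    root = ET.fromstring(xml_text)
    structure = []
    containers = defaultdict(list)
    for elem in root.iter():
        structure.append(elem.tag)
        # Only non-empty text goes into the container for this element name.
        if elem.text and elem.text.strip():
            containers[elem.tag].append(elem.text)
    return structure, containers


def compressed_sizes(xml_text):
    """Return (bytes for plain zlib, bytes for regrouped streams)."""
    plain = len(zlib.compress(xml_text.encode("utf-8"), 9))
    structure, containers = regroup(xml_text)
    # Assumption: one zlib stream for the tag sequence, one per container.
    grouped = len(zlib.compress("\n".join(structure).encode("utf-8"), 9))
    for values in containers.values():
        grouped += len(zlib.compress("\n".join(values).encode("utf-8"), 9))
    return plain, grouped


if __name__ == "__main__":
    plain, grouped = compressed_sizes(build_sample())
    print("plain zlib:", plain, "bytes; regrouped:", grouped, "bytes")
```

Whether the regrouped total comes out smaller depends on the data. Each
extra zlib stream carries a few bytes of overhead, so very small documents
may not benefit.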


Regards, 

Sebastien

>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<

On 3/16/00, 6:33:59 PM, "Simon St.Laurent" <simonstl@simonstl.com> wrote 
regarding standard compressed XML format?:


> Is anyone doing any work on a standard compression format for XML
> documents?

> I'm starting to get concerned about the volume of complaints I'm getting
> from readers and folks in Web development forums who are starting to argue
> that XML's verbosity is a problem, especially for things like transmitting
> vector graphics information.  There are a lot of wasted bits in XML
> documents - and of course in HTML and other text documents as well.

> I'm not happy about the prospect of sending documents to browsers as .zip
> or some other compressed format and making users go through multiple steps
> to decompress and view the content.  I'd like to think that we could come
> up with a compression/decompression algorithm for markup (maybe just XML,
> maybe all text) that we can use transparently.  Ideally, it would be an
> algorithm explicitly placed in the public domain, avoiding licensing and
> legal battles.

> Some folks have argued that this belongs in transfer protocols, while
> others have argued that it should be a 3rd party function, like .zip and
> .sit are today.  I'm not convinced by the first because so many competing
> formats (gif, jpeg, flash, etc.) already include compression, and I'm not
> convinced by the second because I don't think users are willing to
> micromanage such a process.

> It also has an impact on some of the discussions on the IETF-XML-MIME
> list (see http://www.imc.org/ietf-xml-mime/ for archives and
> information) because we're already discussing how best to mark information
> as XML for possible generic processing.  If a compression standard emerged,
> it might well have an impact on MIME types - and I'd like to see that
> discussion start before we settle the MIME types for XML debate.

> Any thoughts?  I like the fact that XML is verbose when I'm editing and
> processing, but it's not so good in transmission.  I'd like to think that
> there's a good _general_ solution that will let us have the best of both
> worlds.

> Simon St.Laurent
> XML Elements of Style / XML: A Primer, 2nd Ed.
> Building XML Applications
> Inside XML DTDs: Scientific and Technical
> Cookies / Sharing Bandwidth
> http://www.simonstl.com

> ***************************************************************************
> This is xml-dev, the mailing list for XML developers.
> To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
> List archives are available at http://xml.org/archives/xml-dev/
> ***************************************************************************








 
