   RE: [xml-dev] XML Binary and Compression

Hmm, I'm sorry you don't think schema-based encoding is fair. I find it odd
that you regard schema-based encoding (compression) as lossy. That term is
normally associated with a permanent loss of information. Neither ASN.1 nor
MPEG-7 encoding results in the loss of XML content (the original content did
not, of course, contain the XML schema). The deployment of the schema upon
which encoding/decoding is based is a management issue. There is no need to
transmit it as part of the encoded content.

- Dan

> -----Original Message-----
> From: Elliotte Rusty Harold [mailto:elharo@metalab.unc.edu]
> Sent: Tuesday, March 11, 2003 10:01 AM
> To: winkowski@mitre.org; msc@mitre.org; xml-dev@lists.xml.org
> Cc: winkowski@mitre.org; msc@mitre.org
> Subject: RE: [xml-dev] XML Binary and Compression
> At 11:54 PM -0500 3/10/03, winkowski@mitre.org wrote:
> >On reflection, I don't think that the conclusions reached are all that
> >surprising. Redundancy based compression achieves better results as the
> >file size, and consequently the amount of redundancy, increases. CODECS
> >that take advantage of schema knowledge achieve efficient localized
> >encodings and also need not transmit metadata since this information can
> >be derived at decoding time.
> I may have missed something in your paper then, because I didn't 
> realize you were doing this. If you're assuming that the same schema 
> is available for both compression and decompression, then you're 
> doing a lossy compression. The compressed forms of your documents
> have less information in them than the uncompressed forms. I don't
> consider that to be a fair or useful comparison with raw XML with
> metadata present.
> Then again, maybe that's not what you meant? If you're somehow 
> embedding a schema in the document you transmit, then it's really
> just another way of compressing losslessly and that's OK, though I
> would still require that the schema used for compression be derived 
> from the instance documents rather than applied pre facto under the 
> assumption of document validity. Hmmm, that's not quite right. What I 
> really mean is that given a certain schema it must be possible to 
> losslessly encode both valid and invalid documents.
> -- 
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> |           Processing XML with Java (Addison-Wesley, 2002)          |
> |              http://www.cafeconleche.org/books/xmljava             |
> | http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA  |
> +----------------------------------+---------------------------------+
> |  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
> |  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
> +----------------------------------+---------------------------------+
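
The disagreement in this thread turns on what a schema-based codec leaves
out of the encoded bytes. What follows is only a minimal sketch of the idea,
not ASN.1, not MPEG-7, and not the codec from the paper under discussion. It
assumes a hypothetical three-element schema (SCHEMA) that sender and receiver
both hold out of band, and it packs the values with Python's struct module.
The names encode and decode are illustrative as well.

```python
import struct

# Hypothetical schema, assumed to be deployed to both ends ahead of time:
# an ordered list of (element name, struct format code).
SCHEMA = [("id", "I"), ("lat", "d"), ("lon", "d")]


def encode(record, schema=SCHEMA):
    """Pack the values in schema order; element names are never transmitted."""
    extra = set(record) - {name for name, _ in schema}
    if extra:
        # A document the schema does not describe has no encoded form here.
        raise ValueError(f"elements not in schema: {sorted(extra)}")
    fmt = "<" + "".join(code for _, code in schema)
    return struct.pack(fmt, *(record[name] for name, _ in schema))


def decode(payload, schema=SCHEMA):
    """Rebuild names and values from the bytes plus the shared schema."""
    fmt = "<" + "".join(code for _, code in schema)
    values = struct.unpack(fmt, payload)
    return {name: value for (name, _), value in zip(schema, values)}


record = {"id": 7, "lat": 38.9, "lon": -77.2}
payload = encode(record)
restored = decode(payload)
```

Under these assumptions the round trip recovers the record exactly, which is
the sense in which the encoding is lossless given the shared schema. A record
carrying an element outside SCHEMA is rejected with ValueError, which is the
limitation raised in the quoted message about documents that are not valid
against the schema.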

