OASIS Mailing List Archives

Re: [xml-dev] Re: [ubl-dev] Top 10 uses of XML in 2007

Sorry to press you on this but it has a lot of implications for
'high performance' (and pretty expensive) systems. Not that
the small matter of office files has much to do with that but
I'd like to test your logic a bit.

Take an Excel file. Convert it to OpenOffice. Then zip both and
compare the sizes. They are roughly equal. The Excel binary is
larger uncompressed than it is compressed, whereas the OpenOffice
file is already compressed, so zipping it makes little difference.

Not sure of your logic here but mine is:

The compression converts XML to a smaller size of file than
a part XML, part binary file which isn't compressed. Correct?
And certain other binary files when compressed are not
particularly reduced in size because the compression cannot
improve much on their existing compactness when mere
zipping is concerned (e.g. jpg files, etc). So zip compresses
text a lot but binary not very much, yes? But some forms of
binary perhaps more than others.
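
The point above can be checked with a rough sketch along these lines. It
is only an illustration under stated assumptions: the XML rows are made
up, random bytes stand in for an already-compact binary such as a jpg,
and zlib's deflate stands in for "mere zipping".

```python
import os
import zlib

# Repetitive XML-like text: many rows sharing the same tag names (made-up data).
xml_text = "".join(
    f"<row><id>{i}</id><name>item{i}</name></row>\n" for i in range(2000)
).encode("utf-8")

# Random bytes as a stand-in for data that is already compact (assumption).
dense = os.urandom(len(xml_text))


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


xml_ratio = ratio(xml_text)
dense_ratio = ratio(dense)
# Compressing something that has already been compressed gains very little.
recompressed_ratio = ratio(zlib.compress(xml_text, 9))

print(f"XML text:           {xml_ratio:.2f}")
print(f"already compact:    {dense_ratio:.2f}")
print(f"compressed twice:   {recompressed_ratio:.2f}")
```

The exact ratios depend on the data, so the sketch prints them rather
than assuming particular values.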

So would I send a large amount of XML without first zipping it?
Not if I was sending by email say. Most use zip in these cases.
If sending by some protocol that compresses 'on the wire'
I might be prepared to just send text if there was no better
alternative and the compression that was on the 'wire' was
optimal for the purpose. But often it isn't because it is unaware
of the format it is compressing. Knowing the format allows
much better compression, hence ASN.1 and its use of mapping
to a schema. Yes?
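
A minimal sketch of the "knowing the format" argument, assuming both
ends share the message skeleton in advance. This is not ASN.1 or any
schema-driven encoding; it uses zlib's preset dictionary as a simple
stand-in for format knowledge, and the invoice message and its element
names are hypothetical.

```python
import zlib
from typing import Optional

# Hypothetical message; the element names are invented for illustration.
message = (
    b"<invoice><id>1042</id><customer>ACME</customer>"
    b"<currency>GBP</currency><amount>199.50</amount>"
    b"<due>2007-03-01</due></invoice>"
)

# What sender and receiver both "know" beforehand: the skeleton without values.
shared_dictionary = (
    b"<invoice><id></id><customer></customer>"
    b"<currency></currency><amount></amount>"
    b"<due></due></invoice>"
)


def compress(data: bytes, zdict: Optional[bytes] = None) -> bytes:
    """Deflate data, optionally priming the compressor with a shared dictionary."""
    if zdict is None:
        compressor = zlib.compressobj(level=9)
    else:
        compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(data) + compressor.flush()


generic = compress(message)                       # format-unaware
with_dict = compress(message, shared_dictionary)  # format-aware (stand-in)

# The receiver needs the same dictionary to decode.
decompressor = zlib.decompressobj(zdict=shared_dictionary)
roundtrip = decompressor.decompress(with_dict) + decompressor.flush()

print(len(message), len(generic), len(with_dict))
```

The trade-off is the usual one: the smaller encoding only works when
both sides agree on the shared knowledge ahead of time.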

Not sure where there is any faulty logic there? 

Saying the XML gets compressed on the wire anyway just adds
to my logic that compression and therefore binary are preferable
on the wire. But I'd take it further and say they are better for
applications too: there are just certain circumstances where the
binary is a real pain (hence the rise of XML, I guess). But would that
see folk throwing away their binary database and just using the
file system? Unless, of course,
1. the file system compresses anyway
2. there is the option of improved compression in the file system
due to knowing the schema when one is available

I really don't see where there is any fault in this logic. Or any lack
of hard facts.

All the best


>>> Elliotte Harold <elharo@metalab.unc.edu> 19/02/07 17:32:34 >>>
Stephen Green wrote:
> Hi Elliotte
> Can any of these arguments be verified?

Yes, and I've published measurements in the past as have others.

> " (Compare OpenDocument to the 
> equivalent Microsoft Office binary, for example.)"
> isn't that specious? The OO file is zipped and both are XML
> for the main part, to varying degrees turned into binary
> (depending on which version of the MS office file you mean).

I specifically said the Microsoft Office binary format. I have not 
tested the new Microsoft XML format so I don't know how it is size-wise. 
However, their old binary formats are larger than the OpenOffice 
formats. Just open any Word file in OpenOffice, save it in OpenOffice, 
unzip it, and compare. You can see for yourself.
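
For anyone who wants to make that comparison without unzipping by hand,
here is a rough sketch. OpenDocument files are zip packages, so Python's
zipfile module can report the stored and unpacked sizes of each member.
The helper name archive_sizes is hypothetical, and the in-memory archive
below is only a stand-in for a real saved document.

```python
import io
import zipfile


def archive_sizes(zip_source):
    """Return (stored_bytes, unpacked_bytes) summed over all archive members."""
    with zipfile.ZipFile(zip_source) as archive:
        infos = archive.infolist()
        stored = sum(info.compress_size for info in infos)
        unpacked = sum(info.file_size for info in infos)
    return stored, unpacked


# Stand-in for a real document: a small zip holding one XML member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("content.xml", "<p>Hello</p>\n" * 500)
buffer.seek(0)

stored, unpacked = archive_sizes(buffer)
print(f"stored: {stored} bytes, unpacked: {unpacked} bytes")
```

Passing the path of a saved OpenOffice file instead of the buffer should
give the figures to set against the size of the original Word file.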

Elliotte Rusty Harold  elharo@metalab.unc.edu
Java I/O 2nd Edition Just Published!



Copyright 1993-2007 XML.org. This site is hosted by OASIS