   RE: [xml-dev] Data streams

As I said initially, larger data elements do change the ratios. To go to the
opposite extreme, large blocks of text can actually be handled MORE
efficiently with XML than CSV.

On the other hand, the larger the attributes and other tag "labels," the
greater the ratio, and vice versa.

So, all I'm saying is that there are times when XML makes more sense than
CSV, and certain situations make CSV superior. No one solution is right for
all circumstances.

Choosing the method that fits the data most sensibly will help
alleviate some of the XML backlash. A good rule of thumb seems to be that, everything else
being equal, (a) the longer the tags or the shorter the data elements, the
less sense it makes to transport the data via XML and (b) the shorter the
tags or the longer the data elements, the more sense it makes to transport
the data via XML. Anyone disagree?
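
To put rough numbers on that rule of thumb, here is a minimal Python sketch. The function name, the tag names, and the sample values are all made up for illustration, and it assumes the simplest possible encoding on both sides: one `<tag>value</tag>` element versus one `value,` field, with no CSV quoting or XML escaping.

```python
def xml_to_csv_ratio(tag: str, value: str) -> float:
    """Size of <tag>value</tag> divided by the size of 'value,'.

    Hypothetical helper: ignores attributes, whitespace, CSV quoting,
    and XML escaping, all of which would shift the numbers somewhat.
    """
    xml = f"<{tag}>{value}</{tag}>"
    csv = f"{value},"
    return len(xml) / len(csv)


# (a) long tag, short data: the markup dominates
long_tag_short_data = xml_to_csv_ratio("CustomerAccountBalance", "7")

# (b) short tag, long data: the markup is almost free
short_tag_long_data = xml_to_csv_ratio("p", "x" * 2000)

print(f"long tag, short data: {long_tag_short_data:.2f} to 1")
print(f"short tag, long data: {short_tag_long_data:.3f} to 1")
```

Case (a) comes out many times larger than its CSV equivalent, while case (b) is within a fraction of a percent of it, which is the pattern described in (a) and (b) above.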


-----Original Message-----
From: Peter Hunsberger [mailto:peter.hunsberger@gmail.com] 
Sent: Monday, December 06, 2004 5:24 PM
To: Stephen E. Beller
Cc: xml-dev@lists.xml.org
Subject: Re: [xml-dev] Data streams

On Mon, 06 Dec 2004 16:35:48 -0500, Stephen E. Beller <sbeller@nhds.com> wrote:
> In consideration of Elliotte's reply, I went back and looked at the XML
> Excel generated. Here's what I found ...
> Every one of the XML data elements had this tagging structure:
> <Row>
>    <Cell><Data ss:Type="Number">1</Data></Cell>
> </Row>
> In contrast, the CSV had this structure: 1,
> That's a 50-to-1 difference in characters for each data element.
> I doubt that all those XML tags are necessary if you're rendering the data
> in something other than a spreadsheet. But if you are planning to use a
> spreadsheet, then the 50 to 1 ratio is valid, it seems to me.

Use the number 10 and the difference becomes 51 to 2, a ratio of ~26 to
1.  Use the number 100 and it is 52 to 3, or ~17 to 1.  Six
digits? 55 to 6, or ~9 to 1. Now add multiple columns of data (as any
realistic example would do) and the ratio falls even further.
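
The same trend can be checked with a short Python sketch. It rebuilds the quoted Row/Cell/Data structure without indentation and compares it with a comma-separated line; the function names are hypothetical. The absolute numbers will not match the round figures above exactly, since they depend on how whitespace and delimiters are counted, but the direction is the same.

```python
def excel_row(values):
    """One row in the Row/Cell/Data structure quoted above, no indentation."""
    cells = "".join(
        f'<Cell><Data ss:Type="Number">{v}</Data></Cell>' for v in values
    )
    return f"<Row>{cells}</Row>"


def csv_row(values):
    """The same values as one comma-separated line ending in a newline."""
    return ",".join(str(v) for v in values) + "\n"


def size_ratio(values):
    """Characters of XML per character of CSV for the same row."""
    return len(excel_row(values)) / len(csv_row(values))


# One column, growing number of digits
for digits in (1, 2, 3, 6):
    value = 10 ** (digits - 1)
    print(f"{digits} digit(s), 1 column: {size_ratio([value]):.1f} to 1")

# Six-digit values, growing number of columns
for columns in (1, 5, 20):
    row = [100000] * columns
    print(f"{columns} column(s), 6 digits: {size_ratio(row):.1f} to 1")
```

Adding columns only spreads the `<Row>` wrapper across more cells; the per-cell `<Cell><Data ...>` markup stays, so in this sketch the ratio drops with more columns but levels off rather than heading toward 1.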

> So, this benchmark test still points to a huge difference in file size and
> in unzipping and parsing time when you compare a large data array in CSV
> to the same array in XML.

Maybe, maybe not; the benchmark needs to be more realistic before you
draw any conclusions about "huge".
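
As one possible starting point for a more realistic comparison, the Python sketch below generates a multi-column table of random integers, writes it both ways, and compares raw and gzip-compressed sizes. Everything here is an assumption made for illustration: the table shape, the value range, the `<Table>` wrapper, and the helper names. It measures size only, not unzipping or parsing time.

```python
import gzip
import random


def make_rows(n_rows, n_cols, seed=0):
    """Random integers of up to six digits; the seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [[rng.randint(0, 999_999) for _ in range(n_cols)]
            for _ in range(n_rows)]


def to_csv(rows):
    """Plain comma-separated lines, one per row."""
    lines = (",".join(map(str, row)) + "\n" for row in rows)
    return "".join(lines).encode("utf-8")


def to_xml(rows):
    """Row/Cell/Data markup as quoted in the thread, one row per line."""
    parts = ["<Table>"]
    for row in rows:
        cells = "".join(
            f'<Cell><Data ss:Type="Number">{v}</Data></Cell>' for v in row
        )
        parts.append(f"<Row>{cells}</Row>")
    parts.append("</Table>")
    return "\n".join(parts).encode("utf-8")


rows = make_rows(1000, 10)
csv_bytes, xml_bytes = to_csv(rows), to_xml(rows)

raw_ratio = len(xml_bytes) / len(csv_bytes)
gzip_ratio = len(gzip.compress(xml_bytes)) / len(gzip.compress(csv_bytes))

print(f"raw:  xml is {raw_ratio:.1f}x the size of csv")
print(f"gzip: xml is {gzip_ratio:.1f}x the size of csv")
```

Because the tags repeat identically in every cell, they tend to compress very well, so the compressed ratio is usually much smaller than the raw one. Real data with text fields, varied tags, or attributes could behave differently.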

Peter Hunsberger


