Elliotte Harold wrote:
> Robin Berjon wrote:
>> I wonder how that statement is supposed to hold in the face of the
>> fact that dozens of "binary XML" formats out there are systematically
>> smaller than the equivalent XML. There are indeed interesting cases,
>> for instance when you see that a gzipped SVG document is often circa
>> 30% smaller than the SWF with the same functionality, but they hardly
>> make the rule.
> My point is not that theoretical, academic exercises don't produce
> smaller files. My point is that if you look at the real-world, non-XML
> file formats people actually use on a day-to-day basis, they tend to
> be quite bloated relative to XML. If size did matter, then people
> would be complaining about the bloatedness of their Word files, their
> Quicken files, their database tables, and so forth. That they're not
> complaining about these things proves that they don't really care
> about them.
or they just accept them. the number of times i've been instructed to
use word files inappropriately, or worse, to make sure everything web based
can run on ie, are examples not of accepting size, but of accepting often
ignorant client requests. ever tried to win the argument where a
newsletter must go out in word format, but the receptionist who created
it knows nothing about how word stores images, and the document is now
20 mbytes in size? it's sent individually to 1000 customers, filling up
the mail server (instead of sending to the customer list alias), and at
any rate most of the customers have 1 mbyte mailbox limits so they all
bounce back. and then you're told to fix it without educating the
receptionist (it's a technical problem surely, because all the other mail
goes through!) and while still using word formats.
my point is that size / bloatedness(sic) does matter and does break even
desktop systems. it's a matter of people not knowing that systems fail
every day; no one tackles the problems because, as i've said before,
people have a great ability to get by.
so back to the point - there were plenty of bloated text formats and
binary formats before xml and there will be plenty after it. xml isn't
the only solution either. however, xml 1.0 is a well-accepted standard
that does its job well. the proliferation of useful vocabularies is, i
guess, the proof of this.
even though i have heard of some excellent use cases for "binary xml"
i'm still not sure that technology won't fix that.
gee i remember when we used to debate whether digital could ever really
replace analog - i mean you need 3 times the bandwidth for digital - but
it has - ubiquitously. will text only replace binary? yes. it's inevitable.
so bloatedness does matter, but is xml bloated? no. does it have
redundant information? yes, because it can be compressed. how much?
depends on your tag design mostly. if you're really worried use <a>, <b>
etc for your tags.
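
a rough way to check the tag design point for yourself. this is only a sketch: the tag names, the row count and the choice of zlib (deflate) are all assumptions made for illustration, not anything measured on real documents. it builds the same data twice, once with long descriptive tags and once with <a>, <b> style tags, and prints the raw and deflated size of each.

```python
import zlib


def build_doc(record_tag, field_tags, rows=500):
    """Build a small, repetitive XML document as UTF-8 bytes."""
    parts = ["<doc>"]
    for i in range(rows):
        parts.append(f"<{record_tag}>")
        for j, tag in enumerate(field_tags):
            # The values are arbitrary integers; only the tags differ between docs.
            parts.append(f"<{tag}>{i * 7 + j}</{tag}>")
        parts.append(f"</{record_tag}>")
    parts.append("</doc>")
    return "".join(parts).encode("utf-8")


# Same data twice: long descriptive tags (hypothetical names) vs <a>, <b>, ...
verbose = build_doc(
    "customerRecord",
    ["customerIdentifier", "accountBalance", "lastInvoiceNumber"],
)
terse = build_doc("a", ["b", "c", "d"])

verbose_packed = zlib.compress(verbose, 9)
terse_packed = zlib.compress(terse, 9)

for label, raw, packed in (
    ("verbose tags", verbose, verbose_packed),
    ("terse tags", terse, terse_packed),
):
    print(f"{label}: {len(raw)} bytes raw, {len(packed)} bytes deflated")
```

on a toy document like this the long tags cost a lot in the raw file, but the gap between the two versions is much smaller once both are deflated, since repeated tag names are exactly the kind of redundancy a general purpose compressor removes. real documents will behave differently depending on how much of the file is markup.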
what wins for xml is the tools, and the redevelopment and maintenance
effort for binary xml will kill one or the other.
so here's a real problem - will things like xslt have to cope with both
formats? will tools have to include binary encoders/decoders? etc. these
are the real questions. will binary xml tools be exempt from processing
text, but text xml tools won't be? where's the advantage if both tool
sets have to cope with both formats?
> The only reason people are so excited about the verboseness and
> bloatedness of XML is that, because it's a plain text format, they can
> see just how much redundancy there is. The other binary files they
> deal with range from just as bloated to even more bloated, but because
> they can't see it, they don't care. This wasn't always true, by the
> way. In the days of floppy disks and 10 MB hard drives, software
> vendors did care about binary file sizes, and software reviewers and
> customers paid attention to these things, but it just hasn't been
> worth the effort for at least ten years.
> Here's a prediction: if a binary XML format does come to pass, within
> five years of its inception, these binary documents will be large,
> bloated, full of empty space and redundant content, and nobody will
> notice or care.
> There are, of course, a few file formats that are so inherently large
> that compression does still matter: JPEGs, MP3s, QuickTime, and other
> digitized data; and here developers still do pay attention to document
> size; but these are use cases XML is not intended for, does not serve
> well, and never will serve well. Not everything should be XML.
ps bad examples. jpegs, mp3s, etc. are highly compressed binary formats
with almost no redundant info at all. in fact, as they are lossy
compressions, they actually have less information than the originals.
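
the "almost no redundant info" part can be illustrated with a small sketch. it does not use real jpeg or mp3 data: it stands in zlib (lossless deflate) output for "an already compressed file", which is an assumption for illustration only, and the sample markup is made up. the idea is simply that once data has been compressed, a second pass has very little left to squeeze out.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so runs are repeatable

# Hypothetical sample: repetitive markup wrapped around unpredictable values.
rows = [
    f"<item><value>{rng.randint(0, 999999)}</value></item>"
    for _ in range(2000)
]
text = ("<doc>" + "".join(rows) + "</doc>").encode("utf-8")

once = zlib.compress(text, 9)   # first pass removes the markup redundancy
twice = zlib.compress(once, 9)  # second pass runs on already-compressed bytes

print(f"original:         {len(text)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

the first pass shrinks the text a great deal; the second pass leaves the size roughly where it was, and can even add a few bytes of overhead. lossy formats go a step further by throwing information away before encoding, which this sketch does not attempt to show.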