At 10:24 PM -0400 4/15/04, Stephen D. Williams wrote:
>What do you use for data transfer??? I almost never get data
>corruption that isn't corrected in some way, and I constantly use
>WiFi, CDMA2000 based cell Internet access, all kinds of computers,
>hard drives, etc. Not since I last used my Jaz drive have I had the
>kind of corruption you seem to be dealing with. I did have trouble
>with a particularly ugly multi-drive RAID-5 failure, but files were
>either good or bad.
I use anything and everything, and sometimes the files get corrupted.
It doesn't matter why: whether it's a transport error, bad data error
on disk, or a misbehaving application that writes bad data into the
file. Corruption happens. Fact of life.
>If the session layer, i.e. TCP/IP or the filesystem, doesn't find
>errors, the application managing transfer should (email, etc.).
>Certainly, the application, or better, the library that is accessing
>the data should detect and react well to any data presented.
Certainly it should, but it doesn't. Word's the most common offender here.
Part of making an application robust against any input is starting
from the assumption that you have nothing more than a stream of bytes,
which must be proved to be in a particular format before you use it.
This is essentially what a parser does. This is why XML parsing is
such a robust process. It's very hard to construct a stream of bytes
that will crash a parser. Possibly you could do it with very long
element names or attribute values, but so far I haven't seen it
pulled off.
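A minimal sketch of that "prove it first" stance, using Python's
standard xml.etree.ElementTree module (my choice for illustration; any
conforming parser should behave similarly). The function name
parse_untrusted and the sample inputs are made up for this example.
Malformed input ends in a reported error, not in undefined behavior:

```python
import xml.etree.ElementTree as ET

def parse_untrusted(data: bytes):
    """Return the root element if data is well-formed XML, else None."""
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        # The parser refused the bytes; nothing downstream ever sees them.
        return None

good = b"<doc><item id='1'>text</item></doc>"
bad = b"<doc><item id='1'>text</doc>"    # mismatched end tag
junk = b"\x89PNG\r\n\x1a\n not xml"      # arbitrary non-XML bytes

for sample in (good, bad, junk):
    root = parse_untrusted(sample)
    print("accepted" if root is not None else "rejected")
```

The caller only ever receives either a parsed tree or a clean rejection.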
However, most processors of binary formats such as Word do not start
with the assumption that they are reading an arbitrary stream of
bytes. They assume they're reading data in a known format and build
assumptions about the format into their code. When those assumptions
are violated, the program heads south in unanticipated and
potentially damaging and dangerous ways. This is why it really
bothers me when processors attempt to gain speed compared to
traditional XML parsing by skipping well-formedness checks. This
applies both to many binary parsers and to some so-called minimal
parsers that process traditional XML without checking
well-formedness.
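To make the contrast concrete, here is a rough sketch of a binary
reader that does start from "arbitrary stream of bytes". The record
layout (a 4-byte tag, a 4-byte big-endian length, then the payload),
the REC1 tag, and the names FormatError and read_record are all
invented for this example; this is not Word's format or any real one.
Every assumption about the layout is checked before the data is used:

```python
import struct

MAGIC = b"REC1"                  # hypothetical format tag for this sketch
HEADER = struct.Struct(">4sI")   # tag + big-endian payload length

class FormatError(ValueError):
    """Raised when the bytes do not match the expected layout."""

def read_record(data: bytes) -> bytes:
    # Check 1: there are enough bytes to hold a header at all.
    if len(data) < HEADER.size:
        raise FormatError("truncated header")
    magic, length = HEADER.unpack_from(data, 0)
    # Check 2: the tag says this is the format we think it is.
    if magic != MAGIC:
        raise FormatError("bad magic")
    # Check 3: the declared length agrees with what was actually received.
    if length != len(data) - HEADER.size:
        raise FormatError("length field does not match payload size")
    return data[HEADER.size:]

record = MAGIC + struct.pack(">I", 5) + b"hello"
print(read_record(record))
```

A reader that skipped these checks and simply trusted the length field
is the kind of code that heads south when the file is corrupted.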
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA