   RE: [xml-dev] Fast text output from SAX?

Elliotte Rusty Harold wrote:
> Bob. You may not need to be lectured on this, but 
> some other people do, as the plethora of software that
> crashes on unexpected input proves. It has been 
> proposed in this very thread to use binary formats 
> precisely to avoid the overhead of checking for data 
> correctness. Just slam some bits into memory and 
> assume everything is hunky dory.
	You're right. After 30 years of this, I've heard the lectures.
And, I know that you realize that if we were in Brooklyn having dinner
together, I wouldn't be lecturing you either. But, there are other
folk reading this as well...
	Just like you, I groaned when I saw the suggestion that you
could take "wire-protocol" and then just stuff it into memory. This
might work with text, but it sure as heck doesn't work with binary
formats or anything that contains an address or offset. The
distinctions between wire-protocol, in-memory-format, and
on-disk-format, are fundamental. Every proposal that I've ever seen
for a "common" format for use in two or more of these contexts has
ended up failing for one reason or another. As far as stuffing
wire-protocol into memory goes: Let me just say that *NOBODY* is ever
going to write to *MY* address space without a great deal of checking
going on... Also, if this problem were as simple as just replacing
direct addresses with relative addresses, don't people realize that we
probably would have figured this out a few decades ago? As an
industry, we're not so stupid that we would have missed something so
obvious... Sometimes, the obvious solution is *SO* obvious that it
must be flawed.
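
	To make "a great deal of checking" concrete, here is a minimal
sketch in Python. It is an illustration only, not any real protocol:
the header layout (a 4-byte payload offset followed by a 4-byte payload
length, both big-endian) and the names HEADER and read_payload are
assumptions made up for this example. The point is that even a
relative offset has to be validated against the buffer before it is
used.

```python
import struct

# Hypothetical wire layout (an assumption for illustration):
# 4-byte payload offset, then 4-byte payload length, both big-endian.
HEADER = struct.Struct("!II")


def read_payload(buf: bytes) -> bytes:
    """Return the payload only after its offset and length pass bounds checks."""
    if len(buf) < HEADER.size:
        raise ValueError("buffer is shorter than the header")
    offset, length = HEADER.unpack_from(buf, 0)
    # The offset is relative to the start of the buffer, yet still untrusted.
    if offset < HEADER.size:
        raise ValueError("offset points back into the header")
    if offset > len(buf) or length > len(buf) - offset:
        raise ValueError("payload extends past the end of the buffer")
    return buf[offset:offset + length]
```

	A well-formed message such as HEADER.pack(8, 5) + b"hello" yields
the payload; the same bytes with a claimed length of 500 are rejected
rather than read past the end of the buffer.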

> I have seen any number of binary formats that achieve
> speed gains precisely by doing this. 
	People write stupid and buggy code every day and the specs
they write are often even more stupid. So what? The issue here isn't
whether or not stupid people have made or can make mistakes. There are
some fairly dumb ways to do text-based markup too. Getting back to
Claude Bullard's point: One really important reason to have a standard
is to come up with a solution that will prevent a large number of
people from having the opportunity to do stupid things. (i.e.
disincent them from trying to create something new and thus making all
the same old mistakes again.)

> And it is my contention that if this is disallowed
> (as I think it should be) much, perhaps all, of 
> the speed advantages of these binary formats disappears.
	This just isn't true -- at least not for properly written
ASN.1 binary codecs. (And, the ASN.1 XML codecs are just as robust...)
The benchmarks presented at the Binary workshop show this to be the
case. I'm sure you've read the papers. What is it that you don't

		bob wyman

