RE: Images embedded in XML
- From: Eric Bohlman <ebohlman@earthlink.net>
- To: "Al B. Snell" <alaric@alaric-snell.com>, Danny Ayers <danny@panlanka.net>
- Date: Sun, 08 Apr 2001 01:48:27 -0500
4/8/01 4:54:50 AM, "Al B. Snell" <alaric@alaric-snell.com> wrote:
>This is a bit of a myth... plain text is less of a common denominator
>than two's complement or unsigned binary integers, since plain text is
>described in *terms* of this, and XML is far from a simple "lowest common
>denominator" data format; it would take me a few minutes to write a set of
>routines in C to serialise and unserialise data in network byte order, and
>perhaps an hour at most to implement this for a self-describing
>format. Compare that to how long it takes to implement an XML parser, and
>the size and run time space/time requirements, and the size of the XML
>documents compared to the "XDF" records...
IMHO that's an extremely unfair comparison. I very much doubt that in a few
minutes you could come up with a reliable, tested, general-purpose data
serializer, as opposed to a quick application-specific hack. Most of the time
spent writing an XML parser isn't spent on bare-metal coding; it's spent on
the less geeky aspects of programming like testing, requirements analysis, API
documentation and development (the API that's only for your own personal use
is always the easiest to develop, since you already have it internalized),
design for maintainability, and maintenance. It's like the way geeks always
ridicule the (very well documented) statistic that if you divide the number of
lines of code a programmer writes over the span of a project by the total time
he or she spends on the project, you get about ten lines of code per day. The
catch is that the typical programmer spends most of his or her time on
activities other than writing lines of code (even though those activities,
such as requirements analysis, may eventually contribute to writing code).
The time to develop and produce a program is always a lot greater than the
time to code it.
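To make the comparison concrete, here is roughly the sort of "few minutes"
serializer being described: a minimal Perl sketch (the record layout is
invented for illustration) using pack's network-byte-order templates. Note
how much of a real library's job it quietly skips:

    use strict;

    # The whole "serializer": one hard-coded record type.
    # pack's "N" template writes an unsigned 32-bit integer in network
    # (big-endian) byte order; "A16" space-pads a string to 16 bytes.
    sub serialize_record {
        my ($id, $temp_tenths, $name) = @_;
        return pack('N N A16', $id, $temp_tenths, $name);
    }

    sub deserialize_record {
        my ($bytes) = @_;
        return unpack('N N A16', $bytes);  # 'A16' strips padding on read
    }

    my $wire = serialize_record(42, 185, 'boiler-room');
    my ($id, $tenths, $name) = deserialize_record($wire);
    print "$id $tenths $name\n";           # prints: 42 185 boiler-room

    # Missing: a length or version field, a checksum, error reporting
    # for short reads, any way to add a field without breaking every
    # deployed reader, and any documentation beyond the template itself.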
Application-specific hacks are always more efficient than general-purpose
libraries when the entire project is under the complete control of a lone-geek
cowboy coder who gets to define all the requirements himself and who's going
to be the only person ever to look at the code. In the real world, things are
a little different.
>
>Since parsing XML is complex, XML's adoption will be limited by the
>development rate of XML parsers... I have heard a few people say "Ah, I
>can write an XML parser in 10 lines of Perl", but those parsers don't
>process entity references or handle namespaces :-)
But parsers that do all that are easily available to any competent Perl
programmer, who doesn't have to write the parser, only use it. The only
problems arise when the Perl code is supposed to run on some el-cheapo rented
Web host whose admin doesn't even understand how to install Perl, IOW under
extremely penny-wise and pound-foolish conditions.
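For instance, with the expat-based XML::Parser module from CPAN, "using it"
looks something like this (a sketch; the handlers just pretty-print whatever
document you feed it). Entity expansion, encoding handling, and
well-formedness checking all come for free:

    use strict;
    use XML::Parser;

    my $depth = 0;
    my $parser = XML::Parser->new(Handlers => {
        # Called per start tag, with attributes already parsed.
        Start => sub {
            my ($expat, $element, %attrs) = @_;
            print '  ' x $depth, "<$element>\n";
            $depth++;
        },
        End  => sub { $depth--; },
        # Called for character data, with entity references expanded.
        Char => sub {
            my ($expat, $text) = @_;
            print '  ' x $depth, "$text\n" if $text =~ /\S/;
        },
    });

    $parser->parsefile($ARGV[0]);   # any well-formed XML file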
>
>> Either you do
>> without compatibility between systems or you sacrifice a bit of speed.
>
>Only a tiny smidgen... many CPUs can do network / host byte order
>translation in a single instruction; compare that to the time taken for an
>XML parser to locate a text node by stepping through to find delimiters,
>then removing the whitespace and converting the UTF-8 decimal text to an
>integer in host byte order...
You're thinking in terms of saving CPU cycles. In the Real World, most of the
time isn't spent executing (cheap) instructions; it's spent by (expensive)
people trying to solve "impedance mismatches" between two organizations'
notions of what their data should look like. Once again, XML has few
advantages in systems that are entirely under the control of a lone coder.
The advantages come into play when you have to exchange data with multiple
organizations; sure, writing low-level conversion routines to handle it is fun
geek-toy stuff, but you're ignoring the non-geeky stuff like maintaining and
documenting all those conversion routines.
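For what it's worth, here is what the two decoding paths being compared
amount to in Perl (a sketch; the element name is invented). The XML path
costs a few extra string operations per field, which is exactly the kind of
expense that disappears next to the people-time described above:

    use strict;

    # Binary path: one unpack of a network-order 32-bit integer.
    my $wire = pack('N', 1234);
    my $from_binary = unpack('N', $wire);

    # XML path: find the text node, trim whitespace, numify.
    # (A real program would use a parser; a regex is enough to
    # show the extra steps involved.)
    my $doc = "<temperature>\n  1234\n</temperature>";
    my ($text) = $doc =~ m{<temperature>\s*(.*?)\s*</temperature>}s;
    my $from_xml = $text + 0;

    print "$from_binary $from_xml\n";   # prints: 1234 1234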
And let's not forget the *social* aspects (the ultimate non-geeky stuff) of
data interchange. When several unrelated organizations, or even departments
within an organization, need to exchange data, there's an enormous advantage
to using a data format that was created by a third party rather than by one of
the players, namely that there's no rivalry over *which* player gets to create
the format. Again, if one party could simply impose a format by fiat,
everything would be cool, but in real life, if you don't get full "buy in"
from all the players, you're going to see a lot of friction (usually in the
form of "creative incompetence" where everybody's implementations differ in
slight but important details) that will dissipate a lot of energy as heat.
Yes, this falls into the realm of what hardcore geeks would call "touchy-feely"
stuff, but the fact is that psychological/verbal/non-quantitative/
stereotypically-female/"touchy-feely" considerations play important roles in
any real-life human endeavor involving more than one person, and the fact that
one might be more comfortable with bits and chips than with human interactions
doesn't change that reality.