Is it because I've been doing research in binary XML for the past few
months, or do these questions really occur more frequently than they
used to? From what I gathered at the recent conferences I've been to,
it may be a result of the boom in web services, but I'm not sure I have
sufficient data to assert that with certainty.
Gerben Rampaart (Casnet Mechelen) wrote:
> Isn't it a fact that the transport of both HTML and XML (especially XML)
> over HTTP is a bandwidth-absorbing thing? Big XML files that come from (for
> example) web services or simply B2B or B2C data transfer over the Internet
> take a lot of time to send and receive, which makes connections and overall
> communications slower. This could be considered a problem, right? XML is not
> designed for a slow Internet. Is this correct?
Yes, that is correct. It doesn't apply solely to HTTP transmission but
to any kind of transmission where bandwidth may be limited.
Leaving aside the memory problems related to holding potentially
large trees in memory, there are two factors that make XML difficult to
use in certain situations: bandwidth, and CPU (the cost of parsing is
high on constrained terminals).
This of course rarely affects desktop users, who usually have "enough"
bandwidth and more CPU than you can safely shake a neural wand at. There
are, however, quite a number of situations in which either there is "too
much" XML (eg SOAP) or insufficient bandwidth and/or processing power
(constrained terminals such as set-top boxes, cell phones, embedded
systems...), and XML's verbosity becomes a problem.
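The verbosity is easy to see for yourself: tag names repeat for every
record, so a document of many similar records is mostly markup. A minimal
sketch (hypothetical payload, using Python's standard gzip module) showing
both the raw size and what generic compression recovers:

```python
import gzip

# A small, hypothetical sensor reading encoded as XML: the tag names
# repeat for every record, which is where most of the verbosity lives.
record = b"<reading><sensor>temp-01</sensor><value>21.5</value></reading>"
doc = b"<readings>" + record * 1000 + b"</readings>"

compressed = gzip.compress(doc)

print(len(doc))         # raw XML size in bytes (tens of kilobytes here)
print(len(compressed))  # gzipped size: far smaller on repetitive markup,
                        # but compressing and decompressing both cost CPU
```

Generic compression squeezes out the redundancy, but note that it only
addresses the bandwidth side of the problem: the receiver still has to
decompress and then parse the full XML text.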
Arguably, those are situations in which one could say that XML shouldn't
have been used to start with. However the people that develop such
systems have been sold on XML, and I'm finding that they have good
reasons to use XML more often than I expected. XML has a huge toolset,
many developers that know how to manipulate it, (sufficiently)
well-defined APIs, leads to formats that are far easier to maintain and
extend than ad hoc binary formats, encourages interoperability, etc.
That's why they want to stay with XML as long as possible in their
production pipeline, and only use something different on the wire.
> Which brings me to XML (let's not include HTML now, since the HTML
> transferred over the Internet (like a Web Page) is usually not so large) ...
> Since bandwidth is a way more precious 'good' than CPU cycles, can W3C come
> up
> with a standard that compresses XML files before sending and decompresses
> when arriving. This will cost more CPU cycles but (much) less bandwidth.
If you are thinking of platform-independent compression methods such as
gzip, those are already widely available. You mention HTTP
transmission: HTTP has built-in compression. However, as you point out,
that solves the bandwidth problem but not the CPU problem. Even on a
powerful desktop, you don't want to have to decompress and parse XML
metadata embedded in a video stream, as you need all the power you have
to decode and display the video.
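HTTP's built-in compression is just content negotiation: the client
advertises what it accepts, the server labels what it sends. A minimal
sketch of that handshake (hypothetical payload, header handling only,
no real network I/O):

```python
import gzip

# Hypothetical XML payload to be sent over an HTTP-style exchange.
xml_body = b"<order><item sku='A-1' qty='2'/></order>"

# Server side: compress only if the client advertised gzip support
# via the Accept-Encoding request header.
request_headers = {"Accept-Encoding": "gzip"}
if "gzip" in request_headers.get("Accept-Encoding", ""):
    response_headers = {"Content-Encoding": "gzip"}
    wire_body = gzip.compress(xml_body)
else:
    response_headers = {}
    wire_body = xml_body

# Client side: undo the transfer encoding before handing the XML
# to a parser -- this decompression step is pure CPU cost.
if response_headers.get("Content-Encoding") == "gzip":
    received = gzip.decompress(wire_body)
else:
    received = wire_body
```

The round trip is transparent to the XML toolchain, which is exactly why
it helps with bandwidth and not at all with parsing cost.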
> Obviously the standard (specification) should contain a platform
> independent way of compression.
There is such a standard specification: it's called BiM and is part of
MPEG-7 Systems.
> Now, before everyone jumps on me. Yes, I know this will run into problems
> with Security, because what would the compressed file contain? Will it be
> XML or a malicious script?
That's no more possible there than it is with unencoded XML.
> This was just a thought, maybe an impossible one, but I was wondering what
> the technical reasons are for this not to exist.
There are none; this exists :)
Robin Berjon <email@example.com>
Research Engineer, Expway
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488