OASIS Mailing List Archives



RE: Textual transmission typically faster?

>Network latency, flawed application design (e.g., not pooling database
>connections), poorly-designed databases, and poorly-optimized queries all
>contribute a greater performance penalty than the difference between using
>text or binary message formats.

This is true, but these are all identifiable and correctable flaws. The
question is whether, in a system that is optimal in other respects, the
performance hit of using XML (text) can be justified.

On Paul's point:
"In one example, it seems people are using XML as an abstraction
between the application and database more and more.  This puts a
strain on the throughput for these types of applications where
database activity is high.

Which is not to say that a binary format would solve this issue, but
it would help."

The throughput issue is a valid concern, but not a major problem in my
opinion. On XML as an abstraction between app and DB: most DBs at
present are relational, and the data doesn't really need much in the way of
tree structure (any object-relational mapping will usually be on the
application side). This means that the overheads demanded by a tree
representation (i.e. DOM) aren't needed, and serial translation with e.g.
SAX is low-cost and, in effect, scales linearly with the size of the data.
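As a rough illustration of the SAX point, here is a minimal Python sketch that reads flat, row-shaped XML one event at a time and hands each row to a callback, without ever building a tree. The element names (rows, row, id, name) and the helper parse_rows are made up for the example; only xml.sax itself is a real library.

```python
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Turns flat <row> elements into dicts, one at a time, with no DOM."""

    def __init__(self, on_row):
        super().__init__()
        self._on_row = on_row  # called once per completed row
        self._row = None       # dict for the row being read, if any
        self._field = None     # name of the column element being read
        self._text = []

    def startElement(self, name, attrs):
        if name == "row":
            self._row = {}
        elif self._row is not None:
            self._field = name
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "row":
            self._on_row(self._row)
            self._row = None
        elif name == self._field:
            self._row[name] = "".join(self._text)
            self._field = None


def parse_rows(xml_bytes):
    """Hypothetical helper: collect every row into a list."""
    rows = []
    xml.sax.parseString(xml_bytes, RowHandler(rows.append))
    return rows


SAMPLE = (b"<rows><row><id>1</id><name>Ann</name></row>"
          b"<row><id>2</id><name>Bob</name></row></rows>")

if __name__ == "__main__":
    print(parse_rows(SAMPLE))
```

The handler only ever holds one row in memory, so the work grows with the number of rows rather than with the depth or size of a tree. A real consumer would pass its own callback instead of collecting everything into a list.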

There are two potential problems with using text-based data: processing
overheads (for conversion) and the need for greater bandwidth/storage. As far
as conversion is concerned, while RDBMSs predominate XML is OK because the
structure of the XML will be simple. If and when object DBs become the norm,
this will still be OK as the mapping will be trivial on the app side. As far
as bandwidth/storage is concerned, a lot of existing systems already accept
the extra cost of using text (balanced against simplified conversion), e.g.
the use of HTML on the web.

In my opinion the interoperability afforded by XML far outweighs any minor
performance hit, though this depends entirely on sensible implementation:
you don't want to be converting to and from XML several times along a data
path.

The introduction of a binary format raises the spectre of coupling between
sections of a system. Using XML means there is a clearly defined interface,
so any changes to individual parts of a system can be carried out in
relative isolation.

The idea of using binary formats seems reasonable where conversions can be
identified as causing a bottleneck, but I'd suggest that the abstraction
model of XML be maintained, i.e. the internal structure of the binary maps
one-to-one onto XML. This shouldn't add much weight to conversion, and it
would retain interoperability and a certain degree of transparency,
minimising the performance cost of conversion for external
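To make the one-to-one idea concrete, here is a small Python sketch of a binary layout that mirrors the XML element structure exactly, so converting in either direction is a mechanical walk. The layout (length-prefixed tag, length-prefixed text, child count, then children) is invented for this example and is not any existing format; it ignores attributes and mixed content, and makes no claim about size or speed.

```python
import struct
import xml.etree.ElementTree as ET


def encode(elem):
    """Hypothetical layout: tag, text, child count, then each child in order."""
    tag = elem.tag.encode("utf-8")
    text = (elem.text or "").encode("utf-8")
    children = list(elem)
    out = struct.pack(">H", len(tag)) + tag
    out += struct.pack(">I", len(text)) + text
    out += struct.pack(">H", len(children))
    for child in children:
        out += encode(child)
    return out


def decode(buf, pos=0):
    """Inverse of encode(); returns (element, position after the element)."""
    (tag_len,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    tag = buf[pos:pos + tag_len].decode("utf-8")
    pos += tag_len
    (text_len,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    text = buf[pos:pos + text_len].decode("utf-8")
    pos += text_len
    (child_count,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    elem = ET.Element(tag)
    elem.text = text or None
    for _ in range(child_count):
        child, pos = decode(buf, pos)
        elem.append(child)
    return elem, pos


if __name__ == "__main__":
    source = "<row><id>1</id><name>Ann</name></row>"
    blob = encode(ET.fromstring(source))
    restored, _ = decode(blob)
    # The binary form converts back to the XML it came from.
    print(ET.tostring(restored, encoding="unicode") == source)
```

Because every binary record corresponds to exactly one element, an external system that only speaks XML can still be served by a straightforward conversion at the boundary.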

An example of a binary format reducing interoperability is MS compressed
HTML. This is used in some Windows help systems, but is out of reach of
other systems such as JavaHelp. It seems unlikely, however, that MS came up
with this format with efficiency in mind, copyright paranoia perhaps being a
more likely motivation.