Mike Champion wrote:
>[Quoting from Bray, St. Laurent, and de Hora ...]
>>>>Pelegri-Llopart said "The main point here is there is almost an order
>>>>of magnitude between straightforward Web services using XML encoding
>>>>and an implementation that takes care of binary
>Is anyone disagreeing with that assertion? I hear it
>from a lot of apparently independent sources.
>Presumably we'll see data at the "Binary Infoset
>Serialization" workshop next week.
Just got back from some travels, so here's my belated contribution to
the thread, arbitrarily responding to this one post...
It's clear that transforming XML into alternative representations
can give substantially faster processing than conventional text
parsers deliver (at least for Java), and can also substantially
reduce document size. I haven't personally investigated the
innards of the parsers to see how well they've been structured for
performance. This is an area that's gotten a lot of attention over the
last few years, though. Given that there are some very sharp people
working in the parser area, I suspect performance is approaching optimal,
and I've based my investigations on that assumption.
I have a hard time communicating with the hard-core text backers who
appear to see any transformation of XML (other than gzip, which
apparently is blessed by virtue of predating XML itself) as inherently
evil. It's all just bits on the wire (or voltage levels, or photons, or
...), after all, and if a transform that's based on XML structure can
deliver good performance results it seems worth a look. My personal
preference is for transforms that preserve the XML Infoset, but
Schema-based transformations such as the one used by Sun can probably
get about twice the overall performance of Infoset-preserving transforms
such as my own XBIS (http://xbis.sourceforge.net) for SOAP-type
applications (assuming relatively heavy use of numeric values). Both
types are probably worth considering.
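To make the point about numeric values concrete, here is a rough sketch in Python of why numeric-heavy documents are where a binary encoding has the most room to help. The payload, the `<v>` element name, and the choice of big-endian doubles are all assumptions made up for illustration; this is not the encoding used by XBIS or by Sun, and it says nothing about parse speed or the factor-of-two estimate above. It only compares byte counts for the same values.

```python
import struct

# Hypothetical payload: 1000 double-precision readings (illustrative only,
# not data from any of the measurements discussed in this thread).
values = [i * 0.1234567 for i in range(1000)]

# Text form: each value wrapped in an element, roughly as a SOAP-style
# array might carry it. The element name "v" is an assumption.
text_form = "".join(f"<v>{v!r}</v>" for v in values).encode("utf-8")

# Binary form: the same values packed as big-endian IEEE 754 doubles,
# 8 bytes each, with no markup.
binary_form = struct.pack(f">{len(values)}d", *values)

# Unpacking recovers exactly the same floats, so nothing is lost.
restored = list(struct.unpack(f">{len(values)}d", binary_form))

print(len(text_form), len(binary_form), restored == values)
```

A real format also has to carry element names and structure, so the gap for a whole document will be smaller than for the bare values.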
Either way, if a tool is available that converts the transformed
document back into XML that's equivalent (modulo canonicalization) to
the original document, does the fact that the transformed representation
uses funny bit fields or binary values really matter? It seems to me
that it's just another processing step in a pipeline.
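The round-trip condition described above can be sketched as a simple check: encode, decode, and compare canonical forms. In this Python sketch gzip stands in for the transform purely because it ships with the standard library; `encode`, `decode`, and `round_trips` are hypothetical names, and a structure-aware binary format would slot into the same two functions. It assumes Python 3.8 or later for `xml.etree.ElementTree.canonicalize`.

```python
import gzip
from xml.etree.ElementTree import canonicalize


def encode(xml_text: str) -> bytes:
    """Stand-in transform (gzip); any reversible encoding could go here."""
    return gzip.compress(xml_text.encode("utf-8"))


def decode(blob: bytes) -> str:
    """Inverse of encode(): recover XML text from the transformed bytes."""
    return gzip.decompress(blob).decode("utf-8")


def round_trips(xml_text: str) -> bool:
    """True if decode(encode(x)) matches x after canonicalization."""
    return canonicalize(xml_text) == canonicalize(decode(encode(xml_text)))


original = '<order id="7" currency="USD"><item qty="2">widget</item></order>'
print(round_trips(original))

# "Modulo canonicalization" means differences such as attribute order and
# quote style do not count as differences between documents.
print(canonicalize('<a b="1" c="2"/>') == canonicalize("<a c='2' b='1'/>"))
```

Comparing canonical forms rather than raw bytes is what lets a transform drop or reorder details that carry no information, which is where an Infoset-preserving format gets part of its savings.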