   RE: [xml-dev] Microsoft FUD on binary XML...

  • To: "Michael Champion" <mc@xegesis.org>,<xml-dev@lists.xml.org>
  • Subject: RE: [xml-dev] Microsoft FUD on binary XML...
  • From: "Joshua Allen" <joshuaa@microsoft.com>
  • Date: Tue, 18 Nov 2003 16:49:56 -0800
  • Thread-index: AcOuMmILH2v7529KSZWl29Cgb3n3PQAAuIKg
  • Thread-topic: [xml-dev] Microsoft FUD on binary XML...

> -----Original Message-----
> From: Michael Champion [mailto:mc@xegesis.org]
> Sent: Tuesday, November 18, 2003 4:15 PM
> To: <xml-dev@lists.xml.org> <xml-dev@lists.xml.org>
> Subject: Re: [xml-dev] Microsoft FUD on binary XML...
> On Nov 18, 2003, at 6:32 PM, Joshua Allen wrote:
> > - people gripe about parse speed of XML as if it will be faster when
> > it's binary. I think this is incorrect from two perspectives -- first
> > is that we have shown XML-oriented protocols to be faster than binary
> > in many cases, and second because there is still tons of room for
> > improvement in text parsing speeds (the fact that gen 1 of XML parsers
> > is slow simply proves that they are gen 1 parsers, not that text is
> > inherently slower than binary).
> >
> Hmm.  At the binary XML workshop [yeah, yeah, "binary serialization of
> the XML Infoset"], lots of people were talking about XML being 10x
> slower than comparable binary technologies.  (Mind you, I personally
> think this is a very reasonable price to pay in most circumstances,
> but I would like to get the facts straight).
> Can you point to anything public that shows XML-oriented protocols
> to be faster than binary?  Again I agree that XML parsing is seldom a
> bottleneck, so XML *applications* are often just as fast as binary
> ones, but I'm not so sure about "protocols."

Well, as I understand the examples I've seen, one of the problems with
these binary formats is that they tend to evolve, and the new
information embedded in the binary stream is not always added in a way
that is most amenable to efficient parsing.  So the idea of "fast
binary" made sense at V1 of the protocol, but things become a mess over
time and many of the supposed benefits of binary parsing are eroded by
all the special-casing code.
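
To illustrate the kind of drift described above, here is a minimal
Python sketch.  The record layout, field names, and version numbers are
all hypothetical, invented for illustration, and not taken from any real
protocol.  It only shows how a fixed V1 layout that could be read in a
single unpack call picks up branches as later versions append fields.

```python
import struct


def parse_record(buf: bytes) -> dict:
    """Parse one record of a hypothetical binary format that grew over time."""
    # Every record starts with a 1-byte version number.
    (version,) = struct.unpack_from("<B", buf, 0)
    offset = 1

    # V1: fixed layout, a 4-byte id followed by a 2-byte quantity.
    rec_id, qty = struct.unpack_from("<IH", buf, offset)
    offset += 6
    record = {"version": version, "id": rec_id, "qty": qty}

    if version >= 2:
        # V2 (assumed) appended a length-prefixed UTF-8 name, so the
        # record is no longer fixed-size.
        (name_len,) = struct.unpack_from("<B", buf, offset)
        offset += 1
        record["name"] = buf[offset:offset + name_len].decode("utf-8")
        offset += name_len

    if version >= 3:
        # V3 (assumed) appended a flags byte; bit 0 says whether an
        # optional 8-byte timestamp follows.
        (flags,) = struct.unpack_from("<B", buf, offset)
        offset += 1
        if flags & 0x01:
            (ts,) = struct.unpack_from("<Q", buf, offset)
            offset += 8
            record["timestamp"] = ts

    return record
```

Each later version adds a branch and a variable-length step to what
started as one fixed-size read, which is the special-casing cost in
question.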

> On the "gen 1-ness" of XML parsers, that was a very good point I
> learned about at the workshop. On the other hand, many of the
> optimizations to produce significant speedups depend on a shared
> schema.  I have a philosophical question:  If an XML distributed

I'm definitely not talking about optimizations that require a shared
schema.  We got pretty dramatic improvements in parse speed between V1
and V2 of our parsers (caveat: not shipping, so no official promises,
but...)  And this is pure text parsing; and as far as I have seen, our
V1 parser was pretty fast compared to most of the parsers out there that
people complain about being slow.

> just tend to see XML as *only* an interchange format between
> objects and applications rather than something that would be natively
> stored, processed, displayed, etc. in an application-neutral or
> schema-neutral manner.  I suspect the party line will change when

Well, hold on -- you are aware that MSFT has been shipping a
parse-speed-optimized binary format for XML since SQL 2000 shipped, so
it is not as if MSFT is totally opposed to native binary encoding of
XML.  My point is simply that "native binary encoding" is not an interop
scenario.  It's a very valid issue for specific client scenarios, server
scenarios, and so on -- but it's totally counterproductive to interop.
To me the choice of native storage format and encoding is very specific
to an application scenario and will always be optimized regardless of
what standards anyone tries to invent anyway.  For example, I have seen
people recently asserting that one should design exclusively for REST
communications with a local storage engine, which IMO is just insane --
REST is great for interop, but on my local box I want tight coupling.
Same goes for XML-text; it is great for interop, great for hierarchical
access to data, semistructured/document data and so on -- but sometimes
you want something more tightly coupled.  Nothing wrong with that, but
just don't call it interop.

