Elliotte Rusty Harold wrote:
>At 3:01 PM -0800 11/20/03, Dennis Sosnoski wrote:
>>I think these are all important concerns. I do find it a little
>>baffling that so many people recognize (1) as a valid concern and
>>willingly endorse using gzip transformations of XML documents to
>>address it, while refusing to recognize (2) and (3) as valid
>>concerns or accept other types of transformations of XML documents.
>I think it comes down to a layering issue. gzip can be applied to the
>binary stream. To a large extent, this does not affect or change the
>XML format at all. It's simply a different binary encoding of text
>data, and the text data is what is real. gzip knows nothing about XML
>and doesn't need to. 2 and 3 have to understand the XML document as
>an XML document to operate. That's a horse of a very different color.
>Consider what happens when parsing: if I have a gzipped document I
>first decode it into genuine XML and pass that into an XML parser. If
>I have an ASN.1 or Sosnoski format document, I use a different parser
>to decode the data directly into different objects. I neither create
>XML nor use an XML parser.
Good point, but (at least in the case of the XBIS format) the
application still sees a standard SAX parser interface. It would even be
possible to generate text directly from the XBIS format and parse that
with your parser of choice, though doing so would obviously eliminate
the performance benefits.
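
To make the point about the SAX interface concrete, here is a minimal sketch of application code that depends only on org.xml.sax.XMLReader and ContentHandler. The class and method names (SaxSourceSketch, ElementCounter, countElements, textReader) are made up for illustration, and the only reader shown is the JDK's text parser. The assumption is that a binary-format reader implementing the same XMLReader interface could be passed to countElements() unchanged; this is not the actual XBIS API.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSourceSketch {

    /** Counts start-element events; it has no idea where the events come from. */
    static class ElementCounter extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            count++;
        }
    }

    /** Application code: written against the XMLReader interface only. */
    static int countElements(XMLReader reader, InputSource source) throws Exception {
        ElementCounter counter = new ElementCounter();
        reader.setContentHandler(counter);
        reader.parse(source);
        return counter.count;
    }

    /** The JDK's text parser. A reader for a binary format would be swapped in here (assumption). */
    static XMLReader textReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        InputSource in = new InputSource(new StringReader("<order><item/><item/></order>"));
        System.out.println(countElements(textReader(), in));
    }
}
```

The handler never touches the byte stream, which is why the encoding underneath can change without the application noticing.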
My original intention with the XBIS format was to provide an efficient
way to move document model representations between application
components. It's much more performant than text for this purpose. It was
interesting to add support for a stream interface as well, so that the
sender can use an event stream (from parser or other source) as input
and the receiver can get an event stream as output. That makes it a
general-purpose alternative to using text for communicating Infosets
between components. I think this is an important area of concern,
especially as the use of what might be called "embedded" XML grows.
When dealing with the outside world many applications want to use XML
for all the standard reasons. In reality, though, it's only the Infoset
that they care about. Once the data has been brought into the
application I'd argue that there's little or no benefit to using actual
XML text for moving the document Infoset between components. Consider
the case of a Session EJB returning an Infoset that's going to be used
by a JSP output page, for instance - I can see no plausible benefits
from using XML text for the transfer from the EJB to the JSP, and many
drawbacks (starting with the loss of typing that comes from the EJB
returning a String that's actually the text of an XML document).
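
A rough sketch of the typing difference, using plain interfaces rather than real EJBs. TextReportService, ModelReportService, and the "report" document are hypothetical, and the sketch assumes an in-process call, so it ignores what a remote interface would need for serialization.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class TypedInfosetSketch {

    /** Text-based contract: the signature only says "String". */
    interface TextReportService {
        String getReport() throws Exception;
    }

    /** Model-based contract: the return type says a document is handed over. */
    interface ModelReportService {
        Document getReport() throws Exception;
    }

    static DocumentBuilder builder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    static final TextReportService TEXT_SERVICE = () -> "<report total=\"42\"/>";

    static final ModelReportService MODEL_SERVICE = () -> {
        Document doc = builder().newDocument();
        Element report = doc.createElement("report");
        report.setAttribute("total", "42");
        doc.appendChild(report);
        return doc;
    };

    /** The consumer has to parse the text again before it can use the data. */
    static String totalFromText(TextReportService service) throws Exception {
        Document doc = builder().parse(new InputSource(new StringReader(service.getReport())));
        return doc.getDocumentElement().getAttribute("total");
    }

    /** The consumer uses the model directly; no serialize/parse round trip. */
    static String totalFromModel(ModelReportService service) throws Exception {
        return service.getReport().getDocumentElement().getAttribute("total");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(totalFromText(TEXT_SERVICE));
        System.out.println(totalFromModel(MODEL_SERVICE));
    }
}
```

Both paths yield the same value; the text path just pays for an extra serialize and parse step and gives the compiler nothing to check.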
Web services are a murkier area than my EJB to JSP example, but one
where I also see a use for either XBIS-like or ASN.1-style alternatives.
It'd be great if Web services could state that they accept text, gzip,
XBIS-like, or ASN.1-like formats and let the client choose how to
communicate. That way clients could use text for convenient debugging,
then switch to an alternative for better runtime performance.
The actual service code would use a single interface in all cases, with
the transport differences handled by the framework. Seems like an ideal
tradeoff to me.
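
A minimal sketch of that arrangement, assuming a hypothetical Codec interface and a format name supplied by the client. Only "text" and "gzip" are implemented, using java.util.zip; an XBIS-like or ASN.1-like codec would be one more entry in the map. A String stands in for the document purely to keep the sketch short; with a binary codec the framework would more likely hand the service SAX events or a document model.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class TransportChoiceSketch {

    /** Hypothetical wire codec: document to bytes and back. */
    interface Codec {
        byte[] encode(String xml) throws IOException;
        String decode(byte[] wire) throws IOException;
    }

    static class TextCodec implements Codec {
        public byte[] encode(String xml) { return xml.getBytes(StandardCharsets.UTF_8); }
        public String decode(byte[] wire) { return new String(wire, StandardCharsets.UTF_8); }
    }

    static class GzipCodec implements Codec {
        public byte[] encode(String xml) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(xml.getBytes(StandardCharsets.UTF_8));
            }
            return out.toByteArray();
        }

        public String decode(byte[] wire) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(wire))) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = gz.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Formats the service advertises; a binary codec would be registered here too. */
    static final Map<String, Codec> CODECS = new HashMap<>();
    static {
        CODECS.put("text", new TextCodec());
        CODECS.put("gzip", new GzipCodec());
    }

    /** The actual service code: one interface, unaware of the wire format. */
    static String service(String requestXml) {
        return "<echo>" + requestXml + "</echo>";
    }

    /** Framework layer: picks the codec the client asked for, then calls the service. */
    static byte[] dispatch(String format, byte[] wire) throws IOException {
        Codec codec = CODECS.get(format);
        if (codec == null) {
            throw new IllegalArgumentException("unsupported format: " + format);
        }
        return codec.encode(service(codec.decode(wire)));
    }

    public static void main(String[] args) throws IOException {
        Codec gzip = CODECS.get("gzip");
        byte[] reply = dispatch("gzip", gzip.encode("<ping/>"));
        System.out.println(gzip.decode(reply));
    }
}
```

A client debugging a problem would call dispatch("text", ...) and read the bytes directly, then switch the format name once things work.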
Dennis M. Sosnoski
Enterprise Java, XML, and Web Services Support
Redmond, WA 425.885.7197