Michael Champion wrote:
> Let me summarize what I took away from numerous presentations and
> discussions of this subject at XML 2004.
This is generally accurate, but allow me to alter a few points.
> - There is a community of people who wish to leverage much of what the
> world thinks of as "XML", including SAX, DOM, XSLT, XSD, and the
> software and documentation that support these things, but find that in
> practice the XML syntax is too verbose (and/or resource intensive to
> process) in their domain.
In truth, it goes beyond integration into the XML stack with its APIs,
tools, specs, etc. (even though that is of course a huge part of it).
Those communities also insist adamantly on much more intrinsic
properties of XML, such as its genericity or its support for open
content (to pick just two). They want the more fundamental goodness
that made the ecosystem possible. Of course trade-offs have to be made,
since you can't gain much without sacrificing a little, but the idea is
to keep as many of the original properties as possible.
In fact if you tell them to forget about XML entirely and name the
properties that they would absolutely 100% require of a file format,
without which it would not be usable to them, you end up with XML plus
some efficiency and minus human readability/editability. The sheer
amount of overlap (and the absence of extraneous features) is quite
striking. I hope the XBC's final documents will make that clear.
> <?xml version="1.0" encoding="W3CBinaryXML"?>
> [binary gibberish I have no software to process but others will]
That has been given some thought, but I think everyone now agrees that
it's an ugly hack :)
> - Binary serializations of the XML infoset have already been created
> that are capable of pretty decent compression or parsing
> performance. See the citations in the XML 2004 papers that are
> online. There are plenty of academic and quasi-academic papers on
> this. The interesting question is whether any can get sufficiently
> better compression AND performance (and a bunch of other attributes)
> than XML text to make it worthwhile for a wide range of uses. The
> Binary XML Characterization WG is defining the criteria by which this
> might be determined.
It is known that you can get both better compression and better
performance. The XBC cannot publish the measurements (the W3C isn't
Consumer Weekly) but we'll list a series of formats that display such
attributes and a list of properties that can be measured against them by
third parties to their hearts' content.
> - "Binary XML" is happening, whether that is an oxymoron or not.
> There are well over a dozen format proposals that have been made
> publicly available, and probably dozens more that have not. For
> example, I recall Michael Rys saying at XML 2003 that SQL Server 2005
> uses a proprietary binary encoding internally to store XML compactly
> and in a way that is efficiently processed with XML APIs or serialized
> into XML text. I suspect that many other XML DBs do something
> similar. I believe that some XML hardware middleware vendors do as
> well. Many of these are conceptually serializations of SAX event
> streams, so they have a deep "XML" heritage and easy integration with
> applications that work with SAX parsers.
Yes, though the problem is not at all with the ones that are internal
(tools like PerlSAX or Cocoon have also had those forever) but with the
ones that are used for interchange.
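
To make the "serialization of a SAX event stream" idea concrete, here is a
minimal sketch in Python. Everything about the layout is invented for
illustration (one-byte record tags, a shared name table, length-prefixed
UTF-8 strings), as are the names BinaryEventWriter, encode and read_events.
It is not any of the proposed formats, and it ignores namespaces, comments
and processing instructions.

```python
import io
import struct
import xml.sax

START, END, TEXT = 1, 2, 3  # record tags invented for this sketch


class BinaryEventWriter(xml.sax.ContentHandler):
    """Writes each SAX event as a tagged binary record (hypothetical format)."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.ids = {}  # name -> id, so a repeated name costs two bytes

    def _string(self, text):
        data = text.encode("utf-8")
        self.out.write(struct.pack(">I", len(data)) + data)

    def _name(self, name):
        if name in self.ids:
            self.out.write(struct.pack(">H", self.ids[name]))
        else:
            self.ids[name] = len(self.ids) + 1
            self.out.write(struct.pack(">H", 0))  # 0 means "new name follows"
            self._string(name)

    def startElement(self, name, attrs):
        self.out.write(bytes([START]))
        self._name(name)
        self.out.write(struct.pack(">H", len(attrs)))
        for key in attrs.getNames():
            self._name(key)
            self._string(attrs.getValue(key))

    def endElement(self, name):
        # The reader tracks open elements, so no name is needed here.
        self.out.write(bytes([END]))

    def characters(self, content):
        self.out.write(bytes([TEXT]))
        self._string(content)


def encode(xml_bytes):
    """Parses XML text and returns the binary event stream."""
    out = io.BytesIO()
    xml.sax.parseString(xml_bytes, BinaryEventWriter(out))
    return out.getvalue()


def read_events(data):
    """Decodes the records back into a list of event tuples."""
    buf = io.BytesIO(data)
    names, open_elements, events = [], [], []

    def string():
        (length,) = struct.unpack(">I", buf.read(4))
        return buf.read(length).decode("utf-8")

    def name():
        (ident,) = struct.unpack(">H", buf.read(2))
        if ident == 0:
            names.append(string())
            return names[-1]
        return names[ident - 1]

    while True:
        tag = buf.read(1)
        if not tag:
            return events
        if tag[0] == START:
            element = name()
            (count,) = struct.unpack(">H", buf.read(2))
            attrs = {}
            for _ in range(count):
                key = name()
                attrs[key] = string()
            open_elements.append(element)
            events.append(("start", element, attrs))
        elif tag[0] == END:
            events.append(("end", open_elements.pop()))
        elif tag[0] == TEXT:
            events.append(("text", string()))
        else:
            raise ValueError("unknown record tag %d" % tag[0])
```

Calling read_events(encode(...)) on a small document gives back the same
sequence of start, text and end events a SAX handler would have seen. That
is the easy part, and it is why this kind of thing works fine internally;
the sketch says nothing about how two independent parties would agree on
the layout.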
> - The really contentious issue is whether one or more of these formats
> should be standardized, and who should do the standardization (e.g.
> W3C or the wireless industry).
The problem here is that the dichotomy between wireless and PC is bogus,
or if it isn't yet in your part of the world it'll be very soon. Even
the US has caught up to the point of being less than two years behind
(which I guess just leaves us with France, where the tech is there but the
pricing structure is still stuck in the Neolithic). So it pretty much
boils down to the question of which standardization organization's output
you would rather deal with when it comes to XML technology (or,
for some here, which one you hate least).
> - Another point of contention is whether a binary XML encoding would
> undermine or enhance XML's interoperability and ubiquity. Elliotte,
> Uche, and others have vociferously made the "actively damaging to XML"
I have no doubt that "unreadable XML" has its downsides. I work with
both that and XML, and could certainly recall times when it has bitten me
in a way text wouldn't have. But you're going to get binary anyway, and
"unreadable XML" beats unreadable goop any day of the week. Without it,
it'll be like getting winmail.dat attachments en masse.