


   Re: [xml-dev] The Rising Sun: How XML Binary Restored the Fortunes of Inn


my 2c on this (again...)

1. there seem to be use cases where something other than "text" xml 
would be useful while still preserving the infoset
2. xml, per se, doesn't know about types (xsd does...) so anything that 
assumes "x" is a number has to be lossy
3. as far as i can tell most use cases are about transporting xml inside 
an application. eg sending to mobile phones, printers, etc - in which 
case there are better ways to do this
4. instead of a "binary" xml format - what about a "dom transport" 
format? this could be dense and lossless:

header - element dictionary, attribute dictionary, processing 
instructions (can't avoid sending these as text)
body - rle encoded infoset - element code/num attributes/attribute 1 
code/length/data/..../element length/data - and it's recursive (do you 
want the bnf?)
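
a rough sketch of the header/body layout above, in python for brevity. several things here are assumptions and not part of the proposal: "rle" is read loosely as length-prefixed, code 0 is reserved to mark a text run, codes and counts fit in one byte, lengths are fixed 4-byte big-endian, and processing instructions are left out entirely. the function names are made up for illustration.

```python
import struct
import xml.etree.ElementTree as ET

TEXT = 0  # assumed reserved code marking a text run (not in the original post)


def build_dictionaries(root):
    """Collect element and attribute names in first-seen order."""
    elems, attrs = [], []
    for node in root.iter():
        if node.tag not in elems:
            elems.append(node.tag)
        for name in node.attrib:
            if name not in attrs:
                attrs.append(name)
    return elems, attrs


def _blob(data):
    """Length-prefixed bytes: 4-byte big-endian length, then the data."""
    return struct.pack(">I", len(data)) + data


def encode_header(elems, attrs):
    """Header: element dictionary, then attribute dictionary (names as text)."""
    out = bytearray()
    for names in (elems, attrs):
        out.append(len(names))  # assumes fewer than 256 names
        for name in names:
            out += _blob(name.encode("utf-8"))
    return bytes(out)


def encode_text(text):
    """A text run: TEXT code, length, data. Empty text encodes to nothing."""
    if not text:
        return b""
    return bytes([TEXT]) + _blob(text.encode("utf-8"))


def encode_element(node, elems, attrs):
    """element code / num attributes / (attr code, length, data)... /
    element length / data, where data recursively holds text and children."""
    out = bytearray([elems.index(node.tag) + 1, len(node.attrib)])
    for name, value in node.attrib.items():
        out.append(attrs.index(name))
        out += _blob(value.encode("utf-8"))
    content = bytearray(encode_text(node.text))
    for child in node:
        content += encode_element(child, elems, attrs)
        content += encode_text(child.tail)
    out += _blob(bytes(content))
    return bytes(out)
```

usage would be something like: parse with `ET.fromstring(...)`, call `build_dictionaries(root)`, then send `encode_header(elems, attrs) + encode_element(root, elems, attrs)`. the element length prefix is what lets a receiver skip a whole subtree without looking inside it.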

in terms of bits on the wire this wouldn't have to be based on 8 bits, it 
could use the good old msb=1 coding to get better compression
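
for anyone who hasn't met it, a minimal sketch of msb=1 coding, assuming the usual reading: each byte carries 7 bits of the value, low-order group first, and a set top bit means another byte follows. the function names are made up for illustration, and python is used only for brevity.

```python
def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, msb=1 on all but the last."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)  # final byte, msb clear
            return bytes(out)


def decode_varint(buf, pos=0):
    """Decode one integer starting at buf[pos]; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7
```

so codes and lengths below 128 cost one byte each, which is the common case for element codes and short attribute values, and larger numbers grow only as needed.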

yes, this is a variation on gzip etc. the important difference is that 
the message is parsed before transmission and as a transmission ONLY 
format it does not have to allow for updates etc. it can be fed straight 
into a dom or sax tool without parsing overhead or decoded back to xml 
for storage or manipulation. as a byproduct our mobile friends win on 
transport bits and processing cycles. a closed application could 
potentially even drop the headers (or cache them if they are standardised).
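
to make the "fed straight into a sax tool" point concrete, here is a hedged sketch of a receiver that walks an already-encoded body and fires sax events with no text parsing. the byte layout is an assumption for illustration only: one-byte codes, code 0 reserved for a text run, 4-byte big-endian lengths, and the dictionaries already known to the receiver (cached or sent in a header). `replay` and `Recorder` are made-up names; `ContentHandler` and `AttributesImpl` are the standard python sax classes.

```python
import struct
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

TEXT = 0  # assumed reserved code for a text run


def _read_blob(buf, pos):
    """Read a 4-byte big-endian length and that many bytes of data."""
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    return buf[start:start + length], start + length


def replay(buf, pos, end, elems, attrs, handler):
    """Walk buf[pos:end] and fire SAX events for each element and text run."""
    while pos < end:
        code = buf[pos]
        pos += 1
        if code == TEXT:
            data, pos = _read_blob(buf, pos)
            handler.characters(data.decode("utf-8"))
            continue
        name = elems[code - 1]  # element codes start at 1
        nattrs = buf[pos]
        pos += 1
        found = {}
        for _ in range(nattrs):
            attr_name = attrs[buf[pos]]
            value, pos = _read_blob(buf, pos + 1)
            found[attr_name] = value.decode("utf-8")
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        handler.startElement(name, AttributesImpl(found))
        replay(buf, pos, pos + length, elems, attrs, handler)  # children
        handler.endElement(name)
        pos += length


class Recorder(ContentHandler):
    """Toy handler that just records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))
```

any existing `ContentHandler` could stand in for `Recorder`, which is the point: the application side does not need to know the message never existed as text.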

now there's a market for:

1. standard dom representation (?)
2. dom transport encoder
3. dom transport decoder

and in true open standard tradition companies can produce middleware for 
each of these.

there are a few flies in the ointment - mostly to do with java, but we 
can work on those. (just for the record i live in c world)


ps what i'm really saying is the effort should NOT be in binary xml 
(that implies update etc and a whole set of parallel or modified ws-* 
etc), but i think there is a compelling case for an xml 
transport format.

Michael Champion wrote:

>On Apr 7, 2005 12:37 PM, Elliotte Rusty Harold <elharo@metalab.unc.edu> wrote:
>>Far too much effort has been spent shouting supposed self-evident truths
>>about how much faster/smaller/sexier binary formats are. Little in the
>>way of evidence has been produced,
>
>As someone who has been (personally) receptive to the idea of "binary
>XML" technology and possibly even a single standard IF it could cover
>a wide range of cases, I must say that this was my biggest
>disappointment when reading the XBC final document. They developed
>decent use cases,  attributes to measure, and measurement
>methodologies, and collected a number of prototype formats and
>implementations, but didn't put them all together and generate some
>numbers.  Rather than those Yes/No entries in the table for how the
>formats met the criteria, someone could actually run tests and plug in
>real numbers.  I can only assume that those who know what the numbers
>would be also know that they do not support the case that a single
>"binary XML" format can be both small enough and fast enough (not to
>mention the 15 other attributes!) to make much of a difference. 
>Otherwise, why not publicize them and address the concerns that the
>TAG expressed?
>
>I fervently hope that the W3C doesn't move forward with a follow-on WG
>until that quantitative evidence is available to all concerned. Until
>then, we're back where we started - we know that some formats are
>dramatically faster OR smaller than XML for some set of use cases, but
>we don't have any reason to believe that one format can do it all
>across use cases, or even that one could hit a reasonable 80/20 point
>compared with XML. The idea of forming a WG to do some computer
>science here, trusting that an acceptable size/speed tradeoff is
>possible while maintaining XML API/tools/datamodel/etc. compatibility,
>is at best implausible at this point.
>
>>Personally, I'm not convinced that there isn't another factor of ten
>>performance gain to be had in the world of real XML parsing, though
>>doing that will probably require ditching DOM and SAX in favor of a more
>>performance tuned API. Still, I doubt we've yet hit the end of the line
>>when it comes to XML parsing algorithms. I could well be wrong about
>>that, but it would be truly ironic if two weeks after binary XML goes to
>>REC, some grad student somewhere releases a text parser that beats the
>>pants off the binary parsers.
>
>Noah Mendelsohn made that point in a very compelling way at the W3C
>binary workshop in September 2003.  He reiterated it on the TAG list
>the other day, noting that the mainstream Java XML tools were designed
>for conformance, not performance, and serious brainpower is only now
>beginning to be thrown at the problem of efficiently parsing XML and
>conveniently consuming the result.
>
>>I suspect this would be more likely to happen if companies like Sun
>>devoted their brain power to XML parsing algorithms rather than
>>inventing new formats.
>
>I think it's important to focus on people/problems/projects and not
>who they work for. MS as a whole is extremely skeptical about "Binary
>XML" standardization, but nobody has decreed that Binary XML is Evil
>and is off the radar as a solution.  Sun as a whole seems receptive to
>standardization, but  I doubt if anyone there has decreed that Binary
>XML is the One True Path and parser optimization is pointless.  Almost
>everyone I talk to on both sides of the philosophical divide (or the
>.NET / Java divide)  has an appreciation of the dilemmas, and I'll bet
>there are material Day Job rewards waiting for clever solutions to
>actual customer problems, whether they involve new formats, new
>algorithms/APIs, finding an efficient subset of XML that hits the
>99/01 point for their use cases, whatever. The most important thing
>IMHO is to not prematurely standardize on something we will all regret
>in a few years, but not to prematurely reject options based on
>preconceptions of one sort or another either.
>
>I do think that the burden of proof is on those who would break things
>that actually work without compelling evidence that would make us much
>better off down the road. To be fair to the XBC people,  until we
>really and truly see XML 1.0 working for the use cases that they laid
>out, I'm planning to keep an open mind about how to make the XML
>family of technologies work better for all the users and potential
>users.  For now, various "binary XML" technologies definitely have
>their place in specialized domains and scenarios, but nobody has come
>anywhere near demonstrating that a "binary XML" standard would do more
>good than harm.  For that matter, I don't see how it is plausible to
>believe that a consensus can be reached on a  format that trades off
>the needs of the wireless people for cheap compression and the needs
>of the enterprise messaging people for processing speed.

Rick Marshall
tel (cell): +61 411 287 530

