RE: [Xml-bin] RE: Another binary XML approach



I'll finally chime in on this issue, since it seems to have come down
to a question of how worthwhile binary/tokenized XML really is.

>avoid reinventing the wheel

Reinventing the wheel is always an issue, but there is a time and a
place.  Why XML at all, since you could just use SGML?  To steal from
Tim Bray's recent hit/miss presentation: one good reason to reinvent is
to adjust an existing standard to better 'hit' the 80% that matters.

>embedded devices, high-volume transactions, efficiency, compression ratio,

I'll rephrase this in a form that includes more quantifiable items:
- parser size/complexity
- parse time
- file-size

In a prototype of a tokenized XML format, the results came out to
approximately the following (ratios are textual XML to tokenized):
- parser size/complexity    roughly 10:2
- parse time                roughly 10:1
- file-size                 -10%

There was no compression in the new format.  The original file was
ASCII, and the 'binary' form was UTF-8, so these numbers are optimistic
for non-Anglo-centric documents.  I also had a version that used
UTF-16, which was faster (15:1) but produced larger documents (+60%).
These numbers are _very_ compelling, and I think they are enough to
warrant serious investigation and possibly standardization.
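
To make the idea of a tokenized format concrete, here is a minimal
sketch in Python.  The message above does not describe the prototype's
actual layout, so everything here is an assumption made for
illustration: the token codes (START, END, TEXT), the one-byte name
table indices, and the 16-bit length prefixes are all hypothetical.  It
shows only why a reader for such a stream can be small: there is no
character-level scanning, entity handling, or markup recognition.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical token codes; not taken from any real binary-XML proposal.
START, END, TEXT = 0x01, 0x02, 0x03

def _put_str(out, s):
    """Append a UTF-8 string with a 16-bit little-endian length prefix."""
    data = s.encode("utf-8")
    out.extend(struct.pack("<H", len(data)))
    out.extend(data)

def encode(xml_text):
    """Turn textual XML into a toy token stream preceded by a name table.

    Toy limits: at most 255 distinct names and 255 attributes per element.
    """
    names, body = [], bytearray()

    def name_id(name):
        if name not in names:
            names.append(name)
        return names.index(name)

    def walk(elem):
        body.extend((START, name_id(elem.tag), len(elem.attrib)))
        for key, value in elem.attrib.items():
            body.append(name_id(key))
            _put_str(body, value)
        if elem.text:
            body.append(TEXT)
            _put_str(body, elem.text)
        for child in elem:
            walk(child)
            if child.tail:
                body.append(TEXT)
                _put_str(body, child.tail)
        body.append(END)

    walk(ET.fromstring(xml_text))
    out = bytearray([len(names)])
    for name in names:
        _put_str(out, name)
    return bytes(out + body)

def decode(data):
    """Parse the token stream into nested (tag, attrs, children) tuples."""
    pos = 1  # byte 0 holds the number of entries in the name table

    def get_str():
        nonlocal pos
        (n,) = struct.unpack_from("<H", data, pos)
        pos += 2 + n
        return data[pos - n:pos].decode("utf-8")

    names = [get_str() for _ in range(data[0])]
    stack = [("#doc", {}, [])]
    while pos < len(data):
        token = data[pos]
        pos += 1
        if token == START:
            tag, nattrs = names[data[pos]], data[pos + 1]
            pos += 2
            attrs = {}
            for _ in range(nattrs):
                key = names[data[pos]]
                pos += 1
                attrs[key] = get_str()
            node = (tag, attrs, [])
            stack[-1][2].append(node)
            stack.append(node)
        elif token == TEXT:
            stack[-1][2].append(get_str())
        elif token == END:
            stack.pop()
        else:
            raise ValueError(f"unknown token {token:#x}")
    return stack[0][2][0]
```

Calling decode(encode('<a x="1"><b>hi</b>tail</a>')) round-trips the
element structure, attributes, and text.  The sketch ignores
namespaces, comments, and processing instructions, and says nothing
about the size or speed ratios quoted above.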

One of the significant aspects was that I could write a non-validating
parser for the tokenized format in less than a day.  Writing a fully
conformant non-validating XML parser is a much harder task.  There are
disadvantages to this: it means that every product group this side of
Pluto will author its own binary-XML parser, and many will be slightly
non-conformant.  On the plus side, it means that every product group
this side of Pluto will be using XML(-ish).  If the format is
extensible, so that it is possible to stick application-specific blocks
of data inline, then it will be much easier for groups to move from a
purely proprietary solution to an XML-centric solution.
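
One way such inline application-specific blocks could work is a
length-prefixed chunk that a generic reader can skip without
understanding it.  The sketch below is an assumption for illustration
only; the chunk type values (CHUNK_TEXT, CHUNK_APP), the header layout,
and the function names are hypothetical and do not come from any
proposal discussed in this thread.

```python
import struct

# Hypothetical chunk types; the message above does not define a layout.
CHUNK_TEXT = 0x03
CHUNK_APP = 0x7F  # opaque, application-specific payload

def write_chunk(kind, payload):
    """Serialize one chunk: 1-byte type, 4-byte length, then the payload."""
    return struct.pack("<BI", kind, len(payload)) + payload

def read_chunks(data, known=(CHUNK_TEXT,)):
    """Yield (kind, payload) for known chunk types, skipping all others."""
    pos = 0
    while pos < len(data):
        kind, size = struct.unpack_from("<BI", data, pos)
        pos += 5  # size of the "<BI" header
        if kind in known:
            yield kind, data[pos:pos + size]
        pos += size  # the length prefix lets us step over unknown chunks
```

A generic reader called with the default `known` tuple steps over the
CHUNK_APP payload, while an application that understands it can pass
known=(CHUNK_TEXT, CHUNK_APP) and receive every chunk.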

One worry I do have about a standard is that the format will bloat.  If
a standardization group does form, it should be a hard requirement that
a parser for this new format can conform to the 10:2 ratio I mention
above, or something close to it.  Feature creep must be fought tooth
and nail, or else there is no purpose in creating the new format.

derek denny-brown
-- Technical Lead: MSXML --
<http://msdn.microsoft.com/xml>

P.S. I am not speaking in any official capacity when I say any of this;
these are my personal opinions and should be regarded only as such.

 -----Original Message-----
From: 	Bullard, Claude L (Len) [mailto:clbullar@ingr.com] 
Sent:	Thursday, April 12, 2001 1:51 PM
To:	Olivier Dubuisson; Stefan Zier
Cc:	xml-dev@lists.xml.org; xml-bin@warhead.org.uk
Subject:	[Xml-bin] RE: Another binary XML approach


From: Olivier Dubuisson [mailto:Olivier.Dubuisson@francetelecom.com]

>avoid reinventing the wheel

Given the number of binary proposals and extant implementations, 
that can't be avoided.  A WML developer would claim ASN.1 and 
a French telco are reinventing the wheel.  A number of other 
companies probably are as well.

>embedded devices, high-volume transactions, efficiency, compression ratio,

No proof has been offered that the solutions for items one and two
converge in the qualities of items three and four.  Anecdotal evidence
has been offered by developers that the time spent in parsing and the
compression ratio gained do not warrant the effort.

>careful design) that my company (with others) supports what is described at:

That is not easy to quantify.  ASN.1 exists and your company and unnamed
parties support it.  So far, that is the evidence offered.  I have
looked at the documents at the sites listed.  So far, no quantitative
evidence to support the claims is offered.

I accept that there are those who assert the need for a standard
binary XML.  So far, that much is easy to discern.  No case has been
made for any approach, nor have requirements been offered that would
enable reasonable minds to choose among the approaches offered.  It is
fine to take this offlist, but at some point XML-Dev list members
might reasonably expect the results of the discussions to be
brought back for consideration.

Len
http://www.mp3.com/LenBullard

Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h
