RE: [Xml-bin] RE: Another binary XML approach
- From: Derek Denny-Brown <derekdb@microsoft.com>
- To: xml-bin@warhead.org.uk,Olivier Dubuisson <Olivier.Dubuisson@francetelecom.com>,Stefan Zier <Stefan.Zier@syntion.com>
- Date: Thu, 12 Apr 2001 14:42:14 -0700
I'll finally chime in on this issue, since it seems to have come down
to a question of how worthwhile binary/tokenized XML really is.
>avoid reinventing the wheel
Reinventing the wheel is always an issue, but there is a time and a
place. Why XML at all, since you could just use SGML? To steal from Tim
Bray's recent hit/miss presentation: one good reason to reinvent is to
adjust an existing standard to better 'hit' the 80% that matters.
>embedded devices, high-volume transactions, efficiency, compression ratio,
I'll rephrase this in a form that includes more quantifiable items:
- parser size/complexity
- parse time
- file-size
In a prototype of a tokenized XML format, the results relative to text
XML came out to approximately:
- parser size/complexity: roughly 10:2 (text XML : tokenized)
- parse time: roughly 10:1 (text XML : tokenized)
- file-size: about 10% smaller
There was no compression in the new format. The original file was
ASCII and the 'binary' form was UTF-8, so these numbers are optimistic
for non-Anglo-centric documents. I also had a version that used
UTF-16, which was faster (15:1) but produced larger documents (+60%).
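To make the encoding caveat concrete, here is a minimal Python sketch
that compares the encoded size of character data in UTF-8 and UTF-16.
The helper name and the sample strings are illustrative assumptions;
they are not taken from the prototype, and they measure text content
only, not markup overhead.

```python
def encoded_sizes(text):
    """Return the byte length of `text` in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Hypothetical sample strings: one pure ASCII, one outside the ASCII range.
ascii_sample = "price"
cjk_sample = "価格"

print(encoded_sizes(ascii_sample))  # ASCII: UTF-16 doubles the size
print(encoded_sizes(cjk_sample))    # CJK: UTF-8 is the larger encoding
```

For ASCII content UTF-8 costs nothing extra while UTF-16 doubles the
character data; for text outside ASCII the balance can tip the other
way, which is why the file-size figure depends on the test document.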
These numbers are _very_ compelling, and I think they are enough to
warrant serious investigation and possible standardization.
One of the significant aspects was that I could write a non-validating
parser in less than a day. Writing a fully conformant non-validating
XML parser is a much harder task. There is a disadvantage to this:
every product group this side of Pluto will author their own
binary-XML parser, and many will be slightly non-conformant. On the
plus side, every product group this side of Pluto will be using
XML(-ish). If the format is extensible, so that it is possible to
stick application-specific blocks of data inline, then it will be
much easier for groups to move from a purely proprietary solution to
an XML-centric solution.
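As an illustration of why a tokenized parser can stay small, here is a
minimal Python sketch of one possible layout. Everything in it is a
hypothetical assumption made for the example (the opcode values, the
up-front name table, the one-byte length prefixes, and the opaque
application block); it is not the prototype's actual format.

```python
import io

# Hypothetical opcodes, chosen for this sketch only.
START, END, TEXT, APP = 0x01, 0x02, 0x03, 0x04


def _read_chunk(stream):
    """Read a one-byte length prefix followed by that many bytes."""
    header = stream.read(1)
    if len(header) != 1:
        raise ValueError("truncated length prefix")
    data = stream.read(header[0])
    if len(data) != header[0]:
        raise ValueError("truncated chunk")
    return data


def parse(data):
    """Yield (event, value) pairs from a tokenized document."""
    stream = io.BytesIO(data)
    count = stream.read(1)
    if len(count) != 1:
        raise ValueError("missing name table")
    # Element names are stored once and referenced by index afterwards.
    names = [_read_chunk(stream).decode("utf-8") for _ in range(count[0])]
    depth = 0
    while True:
        op = stream.read(1)
        if not op:
            break
        code = op[0]
        if code == START:
            index = stream.read(1)
            if len(index) != 1 or index[0] >= len(names):
                raise ValueError("bad name index")
            depth += 1
            yield ("start", names[index[0]])
        elif code == END:
            if depth == 0:
                raise ValueError("unbalanced end token")
            depth -= 1
            yield ("end", None)
        elif code == TEXT:
            yield ("text", _read_chunk(stream).decode("utf-8"))
        elif code == APP:
            # Opaque application-specific bytes; a reader may ignore them.
            yield ("app", _read_chunk(stream))
        else:
            raise ValueError("unknown opcode %d" % code)
    if depth != 0:
        raise ValueError("unclosed element")


# One name ("item"), then <item>hi[3 opaque bytes]</item>.
doc = (bytes([1, 4]) + b"item" + bytes([START, 0]) + bytes([TEXT, 2]) + b"hi"
       + bytes([APP, 3]) + b"\x00\x01\x02" + bytes([END]))
events = list(parse(doc))
print(events)
```

Because every variable-length item carries its own length, a reader
that does not understand an application block can step over it without
interpreting the contents, which is the kind of extensibility described
above.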
One worry I do have about a standard is that the format will bloat. If
a standardization group does form, it should be a hard requirement
that a parser for the new format can meet the 10:2 ratio I mention
above, or something close to it. Feature creep must be fought tooth
and nail, or else there is no purpose in creating the new format.
derek denny-brown
-- Technical Lead: MSXML --
<http://msdn.microsoft.com/xml>
p.s. I am speaking in no official capacity when I say any of this;
these are my personal opinions and should be regarded only as such.
-----Original Message-----
From: Bullard, Claude L (Len) [mailto:clbullar@ingr.com]
Sent: Thursday, April 12, 2001 1:51 PM
To: Olivier Dubuisson; Stefan Zier
Cc: xml-dev@lists.xml.org; xml-bin@warhead.org.uk
Subject: [Xml-bin] RE: Another binary XML approach
From: Olivier Dubuisson [mailto:Olivier.Dubuisson@francetelecom.com]
>avoid reinventing the wheel
Given the number of binary proposals and extant implementations,
that can't be avoided. A WML developer would claim ASN.1 and
a French telco are reinventing the wheel. A number of other
companies probably are as well.
>embedded devices, high-volume transactions, efficiency, compression ratio,
No proof has been offered that the solutions for items one and two
converge in the qualities of items three and four. Anecdotal evidence
has been offered from developers that the time spent in parsing and
the compression ratio gained do not warrant the effort.
>careful design) that my company (with others) supports what is described at:
That is not easy to quantify. ASN.1 exists and your company and unnamed
parties support it. So far, that is the evidence offered. I have
looked at the documents at the sites listed. So far, no quantitative
evidence to support the claims is offered.
I accept that there are those who assert the need for a standard
binary XML. So far, that is easy to discern. No case has been made
for any approach, and no requirements have been offered, that would
enable reasonable minds to choose among the approaches offered. It is
fine to take this offlist, but at some point XML-Dev list members
might reasonably expect results of the discussions to be
brought back for consideration.
Len
http://www.mp3.com/LenBullard
Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h