- From: "Stephen D. Williams" <firstname.lastname@example.org>
- To: Jonathan Borden <email@example.com>
- Date: Fri, 26 Mar 1999 14:10:10 -0500
Jonathan Borden wrote:
> Tim Bray wrote:
> > At 08:54 PM 3/25/99 +0000, Dan Brickley wrote:
> > >Quite so. But there are still initiatives such as
> > >
> > > http://www.wapforum.org/docs/technical.htm
> > > http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf
> > I read some of it, and if you buy the idea that a binary form of XML
> > is useful, it seems quite sensible. I'm agnostic; if they think they
> > need it who are we to tell them they don't? Obviously it has to
> > round-trip with plain ole XML. -T.
> I think what this really is, when you strip out the concept of binary XML,
> is a suggestion for a compression format tuned for markup streams.
> There are two distinct issues 1) efficiency of parsing 2) compactness. A
> standard compression format for XML (ala zip,gzip etc) would be for
> bandwidth limited applications.
I agree. I feel they can be solved with a similar solution in at least some circumstances.
That said, some straightforward ways to achieve compression actually make parsing
efficiency worse, while some solutions for efficiency also make compression easier.
In fact there are a number of levels you could go with compression:
optional gzip/bzip2 possibly preceded by:
Dictionary compression (various forms of building a list of commonly used terms or all terms
in the current document/stream or some combination)
'Priming' for certain circumstances. For instance, I've long thought that an ideal design for
super high bandwidth circuits (TCP connection, message queue, special purpose) is to
start out in a raw state: once per connection/conversation, you send all of the XML or other
full self-describing data (a DTD is an expression of this), and possibly even a dictionary
built from past experience, and then highly compress the rest of the stream based on that
defined base. In some circumstances you could even have a base 'dictionary' stored on each
receiver to improve short messages.
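
As a rough illustration of the 'priming' idea, here is a minimal Python sketch that uses zlib's preset-dictionary support (the zdict argument). It is only one possible way to do it, not what WBXML or any other proposal specifies. The BASE_DICTIONARY contents and the function names compress_primed/decompress_primed are made up for the example; the assumption is that both ends already hold the same base.

```python
import zlib

# Hypothetical shared base: markup that sender and receiver both already know.
# (Assumed for illustration; in practice it might be derived from a DTD or
# from past traffic.)
BASE_DICTIONARY = (
    b'<?xml version="1.0"?><order id=""><customer></customer>'
    b'<item sku="" qty=""></item></order>'
)


def compress_primed(message: bytes, base: bytes = BASE_DICTIONARY) -> bytes:
    """Compress one message with the deflate window pre-loaded from `base`."""
    comp = zlib.compressobj(level=9, zdict=base)
    return comp.compress(message) + comp.flush()


def decompress_primed(blob: bytes, base: bytes = BASE_DICTIONARY) -> bytes:
    """Reverse compress_primed; the receiver must hold the same `base`."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(blob) + decomp.flush()


message = (
    b'<?xml version="1.0"?><order id="17"><customer>Acme</customer>'
    b'<item sku="A1" qty="3"></item></order>'
)
primed = compress_primed(message)   # uses the shared base
plain = zlib.compress(message, 9)   # no shared base, for comparison
```

Comparing len(primed) with len(plain) on short messages like this one shows where a stored base helps: a short message has little internal repetition to exploit, so most of the savings come from what the receiver already knows.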
Each further transaction could use all of the known information to compress in a layered way.
There are plenty of circumstances where a connection is made and many messages are sent,
sometimes millions per connection. I've had servers that normally handled 30-50 million
Both careful structuring of the data (a la bXML) and things like parallel inheritance deltas
play into this kind of optimization.
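
The many-messages-per-connection case can be sketched in a similar way: keep one compression context alive for the whole connection so that each message is encoded against everything sent before it. This is a sketch under assumptions, not a protocol definition. The class names ConnectionCompressor and ConnectionDecompressor and the sample quote messages are invented for the example, and framing (how the receiver knows where one frame ends) is left out.

```python
import zlib


class ConnectionCompressor:
    """One deflate context per connection; history carries across messages."""

    def __init__(self) -> None:
        self._comp = zlib.compressobj(9)

    def send(self, message: bytes) -> bytes:
        # Z_SYNC_FLUSH emits everything buffered so far, so the receiver can
        # decode this message immediately, but keeps the history window.
        return self._comp.compress(message) + self._comp.flush(zlib.Z_SYNC_FLUSH)


class ConnectionDecompressor:
    """Receiver side; must see the frames in the order they were sent."""

    def __init__(self) -> None:
        self._decomp = zlib.decompressobj()

    def receive(self, frame: bytes) -> bytes:
        return self._decomp.decompress(frame)


# Hypothetical traffic: many small, similar messages on one connection.
messages = [
    b'<quote symbol="IBM" price="101.25"/>',
    b'<quote symbol="MSFT" price="98.50"/>',
    b'<quote symbol="SUNW" price="64.75"/>',
]

sender = ConnectionCompressor()
receiver = ConnectionDecompressor()
frames = [sender.send(m) for m in messages]
decoded = [receiver.receive(f) for f in frames]
```

The first frame pays for the markup in full; later frames can refer back to it, which is the effect you want when a connection carries a very large number of similar messages.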
> Jonathan Borden
> xml-dev: A list for W3C XML Developers. To post, mailto:firstname.lastname@example.org
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
> To (un)subscribe, mailto:email@example.com the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:firstname.lastname@example.org the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:email@example.com)
OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS
firstname.lastname@example.org Stephen D. Williams Senior Consultant/Architect http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999