


   RE: Binary XML

  • From: "Bullard, Claude L (Len)" <clbullar@ingr.com>
  • To: Joshua Allen <joshuaa@microsoft.com>, 'Mike Sharp' <msharp@lante.com>,xml-dev@xml.org
  • Date: Wed, 27 Sep 2000 15:17:19 -0500

From: Joshua Allen [mailto:joshuaa@microsoft.com]

Joshua writes:

>Actually, there should be no need for the user to manually create a
>compression type.  The source code for the compressor is available; you
>notice that it simply scans the XML and uses some knowledge derived from
>that analysis to "prep" the document for passing through zlib (the freely
>available linkable library implementing gzip).  So in a sense you could
>consider this "gzip on steroids".  

Ok, so there is a way to automate the path-based user compression.  How 
good is the analysis?  

I was giving this some thought in the context of schema-based XML, where one 
can know a priori the potential paths, just as one can use the structure 
to know the potential queries.  I haven't opened the code, so is the 
automation "a" means or "the" only means provided?  In other words, can 
a human user provided with a tool (say, schema-based) prepare a set of 
candidate paths to fine-tune the compression?  It might be tedious, 
but considering that the majority of transactions based on XML documents 
usually evolve into a finite family of types, it can be worth it.  Consider 
it one more example of "front-loading the pain".
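
To make the question concrete, here is a rough sketch of what hand-prepared path hints could look like. Nothing here comes from the XMill code, which I have not opened: the function name, the hint table, and the "misc" default container are all hypothetical. The idea is only that a person (or a schema-driven tool) names the paths that hold similar data, so their values land in the same container before compression.

```python
from collections import defaultdict

def apply_path_hints(values_by_path, hints, default="misc"):
    """Merge per-path text values into user-named containers.

    values_by_path: dict mapping an element path to the list of text values
                    found at that path.
    hints:          dict mapping a path to a container name, prepared by hand
                    or derived from a schema (hypothetical format).
    Paths without a hint fall into the default container.
    """
    grouped = defaultdict(list)
    for path, values in values_by_path.items():
        grouped[hints.get(path, default)].extend(values)
    return dict(grouped)

# Hypothetical order document: both date paths are steered into one container.
values = {
    "/order/shipDate": ["2000-09-27"],
    "/order/billDate": ["2000-09-28"],
    "/order/note": ["rush"],
}
hints = {"/order/shipDate": "dates", "/order/billDate": "dates"}
grouped = apply_path_hints(values, hints)
```

Whether hints like these beat the automatic analysis is exactly the open question; the sketch only shows where a human could intervene.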

>(Of course, you would have to convert the
>proprietary data type to XML to give XMill something to work with, and
>thereby be embedding some hints about the structure of the data, so you are
>right about user involvement)

In the context of my original question, that is OK because XML is what I care 
about coming out of the gate.  That I need to XSL or persist XML is fine 
because I am assuming that (perhaps naively).  I leave the issues of 
the proprietary data types to another day.  My interest is in a 
general-purpose compression for XML if WBXML is not that.

>Text-based XML was fast
>presumably because of gzip-based compression on the wire.  Since XMill
>essentially uses gzip for the heavy lifting, you could expect the
>burden to be just about the same, but with better compression.  

Ok, so this augments gzip by reorganizing/regrouping.  I note they state 
that even though compression is better, the performance is about the same, 
which contradicts the standing wisdom that better compression is bought 
at the cost of performance.
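
As I understand the reorganizing/regrouping idea, it can be sketched roughly as below. This is an illustration under my own assumptions, not XMill's actual algorithm or code: it separates the element structure from the text, collects text values by element path, and hands each group to zlib separately. It ignores attributes, tail text, and mixed content, and it writes no container format that a decompressor could use to rebuild the document.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

def regroup(xml_text):
    """Split a document into a structure stream plus per-path text containers."""
    root = ET.fromstring(xml_text)
    structure = []                   # element open/close markers, no text
    containers = defaultdict(list)   # element path -> text values at that path

    def walk(elem, path):
        here = path + "/" + elem.tag
        structure.append("<" + elem.tag)
        text = (elem.text or "").strip()
        if text:
            containers[here].append(text)
        for child in elem:
            walk(child, here)
        structure.append(">")

    walk(root, "")
    return structure, dict(containers)

def compress_grouped(xml_text):
    """Compress the structure stream, then each container (in path order)."""
    structure, containers = regroup(xml_text)
    blobs = [zlib.compress(" ".join(structure).encode("utf-8"))]
    for path in sorted(containers):
        blobs.append(zlib.compress("\n".join(containers[path]).encode("utf-8")))
    return blobs

sample = ("<log><entry><host>a</host><code>200</code></entry>"
          "<entry><host>b</host><code>404</code></entry></log>")
structure, containers = regroup(sample)
blobs = compress_grouped(sample)
```

The extra work is one pass over the document before gzip does the heavy lifting, which would fit the claim that performance stays about the same. Any gain would come from each container holding similar values.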

>I personally
>would have liked to see some COM and Java wrappers for XMill freely
>available (not just transmission; think about in-memory caching for
>expensive-to-create but infrequently used XML), but I gave up after getting
>bogged down in the spaghetti.  I think researchers write such sloppy code
>as a way to give the rest of us something to feel good about.

That is what I expect Microsoft to do or enhance.  You build frameworks. 
I apply them.  :-)  It may be that the XMill technique is general enough 
that it is worth building components from scratch for release as part of 
the Visual toolkits or open source.

Len Bullard
Intergraph Public Safety

Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h



Copyright 2001 XML.org. This site is hosted by OASIS