- From: "Bullard, Claude L (Len)" <email@example.com>
- To: Joshua Allen <firstname.lastname@example.org>, 'Mike Sharp' <email@example.com>,firstname.lastname@example.org
- Date: Wed, 27 Sep 2000 15:17:19 -0500
From: Joshua Allen [mailto:email@example.com]
>Actually, there should be no need for the user to manually create a
>compression type. The source code for the compressor is available; you
>notice that it simply scans the XML and uses some knowledge derived from
>that analysis to "prep" the document for passing through zlib (the freely
>available linkable library implementing gzip). So in a sense you could
>consider this "gzip on steroids".
Ok, so there is a way to automate the path-based user compression. How
good is the analysis?
I was giving this some thought in the context of schema-based XML, where one
can know a priori the potential paths, just as one can use the structure
to know the potential queries. I haven't opened the code, so is the
automation "a" means or "the" only means provided? In other words, can
a human user provided with a tool (say, schema-based) prepare a set of
candidate paths to fine-tune the compression? It might be tedious,
but considering that the majority of transactions based on XML documents
usually evolve into a finite family of types, it can be worth it. Consider
it one more example of "front-loading the pain".
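To make the question concrete, here is a minimal sketch of what such hand-tuned path hints might look like. It is not XMill's actual interface (I haven't opened the code); the function names `element_paths` and `assign_containers`, the glob-style patterns, and the sample document are all assumptions made up for illustration. The idea is only that a person, or a schema-driven tool, supplies a few (pattern, container) pairs, and any path without a hint falls back to a container of its own, standing in for the automatic analysis.

```python
import fnmatch
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the distinct element paths in the document, in first-seen order."""
    root = ET.fromstring(xml_text)
    seen = []

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        if path not in seen:
            seen.append(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return seen


def assign_containers(paths, hints):
    """Map each path to a container name.

    hints is a list of (glob pattern, container name) pairs; the first
    matching pattern wins. A path with no matching hint gets a container
    named after itself (a stand-in for the automatic default).
    """
    assignment = {}
    for path in paths:
        for pattern, container in hints:
            if fnmatch.fnmatchcase(path, pattern):
                assignment[path] = container
                break
        else:
            assignment[path] = path
    return assignment


# Hypothetical sample document and a single hand-written hint.
doc = (
    "<orders>"
    "<order><id>1</id><date>2000-09-27</date></order>"
    "<order><id>2</id><date>2000-09-28</date></order>"
    "</orders>"
)
hints = [("*/date", "dates")]

for path, container in assign_containers(element_paths(doc), hints).items():
    print(path, "->", container)
```

With a schema in hand, the list of paths could be generated up front rather than discovered by scanning an instance, which is where the "front-loading" would happen.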
>(Of course, you would have to convert the
>proprietary data type to XML to give XMill something to work with, and
>thereby be embedding some hints about the structure of the data, so you are
>right about user involvement)
In the context of my original question, that is OK because XML is what I care
about coming out of the gate. That I need to XSL or persist XML is fine
because I am assuming that (perhaps naively). I leave the issues of
the proprietary data types to another day. My interest is in a
general-purpose compression for XML if WBXML is not that.
>Text-based XML was fast
>presumably because of gzip-based compression on the wire. Since XMill
>essentially uses gzip for the heavy lifting, you could expect the
>burden to be just about the same, but with better compression.
Ok, so this augments gzip by reorganizing/regrouping. I note they state
that even though compression is better, the performance is about the same,
which contradicts the standing wisdom that better compression is bought
at the cost of performance.
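As I understand the description, the regrouping step amounts to something like the sketch below. This is an illustration under stated assumptions, not XMill's code: the function names `regroup` and `compress_regrouped` and the newline-joined container layout are invented here, and the sketch ignores attributes, mixed content, and values containing newlines. It separates the element structure from the text values, collects values that share a path into one container, and hands each container to zlib separately.

```python
import zlib
import xml.etree.ElementTree as ET


def regroup(xml_text):
    """Split a document into a structure stream and per-path value containers."""
    root = ET.fromstring(xml_text)
    structure = []      # sequence of element paths, in document order
    containers = {}     # path -> list of text values found at that path

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        structure.append(path)
        text = (elem.text or "").strip()
        if text:
            containers.setdefault(path, []).append(text)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return structure, containers


def compress_regrouped(xml_text):
    """Compress the structure stream and each container separately with zlib."""
    structure, containers = regroup(xml_text)
    packed = {"structure": zlib.compress("\n".join(structure).encode("utf-8"))}
    for path, values in containers.items():
        packed[path] = zlib.compress("\n".join(values).encode("utf-8"))
    return packed


# Hypothetical sample data: many records with the same shape.
rows = "".join(
    "<order><id>%d</id><date>2000-09-%02d</date></order>" % (i, 1 + i % 28)
    for i in range(1, 501)
)
doc = "<orders>" + rows + "</orders>"

plain_size = len(zlib.compress(doc.encode("utf-8")))
regrouped_size = sum(len(blob) for blob in compress_regrouped(doc).values())
print("zlib on raw XML:      ", plain_size, "bytes")
print("zlib after regrouping:", regrouped_size, "bytes")
```

The extra work is a single pass over the document before zlib runs, which would be consistent with the claim that speed stays about the same; whether the sizes actually improve depends on the data, so the comparison is printed rather than assumed.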
>would have liked to see some COM and Java wrappers for XMill freely
>available (not just transmission; think about in-memory caching for
>expensive-to-create but infrequently used XML), but I gave up after getting
>bogged down in the spaghetti. I think researchers write such sloppy code as
>a way to give the rest of us something to feel good about.
That is what I expect Microsoft to do or enhance. You build frameworks.
I apply them. :-) It may be that the XMill technique is general enough
that it is worth building components from scratch for release as part of
the Visual toolkits or open source.
Intergraph Public Safety
Ekam sat.h, Vipraah bahudhaa vadanti.
Daamyata. Datta. Dayadhvam.h