Re: [xml-dev] "Efficient XML Interchange Measurements" draft made public


From: Robin Berjon <robin.berjon@expway.fr>
To: Michael Champion <mchampion@xegesis.org>
Date: Thu, 20 Jul 2006 15:24:40 +0200

Hi Mike,

On Jul 19, 2006, at 07:21, Michael Champion wrote:

The conclusion is "At the time of writing of this first draft, it is too early to give conclusions drawn from the test results. This draft Note is being published to encourage review and comment as this work continues." I see the results of a lot of hard work, and that work seems to be accomplishing some of the unfinished business of the old Binary XML Characterization WG. Still, I don't see anything to disconfirm my a priori belief that it will be somewhere between difficult and impossible to develop a Binary XML format that covers a wide range of use cases and yields a useful degree of compression and speedup (and all the other properties).

Well, it does say that it's too early to give conclusions so why would you expect it to provide some? 1/2-)

The EXI WG has taken very seriously the feelings expressed by some that the argument needs to be made with extensive numbers and all, and that's what we're doing. There have been quite a few previous studies on the fact that EXI works, but most of them compared just a small number of candidates (often two), looked into only a set of specific aspects (often only compactness and speed), worked on specific data sets, or on specific sets of use cases. Taken all together there's enough to build more confidence than I have in many specs that have been spawned (and no I won't reopen the XML Schema discussion here :), but there's been a clear message that folks are asking for more.

So we're trying our best to provide a much more comprehensive study, but it's arduous work. Getting all candidate formats to run in the same framework is trickier business than one would think, not because it's experimental but because it's software (case in point: we've hit problems with *gzip* and I don't think that's experimental in anyone's book). We review test documents, metrics (for both documents themselves and performance), results, and a bunch of other things. It's taking time because the complexity of benchmarking on this scale is a bunch more than the sum of benchmarking individual bits, but I think it's time well spent.

The first draft of the Note that was just released provides (in our opinion) enough information that people can start commenting on the methods, and on very preliminary results. It's not meant to be conclusive yet.

So what am I missing? Is there a bunch of result data that I'm just not finding a pointer to?

The WG has a lot more data than what is currently being provided because we're cross-checking it to make sure it's right. What we're doing is keeping the Measurements Note in TR space (where publication is slow and cautious) and maintaining a more up-to-date analysis, which we tend to edit weekly, in the public group space at http://www.w3.org/XML/EXI/report. Read it while taking all caveats into account, and don't hesitate to send comments.

The latest run of the results is always pointed to from http://www.w3.org/XML/EXI/test-report. Currently the analysis does not cover processing efficiency (and more importantly, its relationship to compactness) because we're still seeing strange anomalies in the PE runs and are not completely confident in the quality of the results (a new batch ought to be available soon with fewer issues).

I don't know if I should just encourage folks to look at those pages on a regular basis, or if notifying the WG's public list (public-exi@w3.org) when there are updates is better. Suggestions welcome.

I realize this effort isn't complete, but what evidence can be gleaned from this to support an argument that the EXI WG is on track to discover or produce a spec that will meet its objectives spelled out in http://www.w3.org/2005/09/exi-charter-final.html ?

We're on track to meeting those objectives, knowing that said objectives are about demonstrating viability or the lack thereof. If there were sufficient evidence at the level which is required of us at this point we'd have published conclusions.

Specifically, they are supposed to be 6 months from a Last Call working draft of a standard EXI format; does anyone think that is likely to happen?

The benchmarks have a dual purpose: proving viability and picking the best candidate to start creating a spec from. I expect those two to happen at pretty much the same moment. If the spec for that candidate is good, getting to Last Call fast should be possible. That being said, the current thinking is that it's best to be late with the LC if it means the benchmarking is better, so some delay on that is not a huge concern. The benchmarking also cuts a lot of work out of CR since we already have a framework for testing, a test methodology, tests, etc.

--
Robin Berjon
   Senior Research Scientist
   Expway, http://expway.com/

References:
- "Efficient XML Interchange Measurements" draft made public
  - From: Michael Champion <mchampion@xegesis.org>
