What you are describing is close to what some proposals involve. You
make a few assumptions, about not being able to update for instance,
that are not necessarily true. The complete set of features
(properties), detailed fairly well in the Binary XML working group
documents, provides the design context for this.
If "we" want to end up calling this "transport XML" or "transport DOM"
or "holistically optimized DOM" (because it isn't just about transport,
it's about the holistic overhead between one in-memory representation
and another across the "wire"), that's just fine by me. You are
talking about an interchangeable, standardized format that is usable
between architectures, development environments, and various
applications, in other words a general purpose binary data format. The
problem with not signaling its semantic place in the world with "XML"
is a marketing problem and potentially a control and compatibility
problem. A number of us want it to be associated with XML so that we do
have equivalence to a high degree and an understanding of
applicability. If it evolves independently, it may diverge. (That's
not to say that I don't think there are additional semantics that make
sense primarily in a binary world.)
BTW, the "cache them if they are standardized" directly maps to the
Deltas property, something that I advocate. Additionally, you may want
to think about the possibility that you could create a format similar to
what you describe that can be randomly accessed incrementally without
first parsing or decoding. An incremental random access format will
beat any parser/decoder on total time of access to a few fields of a
large, complicated instance.
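To make the random-access point concrete, here is a minimal sketch in Python. It is not any proposal's actual format: the record layout (a fixed header holding the name length, text length, and total size of the encoded children) and the names HEADER, encode, and find_text are all assumptions made up for illustration. The point it shows is that length prefixes let a reader step over whole subtrees it does not care about without decoding them.

```python
import struct

# Hypothetical record header: name length (u16), text length (u32),
# total size in bytes of all encoded children (u32).
HEADER = struct.Struct("<HII")

def encode(name, text="", children=()):
    """Encode one element; each child is a (name, text[, children]) tuple."""
    name_b = name.encode("utf-8")
    text_b = text.encode("utf-8")
    kids = b"".join(encode(*child) for child in children)
    return HEADER.pack(len(name_b), len(text_b), len(kids)) + name_b + text_b + kids

def find_text(buf, path, start=0, end=None):
    """Return the text of the first element matching path, or None.

    Siblings whose name does not match are skipped by adding up the
    lengths in their header; their text and children are never read.
    """
    end = len(buf) if end is None else end
    pos = start
    while pos < end:
        name_len, text_len, kids_len = HEADER.unpack_from(buf, pos)
        name_start = pos + HEADER.size
        text_start = name_start + name_len
        kids_start = text_start + text_len
        next_pos = kids_start + kids_len
        if buf[name_start:text_start] == path[0].encode("utf-8"):
            if len(path) == 1:
                return bytes(buf[text_start:kids_start]).decode("utf-8")
            found = find_text(buf, path[1:], kids_start, next_pos)
            if found is not None:
                return found
        pos = next_pos  # jump over this record and its whole subtree
    return None

doc = encode("order", "", [
    ("customer", "", [("name", "Ada")]),
    ("item", "", [("sku", "X1"), ("qty", "3")]),
])
print(find_text(doc, ["order", "item", "qty"]))  # prints 3
```

Looking up order/item/qty here reads the header and name of "customer" but never touches what is inside it. A real format would replace the names with dictionary codes, as in the header Rick describes, and the same skipping logic would still apply.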
Then consider whether a format could support relatively efficient
updates without parsing / decoding / encoding / serialization. I can
prove easily that a format could support that; it is more difficult to
verbally prove compactness at the same time. As I've said, it is
important to consider what constraints you need to support in what
combinations and find ways to unify the solution. This is completely
different from adding bells and whistles.
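As a rough illustration of the update claim, here is one way it could work, sketched in Python under assumptions of my own: each value lives in a slot with a declared capacity and a used length, so a new value that fits is written over the old one in the buffer. The names SLOT, make_slot, read_slot, and update_slot are hypothetical, not taken from any proposal. The padding is also exactly where the tension with compactness shows up.

```python
import struct

# Hypothetical slot header: capacity in bytes (u16), bytes currently used (u16).
SLOT = struct.Struct("<HH")

def make_slot(value, capacity):
    """Build a slot holding value, padded with zero bytes up to capacity."""
    data = value.encode("utf-8")
    if len(data) > capacity:
        raise ValueError("value does not fit in the requested capacity")
    return SLOT.pack(capacity, len(data)) + data + b"\x00" * (capacity - len(data))

def read_slot(buf, offset):
    """Read the current value of the slot that starts at offset."""
    _capacity, used = SLOT.unpack_from(buf, offset)
    start = offset + SLOT.size
    return bytes(buf[start:start + used]).decode("utf-8")

def update_slot(buf, offset, value):
    """Overwrite the slot in place; return False if the value is too large.

    Nothing else in the buffer moves, so no other field is parsed,
    decoded, or re-serialized.
    """
    data = value.encode("utf-8")
    capacity, _used = SLOT.unpack_from(buf, offset)
    if len(data) > capacity:
        return False  # caller would need some fallback, such as a rewrite
    SLOT.pack_into(buf, offset, capacity, len(data))
    start = offset + SLOT.size
    buf[start:start + len(data)] = data
    return True

buf = bytearray(make_slot("pending", 16) + make_slot("42", 8))
update_slot(buf, 0, "shipped")
print(read_slot(buf, 0))  # prints shipped
```

The update touches only the bytes of one slot, and the buffer keeps its length. The cost is the unused capacity carried on the wire, which is why showing updatability and compactness together takes more than a quick argument.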
I think that most worries about updating other specs are red
herrings. For most of them, you can just declare XML 1.x equivalence for
operations involving those specs. That doesn't even mean you have to
convert to XML 1.x to be conforming, only that you operate "as if" that
were the case. (A la ANSI C standard in places.)
sdw
Rick Marshall wrote:
> my 2c on this (again...)
>
> 1. there seems to be use cases where something other than "text" xml
> would be useful while still preserving the infoset
> 2. xml, per se, doesn't know about types (xsd does...) so anything
> that assumes "x" is a number has to be lossy
> 3. as far as i can tell most use cases are about transporting xml
> inside an application. eg sending to mobile phones, printers, etc - in
> which case there are better ways to do this
> 4. instead of a "binary" xml format - what about a "dom transport"
> format? this could be dense and lossless:
>
> header - element dictionary, attribute dictionary, processing
> instructions (can't avoid sending these as text)
> body - rle encoded infoset - element code/num attributes/attribute 1
> code/length/data/..../element length/data - and it's recursive (do you
> want the bnf?)
>
> in terms of bits on the wire this wouldn't have to be based on 8 bits, it
> could use the good old msb=1 coding to get better compression
>
> yes, this is a variation on gzip etc. the important difference is that
> the message is parsed before transmission and as a transmission ONLY
> format it does not have to allow for updates etc. it can be fed
> straight into a dom or sax tool without parsing overhead or decoded
> back to xml for storage manipulation. as a byproduct our mobile
> friends win on transport bits and processing cycles. a closed
> application could potentially even drop the headers (or cache them if
> they are standardised).
>
> now there's a market for:
>
> 1. standard dom representation (?)
> 2. dom transport encoder
> 3. dom transport decoder
>
> and in true open standard tradition companies can produce middleware
> for each of these.
>
> there are a few flies in the ointment - mostly to do with java, but we
> can work on those. (just for the record i live in c world)
>
> rick
>
> ps what i'm really saying is the effort should NOT be in binary xml
> (that implies update etc and a whole set of parallel or modified ws-*
> etc), but i think there is a compelling case for an xml
> transport format.
>
> Michael Champion wrote:
>
>> On Apr 7, 2005 12:37 PM, Elliotte Rusty Harold
>> <elharo@metalab.unc.edu> wrote:
>>
>>
>>
>>> Far too much effort has been spent shouting supposed self-evident truths
>>> about how much faster/smaller/sexier binary formats are. Little in the
>>> way of evidence has been produced,
>>
>>
>> As someone who has been (personally) receptive to the idea of "binary
>> XML" technology and possibly even a single standard IF it could cover
>> a wide range of cases, I must say that this was my biggest
>> disappointment when reading the XBC final document. They developed
>> decent use cases, attributes to measure, and measurement
>> methodologies, and collected a number of prototype formats and
>> implementations, but didn't put them all together and generate some
>> numbers. Rather than those Yes/No entries in the table for how the
>> formats met the criteria, someone could actually run tests and plug in
>> real numbers. I can only assume that those who know what the numbers
>> would be also know that they do not support the case that a single
>> "binary XML" format can be both small enough and fast enough (not to
>> mention the 15 other attributes!) to make much of a difference.
>> Otherwise, why not publicize them and address the concerns that the
>> TAG expressed?
>>
>> I fervently hope that the W3C doesn't move forward with a follow-on WG
>> until that quantitative evidence is available to all concerned. Until
>> then, we're back where we started - we know that some formats are
>> dramatically faster OR smaller than XML for some set of use cases, but
>> we don't have any reason to believe that one format can do it all
>> across use cases, or even that one could hit a reasonable 80/20 point
>> compared with XML. The idea of forming a WG to do some computer
>> science here, trusting that an acceptable size/speed tradeoff is
>> possible while maintaining XML API/tools/datamodel/etc. compatibility,
>> is at best implausible at this point.
>>
>>
>>
>>> Personally, I'm not convinced that there isn't another factor of ten
>>> performance gain to be had in the world of real XML parsing, though
>>> doing that will probably require ditching DOM and SAX in favor of a more
>>> performance tuned API. Still, I doubt we've yet hit the end of the line
>>> when it comes to XML parsing algorithms. I could well be wrong about
>>> that, but it would be truly ironic if two weeks after binary XML goes to
>>> REC, some grad student somewhere releases a text parser that beats the
>>> pants off the binary parsers.
>>>
>>
>>
>> Noah Mendelsohn made that point in a very compelling way at the W3C
>> binary workshop in September 2003. He reiterated it on the TAG list
>> the other day, noting that the mainstream Java XML tools were designed
>> for conformance, not performance, and serious brainpower is only now
>> beginning to be thrown at the problem of efficiently parsing XML and
>> conveniently consuming the result.
>>
>>
>>
>>> I suspect this would be more likely to happen if companies like Sun
>>> devoted their brain power to XML parsing algorithms rather than
>>> inventing new formats.
>>>
>>
>>
>> I think it's important to focus on people/problems/projects and not
>> who they work for. MS as a whole is extremely skeptical about "Binary
>> XML" standardization, but nobody has decreed that Binary XML is Evil
>> and is off the radar as a solution. Sun as a whole seems receptive to
>> standardization, but I doubt if anyone there has decreed that Binary
>> XML is the One True Path and parser optimization is pointless. Almost
>> everyone I talk to on both sides of the philosophical divide (or the
>> .NET / Java divide) has an appreciation of the dilemmas, and I'll bet
>> there are material Day Job rewards waiting for clever solutions to
>> actual customer problems, whether they involve new formats, new
>> algorithms/APIs, finding an efficient subset of XML that hits the
>> 99/01 point for their use cases, whatever. The most important thing
>> IMHO is to not prematurely standardize on something we will all regret
>> in a few years, but not to prematurely reject options based on
>> preconceptions of one sort or another either.
>>
>> I do think that the burden of proof is on those who would break things
>> that actually work without compelling evidence that would make us much
>> better off down the road. To be fair to the XBC people, until we
>> really and truly see XML 1.0 working for the use cases that they laid
>> out, I'm planning to keep an open mind about how to make the XML
>> family of technologies work better for all the users and potential
>> users. For now, various "binary XML" technologies definitely have
>> their place in specialized domains and scenarios, but nobody has come
>> anywhere near demonstrating that a "binary XML" standard would do more
>> good than harm. For that matter, I don't see how it is plausible to
>> believe that a consensus can be reached on a format that trades off
>> the needs of the wireless people for cheap compression and the needs
>> of the enterprise messaging people for processing speed.
>>
>> -----------------------------------------------------------------
>> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>> initiative of OASIS <http://www.oasis-open.org>
>>
>> The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>> To subscribe or unsubscribe from this list use the subscription
>> manager: <http://www.oasis-open.org/mlmanage/index.php>
>
--
swilliams@hpti.com http://www.hpti.com Per: sdw@lig.net http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw