
Re: "Binary XML" proposals

"W. E. Perry" wrote:
> The savings to be realized through the use of a binary format are premised upon
> parsing the XML text only once and thereafter passing around or storing the binary
> encoded output. Such a mechanism demands that every user of that data expect, or
> accept, the identical output of that parse--effectively, a canonical rendering. It is
> only such unanimity which would permit every user to accept the product of a parse
> performed by any of them. In the rapidly growing internetworked universe, it is
> precisely that unanimity which we cannot reasonably expect, because the fundamental
> premise of accepting node-to-node opacity as the price of universal node-to-node
> addressability is the exchange that underlies the internetwork topology. It is our
> good fortune that XML appeared just as the number of these mutually-opaque but
> mutually-addressable nodes is furiously increasing. I argue that the reasonable
> understanding of XML acknowledges that every use of an XML document begins with a
> fresh parse of that document in the context of that use. That parse is not required to

Close, but I disagree that it's a fresh 'parse' per se.  Rather, 'parse'
could be replaced by 'traverse', which, when needed, would allow a buildup
of domain-specific metadata.  In addition, in some environments it
actually makes sense to add that metadata or interpretation to the data
itself.  A really great example I've seen of this is where a 'log' of
operations on a 'message' is actually added to the message.  In other
environments, the metadata isn't needed at all: in the case of simple
API usage, the 'message' carries a simple hierarchy of protocol members
that can often be used directly.  In the latter case, a processing step
may receive a very large XML document (I have experience with a 700+
node user state for a web application) and only be interested in a few
elements on a particular 'pass'.  Direct access to these without any
work for the other 700 or so is an obvious win.

Some applications might not benefit from bsXML, but many others would. 
In API/protocol/database use, you often just want: zip =
employee.homeaddress.zip, etc.
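
As a rough illustration of that kind of direct access, here is a minimal
Python sketch.  The layout is purely hypothetical and is not the bsXML
format: a small index of (path, offset, length) entries sits in front of
the values, so a reader can pull out one field without decoding any of the
other values.  The names pack_record and get_field are invented for this
example.

```python
import struct

HEADER = "<HI"  # assumed header: field count (u16), index size in bytes (u32)


def pack_record(fields):
    """Pack {path: text} as an index of (path, offset, length) followed by the values."""
    index = b""
    data = b""
    for path, value in fields.items():
        key = path.encode("utf-8")
        val = value.encode("utf-8")
        # Each index entry: key length, key bytes, value offset, value length.
        index += struct.pack("<H", len(key)) + key
        index += struct.pack("<II", len(data), len(val))
        data += val
    return struct.pack(HEADER, len(fields), len(index)) + index + data


def get_field(buf, path):
    """Return one value by walking only the index; other values are never decoded."""
    count, index_len = struct.unpack_from(HEADER, buf, 0)
    pos = struct.calcsize(HEADER)
    data_start = pos + index_len
    wanted = path.encode("utf-8")
    for _ in range(count):
        (key_len,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        key = buf[pos:pos + key_len]
        pos += key_len
        offset, length = struct.unpack_from("<II", buf, pos)
        pos += 8
        if key == wanted:
            start = data_start + offset
            return buf[start:start + length].decode("utf-8")
    raise KeyError(path)


record = pack_record({
    "employee.name": "Pat Example",
    "employee.homeaddress.street": "1 Example St",
    "employee.homeaddress.zip": "12345",
})
zip_code = get_field(record, "employee.homeaddress.zip")
```

A real format would probably use a sorted or hashed index rather than a
linear scan, but the point stands either way: the cost of reaching one
field does not depend on decoding the rest of the document.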

Even in a document environment, loading a large XML document into an
editor is time-consuming and memory-intensive, since all the data is more
than duplicated to build the data structure used for actual processing.

> instantiate XML as XML--the document itself is already that instance--but to
> instantiate the particular objects which that specific use of the XML document expects
> and requires. Because of the uniqueness of the process which requires them, and of the
> unique circumstances at every execution of that process, the only way to effect the
> precise objects required is to instantiate them afresh (which, BTW, every use of the
> binary encoded XML representation would require anyway). The distinction is that XML

Wrong, at least in my proposal (in progress).  Everyone is thinking too
narrowly here.  Creating an analogous hierarchy is only required when
you have extensive metadata or operational data that you explicitly
want kept out of the main tree.  Even there, I have a 'DOM Delta'
solution that makes this cheap.
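
The message does not describe how 'DOM Delta' works, so the following
Python sketch is only one possible reading of the general idea, not the
proposal itself: annotations live in a side table keyed by node path, and
the main tree is neither copied nor modified.  The class name Overlay and
its methods are invented for this example.

```python
class Overlay:
    """Side table of per-node annotations; the base tree is never copied or changed."""

    def __init__(self, base):
        self.base = base   # the shared main tree (nested dicts here, as an assumption)
        self.notes = {}    # path tuple -> dict of annotations for that node

    def annotate(self, path, **meta):
        """Attach metadata to the node at `path` without touching the base tree."""
        self.notes.setdefault(tuple(path), {}).update(meta)

    def lookup(self, path):
        """Return (node, annotations) for `path`; annotations default to empty."""
        node = self.base
        for key in path:
            node = node[key]
        return node, self.notes.get(tuple(path), {})


base = {"employee": {"homeaddress": {"zip": "12345"}}}
view = Overlay(base)
view.annotate(["employee", "homeaddress", "zip"], validated=True)
value, meta = view.lookup(["employee", "homeaddress", "zip"])
```

Only annotated nodes cost anything extra; a consumer that needs no
metadata reads the base tree directly and pays nothing for the overlay.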

> processing requires by its nature that what drives that instantiation be the parsing
> of XML, which is to say the lexical handling and, from it, the unique sememic
> interpretation of XML syntax. You may choose to drive that instantiation off of
> something other than XML syntax, but it is not then XML processing, and what you lose,
> most significantly, in doing that is the ability for the same text to be understood
> and usefully processed at the same time as something very different, but
> simultaneously the valid basis for a transaction between, utterly dissimilar users.

I believe I understand your reasoning, but I disagree on several points:

Traversing a binary equivalent to text based XML is still semantically
equivalent to 'XML processing'.

Whether tags, attributes, text, etc. are framed by text markup or a
binary structure is just an extreme form of syntactic sugar.  As a
minimal example, replacing <sale>45</sale> with [8][4]sale[2]45 (each
bracketed number is a length count) would 'parse' identically.  Of
course, at this minimal level, it's not worth the headache.
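
To make the length-count example concrete, here is a small Python sketch.
It assumes each count is a single byte and that the leading count covers
everything after it; neither detail is specified above, and the function
names are invented for this example.

```python
from typing import Tuple


def encode_element(tag: str, text: str) -> bytes:
    """Frame one element as [total][len(tag)]tag[len(text)]text, one byte per count."""
    tag_b = tag.encode("ascii")
    text_b = text.encode("ascii")
    body = bytes([len(tag_b)]) + tag_b + bytes([len(text_b)]) + text_b
    return bytes([len(body)]) + body


def decode_element(buf: bytes) -> Tuple[str, str]:
    """Recover (tag, text) by following the counts; no scanning for '<' or '>'."""
    total = buf[0]
    body = buf[1:1 + total]
    tag_len = body[0]
    tag = body[1:1 + tag_len].decode("ascii")
    text_len = body[1 + tag_len]
    text_start = 2 + tag_len
    text = body[text_start:text_start + text_len].decode("ascii")
    return tag, text


framed = encode_element("sale", "45")   # bytes 8, 4, 's','a','l','e', 2, '4','5'
tag, text = decode_element(framed)
```

The decoder yields the same tag and text a text parser would extract from
<sale>45</sale>; the only difference is that it jumps by the counts
instead of scanning for delimiters.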

Binary structures can be designed to be efficiently handled in multiple
environments.  Some of my constraints are specifically targeted at
allowing efficient direct processing in Java, for instance.

> Respectfully,
>
> Walter Perry

sdw@lig.net  http://sdw.st
Stephen D. Williams
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax