Tahir Hashmi wrote:
> Robin Berjon wrote:
> In the first group, there could be a subgroup that doesn't need binary
> markup but may use it simply because it can, without affecting the way
> its applications work. That's the group that doesn't need human
> read/write-ability for its XML docs - the group of WYSIWYG Office
> suites, XML-based instant messaging protocols and so on.
I would quite seriously oppose using binary infosets when you don't need them.
It adds complexity to the system and removes a variety of XML's features. Office
suites can (and in fact do) use zip (if only because it doubles as a packaging
format, which is very convenient for attached files such as images). XML IM
either needs binary infosets for performance reasons, or it doesn't and
shouldn't use them.
> Consider this: the application is only interested in strings for date
> but the schema designer specified a date type because it is the Right
> Thing(TM) for a date (so that the schema need not be changed if at some
> point of time the same application or another application does get
> interested in the value).
>
> In a binary representation, the processor will decode the variable
> length binary value to arrive at the number of seconds since epoch,
> then re-construct a string for the application. Note that the
> processor will be *synthesizing* a string that could be read straight
> off the document.
>
> This approach would be better only if the benefits of saved bandwidth
> are greater than the cost of synthesizing the date string. And we
> can't assume that limited bandwidth is *always* going to be the
> motivating factor for using binary markup.
That's why in BinXML you can specify how you encode your data. In the case you
cite one would simply ask that the xs:fooDate type use the UTF-8 codec.
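The trade-off under discussion can be sketched concretely. The snippet below is a hypothetical illustration, not BinXML's actual API: it contrasts a text codec, where the date string is read straight off the document, with a binary codec, where the processor unpacks an integer (seconds since epoch) and must synthesize the string the application asked for. The function names and the 4-byte big-endian layout are assumptions made for the example.

```python
import struct
import time

def decode_text_date(payload: bytes) -> str:
    # Text codec: the string is read straight off the document,
    # no synthesis needed.
    return payload.decode("utf-8")

def decode_binary_date(payload: bytes) -> str:
    # Binary codec: unpack the integer, then synthesize a string
    # for an application that only wants text.
    seconds, = struct.unpack(">I", payload)
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(seconds))

text_payload = b"2004-02-26T12:00:00Z"          # 20 bytes on the wire
binary_payload = struct.pack(">I", 1077796800)  # 4 bytes on the wire

# Same logical value either way; the binary form saves 16 bytes but
# pays for string synthesis on every access.
assert decode_text_date(text_payload) == decode_binary_date(binary_payload)
```

Whether the 16 saved bytes are worth the synthesis cost is exactly the question; letting the schema author pick the codec per type keeps both options open.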
> The particular example I gave is illustrative only and as stated
> earlier, I'm not against type-awareness. I'm simply being wary of how
> much flexibility might possibly be lost, and in some cases
> computation wasted, in the quest for a super-optimized binary
> encoding.
Again, if you don't want something encoded, just ask the application not to
touch it :)
>>As for your remark on the speed of decompaction, note that you may be right for
>>a naive implementation of the same thing but there's compsci literature out
>>there on making such tasks fast.
>
> Well yes, naivete may lead to bad design. The point is that the more
> logic that goes into decoding a format, the higher the bar for small
> devices is raised. While one can have small non-validating SAX parsers
> for XML, the size of a binary format parser may go up since it would
> have to know about synthesizing dates from integers, deducing document
> structure from the schema, etc., besides the indispensable passing of
> strings around. The encoding scheme should require the least possible
> context information and minimal parsing logic to be accessible
> there. Hope I'm able to explain myself better this time!
It all depends on what you need. I totally agree that there is no
one-size-fits-all, but I do believe that it is very much possible to produce a
flexible format that can be configured in a variety of ways without it losing
internal coherence. If you want a tiny and ultra-fast decoder you can drop
support for encoding the more complex types; if you want a slightly larger
decoder but the smallest possible payload, you add codecs to encode the content
optimally.
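A configurable format of that kind can be sketched as a pluggable codec table. This is an assumed design, not BinXML's actual mechanism: the string codec is always present (content passes through untouched), and a fuller profile registers type-aware codecs that shrink the payload at the cost of a larger decoder. All names here are hypothetical.

```python
import struct
import time

class Decoder:
    def __init__(self):
        # The string codec is always present; everything else is optional,
        # so a constrained device ships only this table entry.
        self.codecs = {"xs:string": lambda b: b.decode("utf-8")}

    def register(self, type_name, codec):
        # A richer profile plugs in extra codecs for smaller payloads.
        self.codecs[type_name] = codec

    def decode(self, type_name, payload: bytes) -> str:
        # Unknown types fall back to the string codec, i.e. the content
        # is left untouched rather than the decoder failing.
        codec = self.codecs.get(type_name, self.codecs["xs:string"])
        return codec(payload)

# Tiny profile: dates arrive as text and pass straight through.
tiny = Decoder()

# Full profile: a date codec trades decoder size for payload size.
full = Decoder()
full.register(
    "xs:date",
    lambda b: time.strftime("%Y-%m-%d", time.gmtime(struct.unpack(">I", b)[0])),
)

assert tiny.decode("xs:date", b"2004-02-26") == "2004-02-26"
assert full.decode("xs:date", struct.pack(">I", 1077796800)) == "2004-02-26"
```

The same document type can then be served to both profiles; only the codec table, not the format, changes.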
--
Robin Berjon <robin.berjon@expway.fr>
Research Engineer, Expway http://expway.fr/
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488