Thanks, and interesting. I had actually noticed that converting Word to XML
eats up disk space, which is especially true for large Word documents (we
ignored that point in previous emails).
I guess I agree with the need to define a standard binary XML format, and I
have heard that it has generated enough interest that a focus group has been
created for it within the W3C, if I am right.
Good reasoning, and thanks for the mail.
On Tue, 18 Nov 2003, Bob Wyman wrote:
> The easiest way to encourage them not to define a *custom*
> binary format is to make it easy for them to use a *standard* binary
> format that interchanges well with code designed to process XML. Of
> course, what I'm suggesting is that by supporting the use of ASN.1
> defined encodings we can get binary/XML interchange and preserve our
> investment in tools like SAX, DOM, etc. This is because the ASN.1
> defined encodings (BER, PER, etc.) provide *lossless* encoding of XML
> data in highly compact binary forms. (I wouldn't be surprised, for
> instance, if a typical Word document ended up being less than 10% of
> the original size if encoded in PER...)
> I must say that I'm very concerned about the impact that Word
> documents in XML are going to have on the whole "XML movement." While
> people have grumbled about XML's size for a long time, most people
> haven't been exposed to the elephantine monsters that result when you
> convert Word to XML. ("The Word document that ate my disk...") This is
> going to start a whole series of companies and projects that will
> "address the problem of XML storage compression."... Unfortunately,
> most of those projects will be describing XML as a bug that needs to
> be fixed. The "bad press" for XML will not be pleasant and may give
> folk like Microsoft the cover they need to define "more efficient"
> *custom* formats in the future. I personally feel that it is vital to
> the "XML movement" that we be aggressive and accept existing,
> standard, ASN.1 defined encodings as binary peers of XML in order to
> address the storage space issue immediately.
> Microsoft has defined their schema using WXS. However, since
> the mapping from WXS to ASN.1 is defined (X.694), that means that by
> releasing their WXS, they have defined more than the XML schema for
> Word -- they have also defined the ASN.1 schema for Word. At this
> point, providing compressed binary support for Word documents is a
> trivial matter.
> There is actually an interesting opportunity here... If one or
> more of the recent "open office" products were to provide
> XML-compatible ASN.1 encoding support, then they would be able to
> argue that they have all of the benefits of the XML encodings that
> Microsoft supports while also addressing the needs of customers by
> providing highly compressed binary encodings. So far, all the
> Office alternatives have been limited to arguing that they are "just
> as good" as Microsoft's Office. If they were to embrace compact binary
> encodings that are XML-compatible, then they would be able to argue
> that they are "better."
> In summary: If you want to avoid *custom* binary formats,
> ensure that *standard* binary formats are available and supported.
> bob wyman
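To make the size argument in the quoted message a little more concrete, here is a rough sketch of why a schema-informed binary encoding can be much smaller than the equivalent XML while staying lossless. It is NOT BER or PER, and it does not use any ASN.1 library; it is a hand-rolled length-prefixed layout, and the `doc`/`run` schema, the `bold` and `size` attributes, and all function names are made up for illustration. The only point is that when both sides share the schema, tag and attribute names never need to be written out.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical schema: <doc> holds <run bold="0|1" size="N">text</run> items.
# Each run is modelled as a (bold: bool, size: int 0..255, text: str) tuple.

def to_xml(runs):
    """Serialize runs as XML text (UTF-8 bytes)."""
    root = ET.Element("doc")
    for bold, size, text in runs:
        run = ET.SubElement(root, "run", bold=str(int(bold)), size=str(size))
        run.text = text
    return ET.tostring(root, encoding="utf-8")

def to_binary(runs):
    """Serialize runs in a compact layout; names are implied by the schema."""
    out = [struct.pack(">H", len(runs))]               # run count, 2 bytes
    for bold, size, text in runs:
        data = text.encode("utf-8")
        # 1 byte bold flag, 1 byte size, 2 byte text length, then the text
        out.append(struct.pack(">BBH", int(bold), size, len(data)))
        out.append(data)
    return b"".join(out)

def from_binary(blob):
    """Decode the compact layout back into the same list of runs."""
    (count,) = struct.unpack_from(">H", blob, 0)
    offset = 2
    runs = []
    for _ in range(count):
        bold, size, length = struct.unpack_from(">BBH", blob, offset)
        offset += 4
        text = blob[offset:offset + length].decode("utf-8")
        offset += length
        runs.append((bool(bold), size, text))
    return runs

# Sample data: 100 short runs, alternating bold and plain.
runs = [(i % 2 == 0, 12, "word%d" % i) for i in range(100)]
xml_bytes = to_xml(runs)
bin_bytes = to_binary(runs)

print("XML bytes:   ", len(xml_bytes))
print("binary bytes:", len(bin_bytes))
print("round trip is lossless:", to_xml(from_binary(bin_bytes)) == xml_bytes)
```

The ratio you get from a toy like this says nothing about what PER would actually achieve on a real Word document; it only shows the mechanism (markup overhead disappears, content survives the round trip).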