Vinci, XTalk



This is another data point on the never-ending binary XML discussion,
with a bit on simplicity:

http://www.almaden.ibm.com/cs/people/bayardo/vinci/vinci.html

The Appendix defines XTalk:

"XTalk is a pseudo-binary XML format intended to make the XML parsing
task even more simple than what was originally envisioned by the XML
creators. It is not, however, intended to be a replacement for general
XML documents. Indeed, we expect textual XML to be the mainstay of
document exchange. XTalk is best used as an intermediate XML
representation exchanged by high-performance, distributed services that
run on anything and everything from the hand-held to the mainframe. The
representation may also be suitable for storage in persistent XML
stores.

We realize that any proposal for a non-textual document representation
may be met with considerable resistance, as it deviates from the primary
human readability criteria that has motivated specifications such as
SGML, HTML and XML from the start. Nevertheless, the need for
standardizing on a non-textual representation has been expressed
numerous times on W3C discussion lists and elsewhere, including one in
which Tim Berners-Lee has expressed support for the idea [B-L99]. XTalk
attempts to deviate from the human readability criteria as little as
possible by representing only structural aspects of the document in
binary, and leaving all data components in the standard UTF-8 character
format."
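
To make the "structure in binary, data in UTF-8" idea concrete, here is
a minimal sketch in Python. It is NOT the XTalk wire format, which is
defined in the Vinci appendix; the one-byte node markers, the 4-byte
big-endian lengths, and the (tag, attrs, children) tuple model are all
assumptions made up for illustration.

```python
import struct

# Hypothetical one-byte node markers (not taken from the XTalk appendix).
ELEMENT, TEXT = b"E", b"T"

def _pack_str(s):
    """Length-prefixed UTF-8 string: 4-byte big-endian length, then bytes."""
    data = s.encode("utf-8")
    return struct.pack(">I", len(data)) + data

def _read_u32(buf, pos):
    return struct.unpack_from(">I", buf, pos)[0], pos + 4

def _read_str(buf, pos):
    n, pos = _read_u32(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n

def encode(node):
    """node is a str (text) or a (tag, attrs_dict, children_list) tuple."""
    if isinstance(node, str):
        return TEXT + _pack_str(node)
    tag, attrs, children = node
    parts = [ELEMENT, _pack_str(tag), struct.pack(">I", len(attrs))]
    for key, value in attrs.items():
        parts += [_pack_str(key), _pack_str(value)]
    parts.append(struct.pack(">I", len(children)))
    parts += [encode(child) for child in children]
    return b"".join(parts)

def decode(buf, pos=0):
    """Return (node, next_position). No scanning for '<' or escapes."""
    kind = buf[pos:pos + 1]
    if kind == TEXT:
        return _read_str(buf, pos + 1)
    if kind != ELEMENT:
        raise ValueError("unknown node marker at offset %d" % pos)
    tag, pos = _read_str(buf, pos + 1)
    n_attrs, pos = _read_u32(buf, pos)
    attrs = {}
    for _ in range(n_attrs):
        key, pos = _read_str(buf, pos)
        attrs[key], pos = _read_str(buf, pos)
    n_children, pos = _read_u32(buf, pos)
    children = []
    for _ in range(n_children):
        child, pos = decode(buf, pos)
        children.append(child)
    return (tag, attrs, children), pos

doc = ("msg", {"id": "7"}, ["héllo", ("b", {}, [])])
roundtrip, _ = decode(encode(doc))
```

Because every string carries its length up front, the reader never has
to scan for delimiters or handle entity escapes, which is presumably
where much of the claimed simplicity comes from. The text payloads stay
readable in a hex dump.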

They've done some interesting homework, and built on the XPath model,
with some motivation from canonical XML.  They also found XML more
complex than they liked:

"Rumor has it that XML was intended to be a variant of SGML simplified
to the point where any DPH ("Desperate Perl Hacker") could write a
parser for it over one weekend. While XML is undeniably far simpler than
SGML, the reality is that it remains of sufficient complexity to make
parser implementation difficult -- so much so that large open source
efforts are dedicated to XML parser implementation [A00]. Another
side-effect of this complexity is that parsing XML requires significant
computational overhead, at least compared to the overhead of simple
services which may wish to communicate by exchanging XML documents."

Their results:
"To summarize, the main advantages of XTalk over XML are speed, size,
and simplicity. Speed-wise, we have found our XTalk parsers to provide
at worst a 3 times speed up over a hand-optimized, bare-bones XML
parser. In practice, when compared to full-blown XML parsers such as
Xerces [A00], the speedup is closer to 10 times or more. Size-wise, our
XTalk parsers are approximately two orders of magnitude smaller than
comparable XML parsers, and memory footprints a factor of 4 times
smaller. For example, in PalmOS, our basic client, server, VinciFrame
document model and XTalk conversion library has a size of only 13K."

Interesting reading.  Thanks to Chris Genly for passing me the URL.

Simon St.Laurent
http://www.simonstl.com