RE: [xml-dev] Is it time for the binary XML permathread to start up again?
- From: "Costello, Roger L." <costello@mitre.org>
- To: <xml-dev@lists.xml.org>
- Date: Fri, 20 Jul 2007 14:16:45 -0400
Hi Folks,
I have collected the various comments and organized them into the
appropriate slots.
When sending an XML document across the wire, here are the main
choices:
1. Send the XML document as is, as ASCII text, without any compression
or other alterations.
Adv: Processable by any XML tool and readable in even the most basic
text editor.
Easy not just for human reading, but also for debugging.
Disadv: Lengthy and verbose. Perhaps OK for machine-to-human use, but
in machine-to-machine use and in demanding apps, such as real-time
systems or long docs, the inefficiency surfaces.
2. Compress the XML document using a compression tool such as WinZip or
Bzip, and then send the compressed document.
Comment: a compressed format does not occupy the same space, or serve
the same purpose, as a binary format. A long XML text document can
still be long in binary format, yet a binary parser can parse it
efficiently, whereas a compressed format aims solely at compression
efficiency regardless of parsing difficulty.
Adv: it helps in sending large files quickly by shrinking the size of
the file.
Disadv: overall processing from sender (encoding side) to receiver
(decoding side) takes longer (for a given constant CPU speed) due to
the extra overhead of compression and decompression.
WinZip and Bzip reduce the size of the document on the wire but
usually increase processing time.
Zip can be used with any XML document.
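The size-versus-time tradeoff described above can be measured directly.
The following is a minimal sketch, assuming Python's standard gzip and
bz2 modules as stand-ins for WinZip and Bzip; the sample document and
its element names are made up for illustration, and real numbers will
vary with the document and the machine.

```python
import bz2
import gzip
import time


def build_sample_xml(n_items: int = 2000) -> bytes:
    """Build a repetitive XML document (hypothetical element names)."""
    items = "".join(
        f'<item id="{i}"><name>widget-{i}</name><price>9.99</price></item>'
        for i in range(n_items)
    )
    return f'<?xml version="1.0"?><catalog>{items}</catalog>'.encode("utf-8")


def measure(label, compress, decompress, data: bytes) -> dict:
    """Time one compress + decompress round trip and record the packed size."""
    start = time.perf_counter()
    packed = compress(data)
    unpacked = decompress(packed)
    elapsed = time.perf_counter() - start
    assert unpacked == data  # compression must be lossless
    return {"codec": label, "bytes": len(packed), "seconds": elapsed}


xml_bytes = build_sample_xml()
results = [
    measure("gzip", gzip.compress, gzip.decompress, xml_bytes),
    measure("bz2", bz2.compress, bz2.decompress, xml_bytes),
]

print(f"raw: {len(xml_bytes)} bytes")
for r in results:
    print(f"{r['codec']}: {r['bytes']} bytes, {r['seconds']:.4f} s round trip")
```

The round-trip time is the extra processing that the sender and
receiver pay in exchange for fewer bytes on the wire.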
3. Use the compression capabilities inherent in HTTP (gzip content
encoding, i.e. http+gzip)
Adv: The compression is done automatically and pretty much every web
server and client now supports this.
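As a rough illustration of what http+gzip does behind the scenes, here
is a minimal sketch without any networking. The helper names
encode_response and decode_response are hypothetical, and the
negotiation is simplified (it ignores q-values in Accept-Encoding);
real servers and clients handle this for you.

```python
import gzip


def encode_response(body: bytes, accept_encoding: str):
    """Server side: gzip the body only if the client offered gzip."""
    headers = {"Content-Type": "application/xml"}
    # Simplified parsing: split on commas and drop any ";q=..." parameter.
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in offered:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body


def decode_response(headers: dict, body: bytes) -> bytes:
    """Client side: undo the content encoding announced by the server."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


xml_doc = b"<doc>" + b"<para>hello world</para>" * 200 + b"</doc>"

headers, wire_body = encode_response(xml_doc, "gzip, deflate")
restored = decode_response(headers, wire_body)
print(headers, len(xml_doc), len(wire_body), restored == xml_doc)
```

The application on each side still sees the original XML text; only
the bytes on the wire change.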
4. Encode the XML document as an ASN.1 BER, DER, or PER file, and then
send the ASN.1 BER, DER, or PER file.
Adv: ASN.1 has been around for over 20 years and has been used in many
standards in telecommunications and in other areas of technology.
ASN.1 requires that the two endpoints share a schema that describes
your XML document. The schema can be either an original ASN.1 schema
(one originally written in ASN.1 notation), or one that has been
automatically derived from an XSD schema through the standard X.694
mapping. In addition, ASN.1 requires that your XML document be
completely valid (it doesn't tolerate deviations from the schema). If
your application uses schemas and does not need to support anything
other than valid XML documents, then ASN.1 BER/PER can be a good choice
because it's very fast and compact.
5. Encode the XML document as a Fast Infoset file, and then send the FI
file.
Fast Infoset, which does not depend on schemas, usually achieves good
compactness and acceleration effortlessly. Its performance can be
improved (especially for short documents) if the sender of the XML
document has some a-priori knowledge about the document, e.g., partial
or total conformance to a certain schema, or even just some statistical
properties of the XML document, such as a list of expected frequent
element names, attribute values, or namespace URIs. In most cases,
performance improvements can be achieved without any requirement that
the receiver know anything about the document before receiving it.
Fast Infoset also supports the use of an external vocabulary (to be
shared among the participants) in order to further improve performance.
As a message encoding for communications, Fast Infoset is considerably
more efficient than text. If you have an infoset in memory and want to
serialize it for transmission then FI offers you compactness, better
performance and is nicely interoperable too. FI-encoded OVAL docs (for
example... :) are a quarter of the size of text-encoded OVAL docs. And
since the processing penalty of compression is proportional to doc
size, using FI instead of text makes sense even when doing http+gzip.
However if you already have a text-encoded file in your hands and just
want to send it to the other side, gzipping it is probably the best
option of all.
Fast Infoset can be used with any XML document.
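To make the vocabulary idea concrete, here is a toy sketch. It is NOT
the Fast Infoset encoding, just an illustration of replacing repeated
element names with indexes into a table, with an optional pre-shared
vocabulary. The functions encode and decode are hypothetical, and the
sketch ignores attributes, namespaces, and mixed content.

```python
import xml.etree.ElementTree as ET


def encode(xml_text: str, shared_vocab=()):
    """Replace element names with table indexes.

    Returns (events, new_names), where new_names lists the names that
    were not in shared_vocab and so must be sent along with the events.
    shared_vocab is assumed to contain no duplicates.
    """
    table = {name: i for i, name in enumerate(shared_vocab)}
    new_names = []
    events = []

    def index_of(name):
        if name not in table:
            table[name] = len(table)
            new_names.append(name)
        return table[name]

    def walk(elem):
        events.append(("start", index_of(elem.tag), (elem.text or "").strip()))
        for child in elem:
            walk(child)
        events.append(("end",))

    walk(ET.fromstring(xml_text))
    return events, new_names


def decode(events, new_names, shared_vocab=()):
    """Rebuild the element tree from indexed events."""
    names = list(shared_vocab) + list(new_names)
    stack, root = [], None
    for event in events:
        if event[0] == "start":
            elem = ET.Element(names[event[1]])
            elem.text = event[2] or None
            if stack:
                stack[-1].append(elem)
            else:
                root = elem
            stack.append(elem)
        else:
            stack.pop()
    return root


doc = "<order><item>a</item><item>b</item></order>"

# No prior knowledge: every distinct name is sent once, in-band.
events, new_names = encode(doc)
print(new_names)

# Pre-shared vocabulary: no names travel with the message at all.
vocab = ("order", "item")
events2, new_names2 = encode(doc, vocab)
print(new_names2)
print(ET.tostring(decode(events2, new_names2, vocab)))
```

In this toy, each distinct name travels at most once, and a shared
vocabulary removes even that cost, which is why it matters most for
short documents.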
6. Encode the XML as an Efficient XML Interchange file, and then send
the EXI file.
EXI can be used with any XML document.
/Roger