OASIS Mailing List Archives

 



 


 

   String interning (Was: [xml-dev] Binary XML == "spawn of the devil" ?)


As other people have already remarked, performance comparisons
between a binary and textual format should not be based on message
size alone.

In some applications the operation to be performed is itself very
simple and fast. In such cases, the time required to extract the
input information from the input document dominates the total
running time. A binary format can reduce that extraction time.

A parser for a binary format can efficiently produce a data model
in which all identifiers are interned. This optimization speeds up
lookup operations, since comparing pointers is much faster than
comparing text strings.
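
To make the idea concrete, here is a minimal sketch in Python. The
byte layout is hypothetical and invented for illustration only (it
is not the Waterken Doc code layout): each identifier is written
once in a string table, and later occurrences are 2-byte indexes
into that table. The names encode, decode and TITLE are assumptions
of this sketch. The decoder interns each table entry once, so every
later occurrence is the same object and can be compared by identity.

```python
import struct
import sys


def encode(names, refs):
    """Hypothetical layout: name count, length-prefixed UTF-8 names,
    reference count, then one 2-byte table index per reference."""
    out = [struct.pack(">H", len(names))]
    for name in names:
        raw = name.encode("utf-8")
        out.append(struct.pack(">H", len(raw)) + raw)
    out.append(struct.pack(">H", len(refs)))
    for index in refs:
        out.append(struct.pack(">H", index))
    return b"".join(out)


def decode(data):
    """Return the identifiers in document order, all interned."""
    offset = 0
    (count,) = struct.unpack_from(">H", data, offset)
    offset += 2
    table = []
    for _ in range(count):
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        text = data[offset:offset + length].decode("utf-8")
        offset += length
        # Each distinct identifier is decoded and interned exactly once.
        table.append(sys.intern(text))
    (ref_count,) = struct.unpack_from(">H", data, offset)
    offset += 2
    identifiers = []
    for _ in range(ref_count):
        (index,) = struct.unpack_from(">H", data, offset)
        offset += 2
        # A repeated identifier costs one table lookup, not a new string.
        identifiers.append(table[index])
    return identifiers


TITLE = sys.intern("title")
message = encode(["title", "author"], [0, 1, 0, 0])
identifiers = decode(message)

# Identity comparison ("is") stands in for the pointer comparison.
title_count = sum(1 for name in identifiers if name is TITLE)
```

A textual parser can intern as well, but it has to scan and hash the
characters of every occurrence first. With a string table in the
format, that work is done once per distinct identifier.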

For an example of a binary format that supports efficient string
interning, without a penalty to generality, see:

http://www.waterken.com/dev/Doc/code/

For one application, the E project <http://www.erights.org/>, this
optimization was a primary reason for choosing Waterken Doc code
over competing formats.

I think this technique could be valuable in an XML binary syntax.
At the very least, it's worth considering the potential
performance gains.

Tyler




 


Copyright 2001 XML.org. This site is hosted by OASIS