On Thu, 31 Jul 2003 17:46:32 -0400, Tyler Close <tyler@waterken.com> wrote:
>
> For an example of a binary format that supports efficient string
> interning, without a penalty to generality, see:
>
> http://www.waterken.com/dev/Doc/code/
Very interesting point/idea. AFAIK much of the overhead of XML text
parsing that the binary infoset advocates complain about is in the Unicode
encoding/decoding and raw string processing (e.g., looking at every
character to see where an element ends rather than having a stored length).
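To make the stored-length point concrete, here is a minimal sketch in Python. The function names and the 4-byte big-endian length prefix are assumptions chosen for illustration; they are not taken from any particular binary infoset proposal.

```python
import struct

def read_delimited(buf, start, delim=ord("<")):
    """Text-style read: inspect every byte until the delimiter turns up."""
    end = start
    while end < len(buf) and buf[end] != delim:
        end += 1
    return buf[start:end], end

def read_length_prefixed(buf, start):
    """Binary-style read: take a stored length, then slice the payload directly."""
    (length,) = struct.unpack_from(">I", buf, start)  # hypothetical 4-byte prefix
    body_start = start + 4
    return buf[body_start:body_start + length], body_start + length

text = b"hello world</a>"                           # content ends at the next '<'
framed = struct.pack(">I", 11) + b"hello world"     # content preceded by its length

content_a, pos_a = read_delimited(text, 0)
content_b, pos_b = read_length_prefixed(framed, 0)
```

Both calls recover the same bytes, but the first has to examine each character, while the second does a single fixed-size read followed by a slice.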
Likewise, a number of alternative infoset serializations use the "stream
of SAX events" metaphor, which sounds a bit like what that document
describes.
But that doesn't sound like "string interning" to me (and "interning" is
not mentioned in that document). I thought "interning" was more of a
technique for keeping compiled code small by referencing redundant strings
via their hash values. That basic idea is certainly used in all sorts of
compression schemes... but are you sure that's what the Waterken
serialization model is doing?
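For comparison, here is a minimal sketch of interning in the sense of keeping one copy of each distinct string and referring to it by a small index. The `InternTable` class and the `encode_names`/`decode_names` helpers are hypothetical names made up for this example. The sketch says nothing about what the Waterken format actually does.

```python
class InternTable:
    """Keeps one copy of each distinct string and hands out integer references."""

    def __init__(self):
        self._index = {}    # string -> position in self.strings
        self.strings = []   # each distinct string, in first-seen order

    def intern(self, s):
        idx = self._index.get(s)
        if idx is None:
            idx = len(self.strings)
            self._index[s] = idx
            self.strings.append(s)
        return idx

def encode_names(names):
    """Replace every name with its table index; return (table, references)."""
    table = InternTable()
    refs = [table.intern(n) for n in names]
    return table.strings, refs

def decode_names(strings, refs):
    """Rebuild the original sequence from the table and the references."""
    return [strings[i] for i in refs]

names = ["item", "price", "item", "price", "item"]
strings, refs = encode_names(names)
```

In a document with many repeated element names, each name would be stored once in the table and every later occurrence would cost only a small integer.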
I suppose the Waterken(TM) stuff is IP-encumbered up the wazoo, eh?