xml-dev - Re: SAX2: Interning names in namespaces


From: terje@in-progress.com (Terje Norderhaug)
To: xml-dev <xml-dev@xml.org>
Date: Tue, 8 Feb 2000 16:02:16 -0800

At 1:59 AM 2/8/00, Stefan Haustein wrote:
>Terje Norderhaug wrote:
>>
>> A "programmer" understands the implications of the namespace recommendation
>> and are savvy in object oriented programming. Programmers are free to use
>>
>>   if (name == myPrice) ....
>>   else if (name == myQtyty) ....
>>
>> Unless names are interned in namespaces,
>> specialists would be limited to the less efficient techniques of beginners
>> and scripters when using SAX.
>
>Assume we have m namespaces and n elements each, and the SAX java
>intering feature is activated. Your "programmers" code needs up to n*m
>comparisons to find the right element. The "scripters" code
>needs up to n+m comparisons only, so I would not call your "programmers"
>code more efficient.

The "programmers" can choose whichever implementation is most favorable.
Thus, even in a worst-case scenario like the one you describe, the
"programmers" code will always be at least as efficient as that of the
"scripters".

However, using an n*m-sized if/else construct would be a rather clumsy
implementation. A switch/case would be more readable and would allow the
compiler to optimize for speed by reducing the number of comparisons.
Besides, a smart compiler may recognize the constants in the if/else chain
and compile it as if it were a switch anyway, instead of as a sequence of
n*m comparisons.

Ensuring a unique instance of each element type particularly benefits
processing and filtering of markup. It allows advanced processors to use
techniques that wouldn't be feasible or efficient enough otherwise.
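
To make "a unique instance of each element type" concrete, here is a minimal
Java sketch of interning names per namespace. The class InternedName, its
intern method, and the example namespace URIs are hypothetical and not part
of SAX; the point is only that once each (namespace URI, local name) pair
maps to one shared object, names can be compared with == as in the snippet
quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name object: one shared instance per (namespace URI, local name).
final class InternedName {
    final String uri;    // namespace URI
    final String local;  // local part of the name

    private InternedName(String uri, String local) {
        this.uri = uri;
        this.local = local;
    }

    // One inner table per namespace URI, mapping local names to their unique instance.
    private static final Map<String, Map<String, InternedName>> TABLE = new HashMap<>();

    // Returns the shared instance for (uri, local), creating it on first use.
    static synchronized InternedName intern(String uri, String local) {
        return TABLE.computeIfAbsent(uri, u -> new HashMap<>())
                    .computeIfAbsent(local, l -> new InternedName(uri, local));
    }
}

class InternDemo {
    // Assumed example namespace; any URI would do.
    static final InternedName MY_PRICE = InternedName.intern("http://example.com/shop", "price");
    static final InternedName MY_QTY = InternedName.intern("http://example.com/shop", "qty");

    // Identity comparison is enough because every name is interned.
    static String classify(InternedName name) {
        if (name == MY_PRICE) return "price";
        else if (name == MY_QTY) return "quantity";
        else return "other";
    }

    public static void main(String[] args) {
        // Same namespace and local name: the very same object comes back.
        System.out.println(classify(InternedName.intern("http://example.com/shop", "price")));
        // Same local name in another namespace: a different object.
        System.out.println(classify(InternedName.intern("http://example.com/other", "price")));
    }
}
```

Note that the two "price" names stay distinct because the namespace is part
of the interning key, which is what interning local names alone would not
give you.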

Let me illustrate some advanced markup filtering techniques by using a real
world example. I have a filter called the Stylizer [1] that emulates a CSS
style sheet for the benefit of legacy browsers. The Stylizer adds
presentational tags to HTML at serving time based on a style sheet. It
works by receiving startElement and endElement events, checking the style
sheet for matching selectors, then adding attributes or calling
startElement methods to insert presentational markup into the event stream.

The Stylizer uses different processing depending on the element type. I
implemented this by specializing the startElement method on the unique
interned instance of the element name. For example, the Stylizer has a
startElement method that is specialized on the BODY element type so that it
is activated only when the element type matches BODY. When dispatched, the
startElement method for BODY looks up the appropriate selector in the
internal representation of the style sheet (e.g. "BODY"), adds new
attributes for style properties like color, and eventually calls
startElement one or more times for adding presentational tags like CENTER.
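
For readers who want something concrete, the following is a rough Java
approximation of dispatching on the unique instance of an element type. It
is not the Stylizer's code: ElementType, StylizerSketch, the event
representation (plain strings), and the TEXT color value are all assumptions
made up for illustration. Java cannot specialize a method on a single
object, so the sketch lets the unique BODY instance override the default
behaviour instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical element type object; one instance exists per element type.
class ElementType {
    final String name;

    ElementType(String name) {
        this.name = name;
    }

    // Default behaviour: pass the start tag through unchanged.
    void startElement(List<String> out) {
        out.add("<" + name + ">");
    }
}

class StylizerSketch {
    // The unique BODY instance carries its own specialized startElement.
    static final ElementType BODY = new ElementType("BODY") {
        @Override
        void startElement(List<String> out) {
            // Stand-in for looking up the "BODY" selector in a style sheet;
            // the color value is invented for this example.
            out.add("<BODY TEXT=\"#000000\">");
            // Insert presentational markup into the event stream.
            out.add("<CENTER>");
        }
    };

    // An element type with no specialization uses the default method.
    static final ElementType P = new ElementType("P");

    // Runs each start event through the method belonging to its element type.
    static List<String> filter(List<ElementType> events) {
        List<String> out = new ArrayList<>();
        for (ElementType e : events) {
            e.startElement(out);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(filter(Arrays.asList(BODY, P)));
    }
}
```

The dispatch costs one virtual call per event and needs no name comparison
at all, but it only works if the parser hands over the same BODY object
every time it sees that element type.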

A related technique is to use a hash table to dispatch various processing
rules depending on the element type. The hash table is populated with
processing rules, possibly loaded at compile time or run time from a style
sheet or other external definition. At run time, the startElement method
looks up the element type in the hash table and processes the event based
on the rule. Given namespace-interned element types, the hash table lookup
can be very efficient, as the hashing is on identity. Without
namespace-interned element types, one might be required to maintain a
separate hash table for each namespace, and lookups will be much slower as
the tables have to be keyed on equality.
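
The identity-keyed table can be sketched in Java with
java.util.IdentityHashMap, which compares keys with == rather than equals.
TypeName, RuleTable, and the rule signature (a function from tag text to
replacement text) are hypothetical choices for this example, and the
namespace URI is a placeholder.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical element type name; assumed to be interned by whoever creates it.
final class TypeName {
    final String uri;
    final String local;

    TypeName(String uri, String local) {
        this.uri = uri;
        this.local = local;
    }
}

class RuleTable {
    // Keys are compared by identity, so no string comparison happens on lookup.
    private final Map<TypeName, UnaryOperator<String>> rules = new IdentityHashMap<>();

    // Registers a processing rule for one element type.
    void addRule(TypeName type, UnaryOperator<String> rule) {
        rules.put(type, rule);
    }

    // Applies the rule for this element type, or passes the tag through if none exists.
    String startElement(TypeName type, String tag) {
        UnaryOperator<String> rule = rules.get(type);
        return rule == null ? tag : rule.apply(tag);
    }

    public static void main(String[] args) {
        TypeName body = new TypeName("urn:example:html", "BODY");
        RuleTable table = new RuleTable();
        table.addRule(body, tag -> tag + "<CENTER>");

        // The interned instance finds its rule.
        System.out.println(table.startElement(body, "<BODY>"));
        // An equal-looking but distinct instance does not, which is why
        // this approach depends on names being interned.
        System.out.println(table.startElement(new TypeName("urn:example:html", "BODY"), "<BODY>"));
    }
}
```

A single table serves every namespace here, because two element types with
the same local name in different namespaces are simply two different keys.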

[1] http://interaction.in-progress.com/components/stylizer

-- Terje Norderhaug <terje@in-progress.com>

President & Chief Technologist
Media Design in*Progress
San Diego, California

Software for Mac Web Professionals at <http://www.in-progress.com>.
Take advantage of XML with Emile, the first XML editor for Mac.
