xml-dev - Re: SAX2: Interning names in namespaces

From: terje@in-progress.com (Terje Norderhaug)
To: "xml-dev@xml.org" <xml-dev@xml.org>
Date: Fri, 4 Feb 2000 18:33:32 -0800

At 10:59 AM 2/4/00, John Cowan wrote:
>Terje Norderhaug wrote:
>
>> For each namespace, the parser may use a weak hashtable indexed on
>> equality.
>
>I think that requiring weak references is unreasonable for a cross-platform
>API.

Agreed. It was meant only as an example of how it *could* be implemented.

My proposal is, in essence, that SAX2 establish the following invariants:

1. Two equal names are ALWAYS identical (==) if they are in the same namespace.
2. Two equal names are NEVER identical if they are NOT in the same namespace.

These invariants allow a wide range of implementations, including the
hash tables in my example. The names can be maintained as strings (which
many seem to prefer), but other implementations are also possible
without breaking the invariants.
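
As a minimal sketch of one implementation that satisfies both invariants, here is a Java version that uses ordinary (strong) hash tables, one per namespace. The class name NamespaceNameTable, the method intern and the example URIs are hypothetical and not part of SAX2; the sketch also assumes names are kept as java.lang.String objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-namespace name table; an illustration, not a SAX2 API. */
final class NamespaceNameTable {
    // namespace URI -> (name looked up by equality -> canonical instance)
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    /** Returns the one canonical instance of localName within namespaceUri. */
    String intern(String namespaceUri, String localName) {
        Map<String, String> names =
            tables.computeIfAbsent(namespaceUri, uri -> new HashMap<>());
        // Copy on first sight so the canonical object is never shared
        // with another namespace (invariant 2).
        return names.computeIfAbsent(localName, n -> new String(n));
    }

    public static void main(String[] args) {
        NamespaceNameTable table = new NamespaceNameTable();
        String a = table.intern("urn:example:a", "title");
        String b = table.intern("urn:example:a", new String("title"));
        String c = table.intern("urn:example:b", "title");
        System.out.println(a == b);      // same namespace: identical
        System.out.println(a == c);      // different namespaces: not identical
        System.out.println(a.equals(c)); // still equal character by character
    }
}
```

Because each table always hands out its own copy, a client can compare names from the same namespace with == and can never confuse them with an equal name from another namespace.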

>> The implementation results in that each namespace has its own copy of the
>> name.
>
>I agree that this would be a useful property, and I suggest that it
>be exposed as a standard SAX2 feature, perhaps named
>"names-interned-namespaces".
>
>> Parts of the namespace handling in SAX can be simplified if it can assume
>> that the parsers interns each name in its own namespace. It can eliminate
>> the need for passing namespace information as a separate argument to
>> methods or encoded in the name string.
>
>Unless there is a mechanism for recovering the namespace name from the
>name, this will not work well, as application behavior will depend on
>particular namespace names.

Given that the invariant above holds, the namespace can be recovered from
the name by maintaining a hash table indexed on *identity* with the name as
key and a namespace as value. A name/namespace association can be added to
this hash table when the name is first interned in a namespace.

The crucial detail is that the hashing is on identity instead of equality,
which means that two equal but not identical names hash differently.
That is, the hash key is generated from the reference to the name, not the
characters in the name. Hash tables indexed on identity are typically faster
than hash tables indexed on equality, since neither hashing nor comparison
has to examine the characters of the name.
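
The following Java sketch shows one way this could look, under stated assumptions: names are java.lang.String objects, and java.util.IdentityHashMap (which compares keys with == rather than equals()) plays the role of the identity-indexed table. The names NamespaceRecoveringTable, intern and namespaceOf are hypothetical and not part of SAX2.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical name table that can map an interned name back to its namespace. */
final class NamespaceRecoveringTable {
    // namespace URI -> (name looked up by equality -> canonical instance)
    private final Map<String, Map<String, String>> tables = new HashMap<>();
    // canonical name instance (compared by reference) -> namespace URI
    private final Map<String, String> owners = new IdentityHashMap<>();

    /** Returns the canonical instance of localName within namespaceUri. */
    String intern(String namespaceUri, String localName) {
        Map<String, String> names =
            tables.computeIfAbsent(namespaceUri, uri -> new HashMap<>());
        String canonical = names.get(localName);
        if (canonical == null) {
            canonical = new String(localName); // fresh object, unique to this namespace
            names.put(canonical, canonical);
            owners.put(canonical, namespaceUri); // recorded when first interned
        }
        return canonical;
    }

    /** Returns the namespace of an interned name, or null for any other object. */
    String namespaceOf(String name) {
        return owners.get(name);
    }

    public static void main(String[] args) {
        NamespaceRecoveringTable table = new NamespaceRecoveringTable();
        String a = table.intern("urn:example:a", "title");
        String c = table.intern("urn:example:b", "title");
        System.out.println(table.namespaceOf(a));
        System.out.println(table.namespaceOf(c));
        // An equal string that was never interned has no namespace here.
        System.out.println(table.namespaceOf("title"));
    }
}
```

The lookup in namespaceOf only works because the two invariants hold: an identity-indexed table would be useless if the same object could stand for a name in more than one namespace.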

Note that this is just an example of an implementation for recovering
namespaces from names. Parsers have many alternative ways to achieve the
same. I suggest that SAX2 limit its recommendation to a possible
interface for recovering the namespace from a name, without describing any
implementation details.

-- Terje <terje@in-progress.com> | Media Design in*Progress

   Software for Mac Web Professionals at <http://www.in-progress.com>
   Take advantage of XML with Emile, the first XML editor for Mac!
