   SAX2: Namespace Processing and NSUtils helper class

  • From: David Megginson <david@megginson.com>
  • To: XMLDev list <xml-dev@ic.ac.uk>
  • Date: Wed, 15 Dec 1999 09:54:12 -0500 (EST)

OK, back to SAX2 for now.  I'm doing some serious projects with RDF
and Namespaces right now, so I've done a lot of thinking about how we
can make SAX2 Namespace processing both efficient and
backwards-compatible.

I'm pretty sure that the best choice is to use James Clark's
{URI}localpart notation for Namespace-qualified names, so that an
XHTML <p> element (for example) will be reported as
"{http://www.w3.org/1999/xhtml}p".

Unfortunately, that creates some potential inefficiencies, especially
for Java, which is painfully slow at string processing (compared to
C/C++).  To work around this problem, I've designed a new SAX2 helper
class, NSUtils, with the following static methods:

  public static boolean isQualified (String name)
  public static String [] splitName (String name)
  public static String joinName (String uri, String local)

The first of these is very simple -- it just checks whether the first
character is '{' (as it always must be for a qualified name).  The
other two, however, use static hashtables to cache their work, so that 
they're pretty efficient to call over and over again.
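
As a rough illustration of the first-character check described above (a sketch only, not the NSUtils source; the class name QualifiedNameCheck and the handling of an empty string are assumptions):

```java
// Hypothetical holder class; the real helper is described as NSUtils.
final class QualifiedNameCheck {
    private QualifiedNameCheck() {}

    /**
     * Returns true if the name looks like {URI}localpart, judged only by
     * its first character. An empty name is treated as unqualified
     * (an assumption; the post does not cover that case).
     */
    static boolean isQualified(String name) {
        return name.length() > 0 && name.charAt(0) == '{';
    }
}
```

Because only one character is inspected, the check costs the same no matter how long the Namespace URI is.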

For example, the first time you call

  splitName("{http://www.w3.org/1999/xhtml}p")

the method will use java.lang.String.indexOf and
java.lang.String.substring to pick out the URI part
"http://www.w3.org/1999/xhtml" and the local part "p" and will return
them as a two-member String array, which it will also store in a
Hashtable.

The next time (or 1,000 times) you call

  splitName("{http://www.w3.org/1999/xhtml}p")

the method will find the string already in the hash table and will
return the same two-member array that it returned last time (or should
it be a copy?  I wish Java had const) without repeating any of the
expensive string operations.
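
A minimal sketch of that caching behaviour, for illustration only (this is not the NSUtils source; the class name SplitNameSketch, the treatment of unqualified names, and the exception for a missing '}' are all assumptions):

```java
import java.util.Hashtable;

// Hypothetical holder class; the real helper is described as NSUtils.
final class SplitNameSketch {
    // Maps each "{URI}localpart" string to its already-split form.
    private static final Hashtable<String, String[]> SPLIT_CACHE =
        new Hashtable<String, String[]>();

    private SplitNameSketch() {}

    /** Returns { uri, local }; repeated calls return the cached array. */
    static synchronized String[] splitName(String name) {
        String[] parts = SPLIT_CACHE.get(name);
        if (parts == null) {
            if (name.length() > 0 && name.charAt(0) == '{') {
                int close = name.indexOf('}');
                if (close < 0) {
                    // Assumption: a name without a closing brace is an error.
                    throw new IllegalArgumentException("Missing '}' in " + name);
                }
                parts = new String[] {
                    name.substring(1, close),   // URI part
                    name.substring(close + 1)   // local part
                };
            } else {
                // Assumption: an unqualified name gets an empty URI part.
                parts = new String[] { "", name };
            }
            SPLIT_CACHE.put(name, parts);
        }
        return parts;
    }
}
```

This version hands back the shared array, so a caller that modifies it would corrupt the cache for everyone else; returning parts.clone() would avoid that at the cost of one small allocation per call.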

I use a similar approach for joinName(), which makes writing a
NamespaceFilter extremely efficient.
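
One way a cached joinName() could look, again only as a sketch (the two-level table keyed first by URI and then by local part is an assumption, as is the class name JoinNameSketch; the post does not say how the cache is organised):

```java
import java.util.Hashtable;

// Hypothetical holder class; the real helper is described as NSUtils.
final class JoinNameSketch {
    // Outer key: Namespace URI. Inner key: local part. Value: "{uri}local".
    private static final Hashtable<String, Hashtable<String, String>> JOIN_CACHE =
        new Hashtable<String, Hashtable<String, String>>();

    private JoinNameSketch() {}

    /** Returns "{uri}local", building the string only on the first request. */
    static synchronized String joinName(String uri, String local) {
        Hashtable<String, String> byLocal = JOIN_CACHE.get(uri);
        if (byLocal == null) {
            byLocal = new Hashtable<String, String>();
            JOIN_CACHE.put(uri, byLocal);
        }
        String joined = byLocal.get(local);
        if (joined == null) {
            joined = "{" + uri + "}" + local;   // the only concatenation
            byLocal.put(local, joined);
        }
        return joined;
    }
}
```

Keying on the two parts separately means a cache hit needs two lookups but no concatenation, and Java strings are immutable, so the copy question raised for splitName() does not come up here.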

Does this sound like a reasonable approach to the Java-heads out
there?  I'll send the source out in a separate message, since it's
only three screens or so.


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/
