   Re: SAX: String Internalisation and a CORBA/DCOM Question

  • From: James Clark <jjc@jclark.com>
  • To: Xml-Dev List <xml-dev@ic.ac.uk>
  • Date: Sun, 19 Apr 1998 12:28:28 +0700

David Megginson wrote:
> Here's another last-minute SAX question: should org.xml.sax.Parser
> expose a method for internalising strings?
>   public abstract String intern (String s);

Absolutely not.

> Most Java-based parsers, at least, already use some type of
> internalisation (but not, usually, the inefficient
> java.lang.String.intern() method) for names -- the SAX driver could
> expose this functionality if support is already there, or do its own
> internalising if support is absent.

That would be a significant performance hit on SAX use with parsers that
don't do internalisation.  XP does not do this sort of internalisation
because it would make it slower.

> As someone has already pointed out, internalised strings will make a
> dramatic difference for the speed of applications, since applications
> can use a simple '==' operator (or the local equivalent) to test for
> equality rather than a slow subroutine like java.lang.String.equals().

Doing lots of comparisons on the type of each element, whether using
equals or ==, is not a good way to write an efficient application.  It's
typically better to have a hash table that associates each element type
with either an integer (which you can then use in a switch statement) or
an object (which you then make a method call on).
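
As a minimal sketch of the hash-table approach described above (the
class name, element names, and integer codes are all hypothetical, and
it uses current Java collections rather than anything available in
1998), an application might map each element type to an integer once
and then switch on it:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-side dispatcher; none of these names are part of SAX.
class ElementDispatcher {
    static final int UNKNOWN = 0;
    static final int PARA = 1;
    static final int TITLE = 2;

    // Maps each element type name to a small integer code.
    private final Map<String, Integer> codes = new HashMap<>();

    ElementDispatcher() {
        codes.put("para", PARA);
        codes.put("title", TITLE);
    }

    // One hash lookup replaces a chain of equals() or == comparisons.
    String describe(String elementType) {
        Integer code = codes.get(elementType);
        switch (code == null ? UNKNOWN : code.intValue()) {
            case PARA:
                return "paragraph";
            case TITLE:
                return "title";
            default:
                return "unknown";
        }
    }
}
```

The same table could instead hold handler objects, in which case the
switch becomes a single method call on the looked-up object.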

This could be done a little more efficiently with help from the parser. 
For example, you could have a method on SAXParser

  setElementTypeUserData(String elementType, Object userData);

Then startElement() and endElement() in SAXDocumentHandler could have an
additional Object userData argument.

This would allow apps to do something like:

void startElement(String name, Object userData, SAXAttributeList atts) {
  switch (((Integer)userData).intValue()) {
  ...
  }
}

void startElement(String name, Object userData, SAXAttributeList atts) {
  ...
}

I don't think it's worth the complexity.
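
For illustration only, the userData idea above could be sketched as
follows.  Everything here is an assumption: UserDataHandler,
UserDataParserSketch, fireStartElement, and CountingHandler are
made-up names, the attribute list argument is omitted for brevity, and
the toy "parser" still does its own hash lookup, whereas a real parser
would presumably attach the user data to whatever per-name record it
already keeps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical callback interface; the userData argument is not part of SAX.
interface UserDataHandler {
    void startElement(String name, Object userData);
}

// Toy stand-in for a parser that remembers user data per element type.
class UserDataParserSketch {
    private final Map<String, Object> userDataByType = new HashMap<>();

    // Corresponds to the setElementTypeUserData method proposed above.
    void setElementTypeUserData(String elementType, Object userData) {
        userDataByType.put(elementType, userData);
    }

    // Simulates the parser reporting a start tag to the application.
    void fireStartElement(String name, UserDataHandler handler) {
        handler.startElement(name, userDataByType.get(name));
    }
}

// Example application handler that switches on the registered integer.
class CountingHandler implements UserDataHandler {
    static final int PARA = 1;
    int paragraphs = 0;

    public void startElement(String name, Object userData) {
        if (userData == null) {
            return; // no user data registered for this element type
        }
        switch (((Integer) userData).intValue()) {
            case PARA:
                paragraphs++;
                break;
            default:
                break;
        }
    }
}
```

An application would call setElementTypeUserData("para",
Integer.valueOf(CountingHandler.PARA)) once before parsing; element
types with no registered data arrive with a null userData.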

> By the way, here's the minimum list of what should be internalised in
> the callbacks from the SAX parser:

SAX should not require the internalization of anything.



