   Re: String interning (WAS: SAX2/Java: Towards a final form)

  • From: Tyler Baker <tyler@infinet.com>
  • To: Assaf Arkin <arkin@exoffice.com>
  • Date: Mon, 17 Jan 2000 18:00:07 -0500

Assaf Arkin wrote:

> Tyler,
>
> I am aware of how to perform interning. I wrote OpenXML which performs
> interning for SAX and DOM, and I'm a contributing member of XML Apache,
> so I'm also familiar with their mechanism.
>
> Yet, aside from parsers and DOMs I use SAX in a variety of applications
> that do not perform String interning, nor is there any benefit for them
> to do so. I'm afraid that mandating interning will simply break these
> (and many other) applications.

It won't break SAX 1.0 applications, because interning is not a mandated feature there. For SAX
2.0 implementations, these applications will need to support the SAX 2.0 API anyway. Supporting
interned Strings is mostly trivial regardless of the application, but the benefits at the
application level can be immense if performance is at all a consideration. Really, it depends on
the size of your document. For web browsers, interning makes little difference because the
documents are not that large anyway; there, I/O is pretty much always the bottleneck rather than
the parser, even if the parser is very inefficient.

> Also, both OpenXML and Xerces use their internal interning mechanism
> which is substantially faster than String.intern, especially for dealing
> with DOM and parsing, however, the following will never work in either
> OpenXML or Xerces:
>
> if ( tagName == "foo" )
>
> for the simple reason that their interning mechanism and String.intern do
> not share the same table.
>
> arkin
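
To make the point above concrete, here is a minimal sketch of why a reference comparison against a literal fails when the parser interns through its own table. The PrivateInternDemo class and its privateIntern method are hypothetical stand-ins that I made up for illustration; they are not the actual OpenXML or Xerces mechanism.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a parser's private intern table; this is NOT the
// actual OpenXML or Xerces implementation.
public class PrivateInternDemo {
    private static final Map<String, String> TABLE = new HashMap<String, String>();

    // Returns one canonical instance per distinct value, private to this table.
    public static String privateIntern(String s) {
        String canonical = TABLE.get(s);
        if (canonical == null) {
            TABLE.put(s, s);
            canonical = s;
        }
        return canonical;
    }

    public static void main(String[] args) {
        // Simulate a name built from the parser's character buffer.
        char[] buffer = {'f', 'o', 'o'};
        String tagName = privateIntern(new String(buffer));
        String again = privateIntern(new String(buffer));

        System.out.println(tagName == again);          // same private table: true
        System.out.println(tagName == "foo");          // literal lives in the JVM table: false
        System.out.println(tagName.equals("foo"));     // value comparison: true
        System.out.println(tagName.intern() == "foo"); // after String.intern(): true
    }
}
```

Names from the private table are identical to each other, but only String.intern() yields the instance that a literal such as "foo" refers to.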

The entire point of using String.intern() is to make the application that uses the parser
framework faster, without forcing you to write code like this:

public static final String CONSTANT = GlobalStringInternTable.intern("foo");

As a developer, I prefer to use as few proprietary hooks as I possibly can. I think using some
GlobalStringInternTable would only make sense for namespace support, and only if you had a
parser framework that presented the application with a Name object instead of three strings
consisting of the prefix, namespace, and local part.

I guess it is mostly an argument about what you want the application developer to deal with. I
prefer the approach that gives me maximum performance without any obscure coding against some
proprietary string table interface.
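
As a rough sketch of the application-side code I have in mind, assuming the parser interns every name with String.intern() before reporting it: the InternedNameDemo class and its reportName and isFoo methods below are hypothetical names for illustration only and are not part of SAX.

```java
public class InternedNameDemo {
    // A plain literal is already in the JVM's intern table, so declaring the
    // constant needs no proprietary table lookup.
    public static final String FOO = "foo";

    // Hypothetical parser-side step: build the name from the character
    // buffer and intern it once before reporting it to the application.
    public static String reportName(char[] buf, int start, int length) {
        return new String(buf, start, length).intern();
    }

    // Application-side check: a reference comparison, valid only because the
    // parser is assumed to intern every name it reports.
    public static boolean isFoo(String tagName) {
        return tagName == FOO;
    }

    public static void main(String[] args) {
        char[] doc = "<foo><bar/></foo>".toCharArray();
        System.out.println(isFoo(reportName(doc, 1, 3))); // "foo": true
        System.out.println(isFoo(reportName(doc, 6, 3))); // "bar": false
    }
}
```

The application pays for one pointer comparison per name check, and the only contract it relies on is the standard String.intern() behavior.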

Regards,

Tyler


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.


Copyright 2001 XML.org. This site is hosted by OASIS