OASIS Mailing List Archives

   re: SAX2 and Symmetrical Treatment of Data

  • From: David Megginson <david@megginson.com>
  • To: xml-dev@lists.xml.org
  • Date: Fri, 22 Sep 2000 14:02:12 -0400 (EDT)

Alex Milowski writes:

 > In the ContentHandler interface, there is a method called characters()
 > which allows the processor to pass the character data that is a child
 > of an element to a processing application.  If you introduce XML Schemas,
 > this allows one to create a streaming type factory to construct the
 > actual type instance without having to first instantiate a Java
 > string--which is very good from an optimization standpoint.

Yes, although Java Strings are much more efficient than they used to
be, at least in the Linux VMs.  I remember running some tests a
couple of years ago when Tim Bray suggested that string allocation was
expensive, and the overhead of allocating thousands of strings turned
out to be negligible.  I think that JDK 1.1 must have fixed some
problems there.
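
For readers who want the idea under discussion made concrete, the
following is a minimal sketch of building a typed value directly from
the characters() buffer, with no intermediate String.  The class name
StreamingIntHandler and its parse() helper are hypothetical, and the
digit handling is deliberately naive (no overflow checks, no schema
validation).  It assumes only the standard SAX2 DefaultHandler and the
JAXP SAXParserFactory.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates an integer from element content
// without ever turning the character data into a String.
public class StreamingIntHandler extends DefaultHandler {
    private long value;        // running magnitude of the number
    private boolean negative;  // true once a leading '-' has been seen
    private boolean sawDigit;  // guards against a '-' after the digits

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one text node in several chunks,
        // so the state is carried across calls.
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (c == '-' && !sawDigit) {
                negative = true;
            } else if (c >= '0' && c <= '9') {
                value = value * 10 + (c - '0');
                sawDigit = true;
            }
            // Whitespace and any other character is ignored in this sketch.
        }
    }

    public long result() {
        return negative ? -value : value;
    }

    // Convenience wrapper: parse a small document held in a String.
    public static long parse(String xml) {
        try {
            StreamingIntHandler handler = new StreamingIntHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.result();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<count>-1234</count>"));
    }
}
```

Whether this saves anything measurable compared with calling
new String(ch, start, length) and then Long.parseLong() is exactly the
kind of question that needs profiling on the VM in use.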

 > Unfortunately, the same concept does not exist for attributes.  An
 > attribute's value has already been constructed into a Java string before
 > the application can receive the lexical representation.  This seems rather
 > unfortunate for XML Schemas and optimization since the typing of "leaf
 > nodes" within an XML document is uniform for attributes and element child
 > content.

This was a matter of much discussion during the original SAX 1.0
design, and most people preferred it this way.

 > Is it too late to fix this?  This would seriously help in building
 > optimized XML Schema aware processors.

Yes, it's too late to fix this, at least for now -- I intend a bug-fix
release soon, but no major API changes for a while (except extensions,
which are outside the SAX2 core).  I'd be interested in seeing some
profiling data to see how much the string allocation is actually
costing.

Note that a parser (though not a filter, obviously) could perform lazy
allocation of strings -- that might help a bit.
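
As a rough illustration of what lazy allocation could look like inside
a parser, the sketch below keeps only a slice of the parser's character
buffer and creates the String the first time it is requested.  The
class LazyAttributeValue and its allocations() counter are hypothetical
and not part of SAX2; a real parser would wrap something like this
behind its own implementation of the Attributes interface.  The sketch
assumes the underlying buffer is not overwritten until the
startElement() callback returns.

```java
// Hypothetical helper: defers String creation for one attribute value.
public final class LazyAttributeValue {
    private final char[] buffer;  // parser-owned buffer (assumed stable during the callback)
    private final int start;      // offset of the value within the buffer
    private final int length;     // number of characters in the value
    private String cached;        // materialized on first request only
    private int allocations;      // demo instrumentation: Strings created so far

    public LazyAttributeValue(char[] buffer, int start, int length) {
        this.buffer = buffer;
        this.start = start;
        this.length = length;
    }

    // Returns the value as a String, allocating it at most once.
    public String get() {
        if (cached == null) {
            cached = new String(buffer, start, length);
            allocations++;
        }
        return cached;
    }

    public int allocations() {
        return allocations;
    }

    public static void main(String[] args) {
        char[] buf = "id='42'".toCharArray();
        LazyAttributeValue value = new LazyAttributeValue(buf, 4, 2);
        System.out.println(value.allocations()); // nothing allocated before get()
        System.out.println(value.get());
        System.out.println(value.allocations());
    }
}
```

An application that never asks for a given attribute's value would then
never pay for its String.  A filter cannot do the same, because it only
ever sees the Strings the upstream parser has already handed over.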


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/




 


Copyright 2001 XML.org. This site is hosted by OASIS