- From: Tyler Baker <firstname.lastname@example.org>
- To: David Megginson <email@example.com>
- Date: Thu, 04 Feb 1999 19:22:54 -0500
David Megginson wrote:
> Tyler Baker writes:
> > If SAX were to make a simple requirement that all strings that
> > represent symbols (like names) were to be interned then things
> > would be a lot cheaper. The same can be said of the DOM as well.
> The problem is that Java's own intern is so terribly inefficient that
> no serious parser writer will use it (most of them have their own,
> custom interns).
As of JDK 1.1.6 things are not so bad, and Java 2 is a bit better, since interned Strings are
managed under the hood using weak references. The JDK could still do better, though. I suspect
that with a real effort in the Java 2 JVM they could make string interning at least twice as
fast as it currently is. Nevertheless, string interning is a one-time cost, so let's put
that in perspective here.
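To make the "one-time cost" point concrete, here is a minimal sketch in Java. The SymbolTable class is hypothetical, a stand-in for the kind of custom intern a parser might keep. It is not part of SAX, the DOM, or any real parser. The idea is that each name pays for a hash lookup once, and after that comparing two names is a plain reference check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parser-private symbol table (an assumption for illustration,
// not an API of SAX or of any existing parser).
final class SymbolTable {
    private final Map<String, String> symbols = new HashMap<>();

    // Returns one canonical String instance per distinct name.
    // The hash lookup is paid here, once per name occurrence.
    String intern(String name) {
        String canonical = symbols.putIfAbsent(name, name);
        return canonical != null ? canonical : name;
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();

        // Two distinct String objects with the same characters.
        String a = table.intern(new String("Rectangle"));
        String b = table.intern(new String("Rectangle"));

        // After interning, identity comparison is enough.
        System.out.println("custom intern, same instance: " + (a == b));

        // The JDK's own String.intern() gives the same guarantee.
        String c = new String("Rectangle").intern();
        String d = new String("Rectangle").intern();
        System.out.println("String.intern, same instance: " + (c == d));
    }
}
```

If SAX or the DOM guaranteed interned names, application code could use == on element and attribute names instead of equals().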
> Even then, you wouldn't get any help with the "xmlns:" prefix
> matching, which is the costliest part. The most efficient way to do
Very true (ouch, ouch, ouch)...
> namespace processing is directly in the parser (which has to look at
> every attribute name anyway), but my own tests have shown that filter
> layer on top of SAX isn't too bad.
Unfortunately, as is the case with all XML or XSL benchmarks, the test data can vary
enormously. If you have documents that have few elements with attributes (except of course
namespace attributes), then things probably will not be so bad. However, if you have lots of
attributes in elements, then you need to check every single attribute to see if it starts with
"xmlns:" (ouch, ouch, ouch).
So I suppose we should now encourage document designers to model data only as character content
in elements and only use attributes for IDs and namespace declarations.
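The per-attribute check complained about above can be sketched roughly as follows. NamespaceScan and its methods are hypothetical names made up for illustration, not part of SAX. The sketch also treats the bare "xmlns" default declaration as a namespace declaration, which goes slightly beyond the "xmlns:" prefix test mentioned here. The point is only that the test runs once for every attribute on every element, whether or not the document declares any namespaces.

```java
// Hypothetical helper (not part of SAX): the test a namespace filter
// has to run on every attribute name it sees.
final class NamespaceScan {
    // True for "xmlns" (default namespace) and for "xmlns:prefix".
    static boolean isNamespaceDeclaration(String attrName) {
        return attrName.equals("xmlns") || attrName.startsWith("xmlns:");
    }

    // Counts declarations among one element's attribute names.
    // The loop touches every attribute, declaration or not.
    static int countDeclarations(String[] attrNames) {
        int count = 0;
        for (String name : attrNames) {
            if (isNamespaceDeclaration(name)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Attribute names in the style of a Rectangle element,
        // plus one made-up namespace declaration.
        String[] attrs = { "x", "y", "width", "height", "xmlns:g" };
        System.out.println("declarations found: " + countDeclarations(attrs));
    }
}
```

An element with many ordinary attributes pays for the string comparison on each of them, which is why attribute-heavy documents are the bad case.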
For types like a rectangle, I think using attributes makes a lot more sense in the general
case, but in the presence of "Namespaces in XML" I would change things from:
<Rectangle x="0" y="1" width="59" height="23">
The really sad thing about this is that there tends to be a feeling among a lot of people that
meaningful prefixes do not matter at all. If XML is ever going to be editable by an average
internet user for some common tasks, meaningful prefixes do matter.
xml-dev: A list for W3C XML Developers. To post, mailto:firstname.lastname@example.org
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:email@example.com the following message;
To subscribe to the digests, mailto:firstname.lastname@example.org the following message;
List coordinator, Henry Rzepa (mailto:email@example.com)