OASIS Mailing List Archives

   Re: [xml-dev] XPath 1.5? (was RE: [xml-dev] typing and markup)


Hi Jonathan,

> Static analysis never costs anything at run time. Static analysis
> can also be used to rewrite code to be more efficient at run time.
> So static typing, the typing people seem to be complaining about the
> most, never degrades performance, and often helps performance.

I think that David's point about the differences between how XQuery
and XSLT are used is pertinent here. XSLT stylesheets tend to be
compiled afresh each time they are used, whereas XQuery queries can
(possibly) be compiled once and run over the same set of data
multiple times. I think the reason people believe static typing
degrades performance is that in XSLT static analysis is usually *part
of* run time -- compilation and execution are part of the same process.

Certainly there are cases where stylesheets are compiled once and used
multiple times, particularly on servers. However, that's not a
plausible model with client-side transformation, which is one of the
growth areas of XSLT.
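
The two usage models can be sketched as follows. This is only an
illustration: StylesheetCompiler, transform_cached and transform_fresh
are hypothetical names, and "compiling" here just builds a function
and counts how often that happened.

```python
from typing import Callable, Dict


class StylesheetCompiler:
    """Hypothetical stand-in for an XSLT compiler that counts its work."""

    def __init__(self) -> None:
        self.compilations = 0
        self._cache: Dict[str, Callable[[str], str]] = {}

    def compile(self, stylesheet: str) -> Callable[[str], str]:
        # A real processor would parse and analyse the stylesheet here;
        # this sketch only records that a compilation took place.
        self.compilations += 1
        label = stylesheet.strip()
        return lambda document: f"{label}:{document}"

    def transform_cached(self, stylesheet: str, document: str) -> str:
        """Server-style use: compile once per stylesheet, then reuse."""
        if stylesheet not in self._cache:
            self._cache[stylesheet] = self.compile(stylesheet)
        return self._cache[stylesheet](document)

    def transform_fresh(self, stylesheet: str, document: str) -> str:
        """Client-style use: compile every time the stylesheet is applied."""
        return self.compile(stylesheet)(document)
```

With transform_cached, any analysis cost is paid once however many
documents are transformed; with transform_fresh it is paid on every
transformation.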

> Dynamic typing has a more direct influence on run time performance.
> For any system that actually stores typed data, including relational
> databases and those native XML databases that use XML Schema,
> representing typed data directly in the XML view is probably less
> overhead at run time. If you are going to use data with typed
> operations such as arithmetic operations, then an untyped approach
> means you need a run-time cast, which is not needed if your data is
> typed. A system that knows the datatypes associated with data can
> use that information to optimize the way it is stored and used.

There's a distinction, in my mind at least, between optimisation based
on simple types and optimisation based on complex types, and more
subtly between optimisation based on built-in types and optimisation
based on user-defined types.

I can perfectly well see the argument for optimisation based on
built-in simple types. If you look at the implementations of XPath,
Expression objects usually have "evaluateAsBoolean" methods, for
example, which enable you to make shortcuts such as not retrieving an
entire node set when you only need to check whether it contains at
least one node. That's great, and I wouldn't do without the speed
increases it brings.
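
A minimal sketch of that kind of shortcut, assuming nodes are produced
lazily. PathExpression and counting_nodes are hypothetical names, not
the API of any particular XPath processor; evaluate_as_boolean only
mirrors the "evaluateAsBoolean" idea.

```python
from typing import Callable, Iterable, Iterator, List


class PathExpression:
    """Hypothetical expression whose nodes are produced lazily."""

    def __init__(self, select: Callable[[], Iterable[str]]) -> None:
        self._select = select

    def evaluate(self) -> List[str]:
        # General case: materialise the whole node set.
        return list(self._select())

    def evaluate_as_boolean(self) -> bool:
        # Boolean context: a node set is true if it is non-empty,
        # so stop as soon as the first node turns up.
        for _ in self._select():
            return True
        return False


def counting_nodes(names: List[str], visited: List[str]) -> Iterator[str]:
    """Yield node names one by one, recording each node actually visited."""
    for name in names:
        visited.append(name)
        yield name
```

Running evaluate_as_boolean over counting_nodes visits only the first
node, whereas evaluate visits all of them.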

But from what I can see, the same kind of operation on user-defined
types, particularly on complex types, is going to be a lot harder. I'm
not an implementer, but I imagine it would take a lot of work during
compilation to create classes for different kinds of elements so that
you can take advantage of their particular features, such as testing
whether an element can have a particular child before trying to
retrieve it. The reason it's worthwhile doing this for the built-in
types is precisely because they're built in.
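
As a rough sketch of what "testing whether an element can have a
particular child" might look like: the content-model table below
stands in for the result of schema analysis, and Element and
SchemaAwareSelector are hypothetical names. It also assumes the
document is valid against the schema, otherwise the shortcut would
give wrong answers.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Element:
    name: str
    children: List["Element"] = field(default_factory=list)


class SchemaAwareSelector:
    """Hypothetical child selector that consults a content model first."""

    def __init__(self, allowed_children: Dict[str, FrozenSet[str]]) -> None:
        # Maps an element name to the child names its type permits;
        # building this table is the compile-time (analysis) cost.
        self._allowed = allowed_children
        self.scans = 0  # how many times the children were actually walked

    def select(self, element: Element, child_name: str) -> List[Element]:
        allowed = self._allowed.get(element.name)
        if allowed is not None and child_name not in allowed:
            # The schema rules this child out, so skip the scan entirely.
            return []
        self.scans += 1
        return [c for c in element.children if c.name == child_name]
```

Elements with no entry in the table fall back to the generic scan,
which is what an untyped processor does for every element.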

I think it comes down to what advantage it gives you to treat a 'foo'
element as a 'foo' element rather than a generic element node, and
whether that advantage is worth the cost of schema analysis and the
extra time and memory it takes to have fooElementNode objects, bearing
in mind that with XSLT the compile-time cost is usually paid on every
run, so it has to be offset directly by the run-time saving.

If it ends up being roughly equal, or if the analysis time is greater
than the time you save due to optimisation, which is what I suspect,
then the question is what's the point of the strong typing for complex
types? Especially as there are lots of *disadvantages*, such as the
added complexity in the processors and in the spec to deal with all
the different kinds of casting and validating of complex types.
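
The trade-off can be put as simple arithmetic. In this sketch,
break_even_runs and pays_off are hypothetical helpers and the costs
are in arbitrary units; nothing here is a measurement of any real
processor.

```python
import math
from typing import Optional


def break_even_runs(analysis_cost: float, saving_per_run: float) -> Optional[int]:
    """Number of runs after which analysis has paid for itself.

    Returns None when the optimised code saves nothing per run, in
    which case the analysis cost is never recovered.
    """
    if saving_per_run <= 0:
        return None
    return math.ceil(analysis_cost / saving_per_run)


def pays_off(analysis_cost: float, saving_per_run: float, runs: int) -> bool:
    """True if the total saving over `runs` runs covers the analysis cost."""
    return saving_per_run * runs >= analysis_cost
```

For a stylesheet compiled afresh on each use, runs is 1, so the
analysis only pays off if the saving on that single run is at least as
large as the analysis cost itself.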



Jeni Tennison



Copyright 2001 XML.org. This site is hosted by OASIS