> > I wonder what XPath expressions would cause an "XPath
> > processor" to sweat?
> > Any "XPath processor" developer out there?
>
> Howdy, I'm Bob, one of the developers of Jaxen (http://jaxen.org/), a
> 'universal' XPath engine for Java.
>
> The biggest issue we've had (it's Today's Big Issue) is regarding
> document-ordering when using a union expression.
>
> $foo/bar | $cheese/melty
>
> XPath says that nodes should be returned in 'document order', which
> becomes non-trivial in the case of some expressions involving
> the union operator.
That depends on your data structures, of course. Saxon's native tree
structures (both of them) are optimised for this operation. The standard
tree structure stores a serial number in each node; the "tinytree" (which is
now the default) stores nodes in an array, in document order. Both
structures exploit the fact that the tree is immutable, and both make
sorting into document order trivial.
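
To illustrate why a per-node serial number makes the sort trivial, here is a
minimal sketch. It is not Saxon's actual code: the Node class, its serial
field and the sortIntoDocumentOrder method are hypothetical names invented
for this example. It assumes the serial number is assigned once, in document
order, while the immutable tree is built.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DocOrderSketch {
    /** Hypothetical immutable node; 'serial' is its position in document order. */
    static final class Node {
        final String name;
        final int serial;

        Node(String name, int serial) {
            this.name = name;
            this.serial = serial;
        }
    }

    /** Returns a new list holding the same nodes, sorted into document order. */
    static List<Node> sortIntoDocumentOrder(List<Node> nodes) {
        List<Node> sorted = new ArrayList<>(nodes);
        // Because the tree never changes, comparing serial numbers is enough.
        sorted.sort(Comparator.comparingInt((Node n) -> n.serial));
        return sorted;
    }
}
```

With a mutable tree such as JDOM there is no stored number to compare, so
each comparison has to work out the relative position of the two nodes by
walking the tree.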
Saxon also has a driver allowing access to JDOM trees. With this data
structure, sorting into document order is indeed painful, though if you
optimise for common cases, such as all the nodes being siblings, it's not
too bad. Saxon of course goes to great lengths to avoid a sort when it
knows the nodes are already in document order, as they often will be, for
example with path expressions such as chapter/section/@title. A union is
done as a merge operation on its sorted operands.
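
A union done as a merge can be sketched roughly as follows. This is an
illustration only, not Saxon's implementation: the Node class, the serial
field and the union method are hypothetical. It assumes both operands are
already in document order, contain no duplicates, and that equal serial
numbers mean the same node.

```java
import java.util.ArrayList;
import java.util.List;

class UnionMergeSketch {
    /** Hypothetical node; 'serial' is its position in document order. */
    static final class Node {
        final String name;
        final int serial;

        Node(String name, int serial) {
            this.name = name;
            this.serial = serial;
        }
    }

    /** Merges two operands, each already in document order, into their union. */
    static List<Node> union(List<Node> left, List<Node> right) {
        List<Node> result = new ArrayList<>(left.size() + right.size());
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            Node a = left.get(i);
            Node b = right.get(j);
            if (a.serial < b.serial) {
                result.add(a);
                i++;
            } else if (a.serial > b.serial) {
                result.add(b);
                j++;
            } else {
                // Same node selected by both operands: keep a single copy.
                result.add(a);
                i++;
                j++;
            }
        }
        // At most one operand still has nodes left; append them unchanged.
        while (i < left.size()) {
            result.add(left.get(i++));
        }
        while (j < right.size()) {
            result.add(right.get(j++));
        }
        return result;
    }
}
```

The merge visits each node once, so its cost is linear in the combined size
of the operands, and the result is in document order without a separate
sort.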
And a point of detail, which I think Evan Lenz already commented on: XPath
1.0 doesn't say the nodes must be sorted in document order, and there are
many cases where sorting is unnecessary. For example, many operations only
require selection of the node that is first in document order, which can be
found without doing a sort.
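
As a rough sketch of that last point, the first node in document order can
be found with a single pass over an unsorted selection. The Node class, the
serial field and the firstInDocumentOrder method are again hypothetical
names for this example, and the serial number is assumed to reflect document
order.

```java
import java.util.List;

class FirstNodeSketch {
    /** Hypothetical node; 'serial' is its position in document order. */
    static final class Node {
        final String name;
        final int serial;

        Node(String name, int serial) {
            this.name = name;
            this.serial = serial;
        }
    }

    /** Returns the node that is first in document order, or null if there is none. */
    static Node firstInDocumentOrder(List<Node> nodes) {
        Node first = null;
        for (Node n : nodes) {
            // Keep the smallest serial number seen so far; no sort is needed.
            if (first == null || n.serial < first.serial) {
                first = n;
            }
        }
        return first;
    }
}
```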
Mike Kay