OASIS Mailing List Archives

Re: Why not reinvent the wheel?




Jonathan Robie wrote:
> Charles Reitzel wrote:
> >I am not buying the optimization argument either.  By exposing random access
> >XPath (pointed out by Joe English), there is effectively no difference
> >between them.
>
> Random access XPath? I think that Joe English pointed out that we allow the
> parent operator in XPath. I think that's a long ways from saying there is
> no difference between XSLT and XQuery for query optimization.

I wouldn't go that far either; the structure of XQuery does look
to me like it's more amenable to optimization than XSLT.
But the '..' axis is definitely problematic.

XSLT is widely regarded as unsuitable for processing gigantic
documents because (as common wisdom has it) it's necessary to
keep the entire document in memory, or use a (much slower) 
database-backed persistent DOM.  Whether this is true in fact 
is an open research question, but in practice nobody has yet come 
up with any other way to effectively implement XSLT, at least as 
far as I know.

Except for the '..' and '->' operators, XQuery constructs can only look
"down the tree".  This property would make it _much_ easier to
statically determine how much of the document needs to be kept
in memory.  In fact, with lazy evaluation and a bit of care,
you can obtain good space usage almost for free (see, for example, HaXml).
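
To illustrate why "down the tree" queries are friendly to memory use,
here is a rough sketch in Python (not XQuery, and not how HaXml does it).
It uses the standard library's xml.etree.ElementTree.iterparse; the
function name stream_titles and the <lib>/<book>/<title> document are
made up for the example.  The query only ever looks inside the current
<book> subtree, so each subtree can be emptied as soon as it has been
processed.

```python
import io
import xml.etree.ElementTree as ET


def stream_titles(source):
    """Yield the <title> text of each <book>, one book at a time."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            # Downward-only query: everything it needs is in this subtree.
            yield elem.findtext("title")
            # No later step can look up or sideways into this subtree,
            # so its contents can be dropped.
            elem.clear()


# Hypothetical input document, just for illustration.
doc = io.BytesIO(
    b"<lib><book><title>A</title></book><book><title>B</title></book></lib>"
)
titles = list(stream_titles(doc))
```

Note that the emptied <book> shells still hang off the root element, so
this is only an approximation of constant-space processing.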

But with '..', an end-user can define all the other XPath axes:

	-- modulo typos and type errors...
	FUNCTION ancestor(ELEMENT $e) RETURNS LIST(ELEMENT)
	{
		$e/.. UNION ancestor($e/..)
	}
	FUNCTION following-sibling(ELEMENT $e) RETURNS LIST(ELEMENT)
	{
		$e/../* AFTER $e
	}
	FUNCTION preceding-sibling(ELEMENT $e) RETURNS LIST(ELEMENT)
	{
		$e/../* BEFORE $e
	}
	FUNCTION following(ELEMENT $e) RETURNS LIST(ELEMENT)
	{
		following-sibling(ancestor($e))//*
	}
	FUNCTION preceding(ELEMENT $e) RETURNS LIST(ELEMENT)
	{
		preceding-sibling(ancestor($e))//*
	}

and we're back to the same situation as with XSLT: the implementation
has to provide unrestricted, random access to the entire input
document _just in case_ a user query calls one of these functions
or their equivalents.
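
The same construction can be sketched in runnable form.  This is Python
rather than XQuery, with a hypothetical Node class that offers nothing
but a parent pointer and a child list.  Unlike the pseudocode above, it
climbs from the node itself as well as its ancestors and includes each
sibling along with its descendants, which I believe is what the XPath
axes actually require.

```python
class Node:
    """Minimal tree node: a name, a parent pointer, and ordered children."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def ancestors(e):
    """Repeated application of '..', nearest ancestor first."""
    out = []
    p = e.parent
    while p is not None:
        out.append(p)
        p = p.parent
    return out


def following_siblings(e):
    if e.parent is None:
        return []
    sibs = e.parent.children
    return sibs[sibs.index(e) + 1:]


def preceding_siblings(e):
    if e.parent is None:
        return []
    sibs = e.parent.children
    return sibs[:sibs.index(e)]


def subtree(e):
    """The node and all its descendants, in document order."""
    out = [e]
    for child in e.children:
        out.extend(subtree(child))
    return out


def following(e):
    """Everything after e in document order, excluding its descendants."""
    out = []
    for a in [e] + ancestors(e):
        for s in following_siblings(a):
            out.extend(subtree(s))
    return out


def preceding(e):
    """Everything before e in document order, excluding its ancestors."""
    out = []
    for a in reversed([e] + ancestors(e)):
        for s in preceding_siblings(a):
            out.extend(subtree(s))
    return out


# Hypothetical example tree: root(a(a1, a2), b(b1, b2), c)
b1 = Node("b1")
root = Node("root", [
    Node("a", [Node("a1"), Node("a2")]),
    Node("b", [b1, Node("b2")]),
    Node("c"),
])
```

Starting from b1, every other node in the tree is reachable through one
of these functions, which is exactly why the implementation cannot
discard anything ahead of time.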

(The '->' operator / 'id()' function presents a similar problem,
but this looks a bit more tractable to me than full XPath.)


--Joe English

  jenglish@flightlab.com