RE: [xml-dev] I have implemented SAX based XPath Engine
- From: "Michael Kay" <mike@saxonica.com>
- To: "'Santhosh T'" <santhosh.tekuri@gmail.com>
- Date: Fri, 20 Feb 2009 14:15:06 -0000
I hope you don't mind me asking some more questions...
MK> I note that you support multiple downward selections in a predicate, for
example
/root/pub[book/name and book/author]//book
MK> I would be interested to know whether you do this in a pure streaming
way, or whether you build an in-memory tree for any element that has such a
predicate. (Saxon's streamable subset of XPath currently doesn't allow
multiple downward selections).
ST> Yes. It is done in a pure streaming way. No in-memory tree is created
for any element; otherwise it would defeat the main purpose of XMLDog.
ST> In this example, when startElement is called for pub, I capture the path
of the element, say /root[1]/pub[5]. I call this delayed evaluation: the
result still needs to be qualified by the outcome of its predicate. The
predicate [book/name and book/author] will have been evaluated by the time I
get the endElement of pub, so when that endElement is notified, I know
whether "/root[1]/pub[5]" has passed the predicate or not.
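To make the delayed-evaluation idea concrete, here is a minimal Python sketch using the standard xml.sax module. It is not XMLDog's code. The class name DelayedPubHandler and the function evaluate are hypothetical, and the sketch assumes that pub elements occur only as direct children of a root element named root. It records the path of each pub at its start tag and settles the predicate [book/name and book/author] at its end tag, without building a tree.

```python
import xml.sax


class DelayedPubHandler(xml.sax.ContentHandler):
    """Settles pub[book/name and book/author] at the end tag of each pub.

    Assumption (not from the thread): pub occurs only as a child of <root>.
    """

    def __init__(self):
        super().__init__()
        self.stack = []       # names of the currently open elements
        self.pub_count = 0    # position of the current pub among root's pubs
        self.pending = None   # path of the pub still awaiting its verdict
        self.seen = set()     # which of name/author were seen under a book
        self.results = {}     # pub path -> outcome of the predicate

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == ["root", "pub"]:
            # Remember the path now; the verdict comes later.
            self.pub_count += 1
            self.pending = "/root[1]/pub[%d]" % self.pub_count
            self.seen = set()
        elif len(self.stack) == 4 and self.stack[:3] == ["root", "pub", "book"]:
            if name in ("name", "author"):
                self.seen.add(name)

    def endElement(self, name):
        if self.stack == ["root", "pub"]:
            # Everything the predicate can see has gone past by now.
            self.results[self.pending] = self.seen == {"name", "author"}
            self.pending = None
        self.stack.pop()


def evaluate(xml_text):
    """Parse xml_text once and return the verdict for every pub path."""
    handler = DelayedPubHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.results
```

In a full engine, the //book hits found inside a pub would presumably be held as pending results keyed by that path and released or discarded once the verdict is known; the sketch only records the verdicts.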
MK> OK. So let's change the query to
//book[author = editor]/price
Which of course is true if any of the book's authors has the same name as
one of its editors. You don't know the order or cardinality of the children
author, editor, and price, so I assume you are "remembering" all the author,
editor, and price children until you hit the end tag of a book; you're then
evaluating the predicate, and if it's true, you output all the price
children?
How much do you "remember" about the price children? There's a question mark
here because you don't really know what information the user wants about the
price elements: they might want the string value, or the attributes, or
perhaps the children,...
For author and editor, I guess the minimum that needs to be remembered is
the string-value of each author and editor child, and you can claim to be
"pure streaming" if the only memory you allocate is enough to hold these
values?
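The buffering described above can be sketched in Python with the standard xml.sax module. This is only an illustration of the guess made in this message, not a description of XMLDog. The names BookPriceHandler and matching_prices are hypothetical, book elements are assumed not to nest, and only the string-value of each price is kept, which is exactly the open question about how much needs to be remembered.

```python
import xml.sax


class BookPriceHandler(xml.sax.ContentHandler):
    """One-pass sketch of //book[author = editor]/price (string-values only).

    Assumption (not from the thread): book elements do not nest.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.book = None    # buffers for the currently open book, if any
        self.child = None   # name of the author/editor/price being read
        self.text = []      # text pieces of that child
        self.prices = []    # string-values of the selected price elements

    def startElement(self, name, attrs):
        self.depth += 1
        if self.book is None:
            if name == "book":
                self.book = {"depth": self.depth,
                             "author": [], "editor": [], "price": []}
        elif (self.child is None
              and self.depth == self.book["depth"] + 1
              and name in ("author", "editor", "price")):
            self.child = name
            self.text = []

    def characters(self, content):
        if self.child is not None:
            self.text.append(content)

    def endElement(self, name):
        if self.book is not None:
            if self.child is not None and self.depth == self.book["depth"] + 1:
                # End of a remembered child: store its string-value.
                self.book[self.child].append("".join(self.text))
                self.child = None
            elif self.depth == self.book["depth"]:
                # End of the book: the general comparison can now be decided.
                if set(self.book["author"]) & set(self.book["editor"]):
                    self.prices.extend(self.book["price"])
                self.book = None
        self.depth -= 1


def matching_prices(xml_text):
    """Return the string-values of price children of qualifying books."""
    handler = BookPriceHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.prices
```

The memory held at any moment is bounded by the author, editor, and price values of a single book, which is the sense in which such an approach might still be called streaming.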
ST> XMLDog also supports absolute paths in predicates. For example:
/*/fibonacci[ count(/*/fibonacci) - 1 ]
MK> Does that involve more than one scan/parse of the input file? If not,
how is it done?
MK> You've got an example that does *[last()]. Do you allow *[last() - 1]?
Is this done with a pure streaming approach?
ST> Yes, it supports [last()-1] as well.
MK> How do you do this? With multiple passes over the data? Or with a
lookahead buffer?
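For reference, one single-pass possibility is a bounded buffer holding the two most recent siblings, resolved when the parent's end tag arrives. The Python sketch below illustrates that option for /*/*[last() - 1] only; whether XMLDog works this way is exactly what is being asked. The names SecondToLastHandler and second_to_last are hypothetical, and the selected element is reported just by position and name.

```python
import xml.sax
from collections import deque


class SecondToLastHandler(xml.sax.ContentHandler):
    """One-pass sketch of /*/*[last() - 1], reported as (position, name)."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.position = 0
        self.window = deque(maxlen=2)  # the two most recent children of the root
        self.result = None

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth == 2:
            # A new child of the root pushes the oldest entry out of the window.
            self.position += 1
            self.window.append((self.position, name))

    def endElement(self, name):
        if self.depth == 1 and len(self.window) == 2:
            # Root is closing, so last() is known: the older entry is last() - 1.
            self.result = self.window[0]
        self.depth -= 1


def second_to_last(xml_text):
    """Return (position, name) of the root's second-to-last child, or None."""
    handler = SecondToLastHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.result
```

The cost of this approach is that the answer is only known at the parent's end tag, and returning anything beyond the position and name would mean buffering the content of two candidate elements.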
Thanks!
Michael Kay
http://www.saxonica.com/