Re: We need an XPath API
- From: Charles Reitzel <creitzel@mediaone.net>
- To: xml-dev@lists.xml.org
- Date: Wed, 07 Mar 2001 22:49:48 -0500
Thanks for the detailed response. Comments below.
take it easy,
Charles Reitzel
P.S. Just one copy now. I have switched from the digest subscription, which
made the occasional direct copy convenient, but I don't want to spam anyone.
At 09:33 PM 3/5/01 +0100, Robin Berjon wrote:
>First and foremost, Charles, thank you very much for your summary, I
>believe it to be very useful.
Good to hear. Thanks for the encouragement.
>...
>Note though that I am *not* interested in low-level callbacks but
>rather in a factory interface so that I could build custom XPath
>objects.
Thanks for clearing that up. Then we are agreed: no callbacks are
necessary for XPath expression parsing. I like expression factories well
enough. In particular, they can let an application build a generic
expression and emit either the full or abbreviated syntax.
>There are several things that an XPath interface could provide.
>Mostly 1) a way to look into and manipulate an XPath expression
>for various purposes and 2) a way to query a DOM relative to a
>given XPath.
>
>Two interesting methods I can think of for 2) are
>$xpath->select_nodes($current_node) and $xpath->matches($node)
>(I'm using Perl style syntax for my examples, but I think it
>should be understandable). The above would presumably work on
>any DOM Node. However, I'd like to see a factory interface in
>order to be able to create XPath objects optimized for different
>uses, for different tree models (potentially non-DOM), etc...
I understand the use case. I agree the expression needs to be separate
from the DOM-specific handling. That is the idea behind an XPathDOMHelper
interface: it would use XPathExpr and DOM Document/Element instances to
locate nodes and/or test for a match.
This division of labor allows the expression level interface to be reused
in an XPathSAXHelper without imposing DOM dependencies.
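To make the split concrete, here is a rough Java sketch. DescendantNameExpr and DomHelperSketch are hypothetical stand-ins for XPathExpr and XPathDOMHelper, not an existing API, and the expression side only models "descendant::<name>" and ignores namespaces. The only real types are the org.w3c.dom and javax.xml.parsers ones.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Expression side: uses no DOM types. Hypothetical stand-in for XPathExpr
// that only models "descendant::<name>".
final class DescendantNameExpr {
    private final String name;
    DescendantNameExpr(String name) { this.name = name; }
    String name() { return name; }
    String toXPath() { return "descendant::" + name; }
}

// DOM side: the only class here that uses org.w3c.dom types.
// Hypothetical stand-in for XPathDOMHelper.
final class DomHelperSketch {
    // Rough equivalent of select_nodes: descendant elements of the context
    // element with the expression's name, in document order.
    static List<Element> selectNodes(DescendantNameExpr expr, Element context) {
        NodeList hits = context.getElementsByTagName(expr.name());
        List<Element> out = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            out.add((Element) hits.item(i));
        }
        return out;
    }

    // Rough equivalent of matches: is this node an element with that name?
    static boolean matches(DescendantNameExpr expr, Node node) {
        return node.getNodeType() == Node.ELEMENT_NODE
            && expr.name().equals(node.getNodeName());
    }

    // Convenience for trying the sketch; wraps the checked JAXP exception.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A SAX-side helper could take the same DescendantNameExpr and never touch the DOM classes, which is the point of keeping the two apart.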
>... Using XPath for CSS selectors was brought up on www-style
>and rejected, for (imho) good reasons. The best reason is obviously
>that it'd break compatibility across CSS versions,
>but that's of little interest here. ...
>
>The extent of the overlap that I believe would be useful to
>investigate in SAC is its nice Selector/Condition model. That can
>imho be directly applied to XPath as Path/Predicate.
Agree.
>I'd go against using the interfaces directly. I started drafting
>an XPath mapping of SAC yesterday just to see how far it goes
>and a number of things came out differently.
You've convinced me.
In any case, the existence of a decent XPath expression interface would be
useful in building an XPath <-> CSS2 translator. But we needn't include it
in our current scope.
>I'd like to be able to construct a selector object totally
>abstracted from the selector syntax. I can do that with
>SAC's factories and I'd like to be able to do the same with
>the XPath interface, which is why I'm interested
>in factories.
I wouldn't oversell expression factories. They can help to bridge syntax
variations (e.g. abbreviated vs. full XPath), but you are still dependent
on the underlying features. A successful example of this "pattern" is Rogue
Wave's db.h++, which has interfaces for generating SQL expressions and
concrete implementations for each of the major SQL vendors. It works well.
I think we need to support non-factory based construction as well:
XPathParserFactory xpFact = new JoeBobsXPathFactoree();
// a) Parse text
XPathExpr parsedExpr = xpFact.createExpression( "//para" );
// b) Selector factory: build the same expression part by part
XPathExpr builtExpr = xpFact.createExpression();
builtExpr.append( xpFact.createDescendantOrSelfSelector() );
builtExpr.append( xpFact.createElementSelector( "para" ) );
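For what it's worth, here is a minimal, self-contained sketch of how a part-by-part expression could emit either syntax. Step, DescendantOrSelfStep, ChildElementStep and PathExpr are made-up names for illustration, and the sketch only handles absolute location paths.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these types come from a real library.
interface Step {
    String full();        // unabbreviated XPath 1.0 syntax
    String abbreviated(); // abbreviated syntax
}

final class DescendantOrSelfStep implements Step {
    public String full() { return "descendant-or-self::node()"; }
    // In abbreviated syntax this step is the empty text between the two
    // slashes of "//".
    public String abbreviated() { return ""; }
}

final class ChildElementStep implements Step {
    private final String name;
    ChildElementStep(String name) { this.name = name; }
    public String full() { return "child::" + name; }
    public String abbreviated() { return name; }
}

// An absolute location path: every step is preceded by "/".
final class PathExpr {
    private final List<Step> steps = new ArrayList<>();

    PathExpr append(Step step) {
        steps.add(step);
        return this;
    }

    String toFullSyntax() {
        StringBuilder sb = new StringBuilder();
        for (Step s : steps) sb.append('/').append(s.full());
        return sb.toString();
    }

    String toAbbreviatedSyntax() {
        StringBuilder sb = new StringBuilder();
        for (Step s : steps) sb.append('/').append(s.abbreviated());
        return sb.toString();
    }
}
```

Appending a DescendantOrSelfStep and then a ChildElementStep("para") gives "//para" in abbreviated form and "/descendant-or-self::node()/child::para" in full form, which is the expansion XPath 1.0 defines for "//".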
>Also, I think a factory approach would make sense
>considering that XPath is moving towards being able to
>use the PSVI. Perhaps other people would also like to see
>XPath have the capacity to select against all sorts of other
>infosets. To do so using factories, they would only need
>to define new XPath tokens that are grammatically valid,
>a few factory classes, and voila! an extended XPath that
>can deal specifically with their infoset. I believe that
>would make XPath interestingly extensible.
Yes, we need to strategize how to make an extensible base. The approach
you describe works well enough for *generating* a text representation. But
it doesn't help, as is, for parsing an existing text representation.
I think the XPath tokenization rules give all-but-explicit guidance
here. We'll need some kind of a registry that allows custom function
factories. We also need to support variable declaration. So, it doesn't
seem far-fetched to plug in other types of factories as well. The parser
can let registered extension factories attempt to instantiate expression
parts from unknown tokens. If no one claims a token, a parse error
results. This is just off the top of my head, mind you. But some behavior
like this seems necessary.
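A rough sketch of that behavior, in Java and with entirely hypothetical names (ExprPart, ExtensionFactory, ExtensionRegistry, PrefixFactory, XPathParseException). It assumes the parser has already decided that a token is unknown and simply offers it to each registered factory in registration order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; the type names are illustrative, not an existing API.
interface ExprPart {
    String describe();
}

interface ExtensionFactory {
    // Return a part if this factory recognizes the token, otherwise empty.
    Optional<ExprPart> tryCreate(String token);
}

// Unchecked for brevity; a real API might prefer a checked exception.
final class XPathParseException extends RuntimeException {
    XPathParseException(String message) { super(message); }
}

final class ExtensionRegistry {
    private final List<ExtensionFactory> factories = new ArrayList<>();

    void register(ExtensionFactory factory) {
        factories.add(factory);
    }

    // Offer the unknown token to each factory in registration order.
    // The first factory to claim it wins; if nobody does, parsing fails.
    ExprPart resolve(String token) {
        for (ExtensionFactory f : factories) {
            Optional<ExprPart> part = f.tryCreate(token);
            if (part.isPresent()) return part.get();
        }
        throw new XPathParseException("Unrecognized token: " + token);
    }
}

// Example extension: claims any token that starts with a given prefix.
final class PrefixFactory implements ExtensionFactory {
    private final String prefix;
    PrefixFactory(String prefix) { this.prefix = prefix; }

    public Optional<ExprPart> tryCreate(String token) {
        if (!token.startsWith(prefix)) return Optional.empty();
        ExprPart part = () -> token; // the part just remembers its token
        return Optional.of(part);
    }
}
```

First-come-first-served is only one possible policy. Whether two factories may claim the same token, and who wins, is something the registry design would have to settle.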
>>I don't understand yet how an XPath expression can point to something
>>"not neatly aligned on DOM Node boundaries".
>
>For instance, in DOM you can have consecutive text nodes. This usually
>happens as a result of XML parsers returning text in several chunks. XPath
>has no notion of that, if it selects the text() inside an element, I guess
>that what is expected is that it returns a single text node.
>
>Chances are, that text node isn't in the DOM (though its content is). For
>people that want to use XPath to navigate (and edit) DOM documents, that
>can be a serious caveat, because in effect the returned text node has
>serious chances of being read only (well, you can write to it, but it
>won't modify the tree).
Dumb idea: for the DOM helper, we can provide two calls: one that returns
read-only data according to the XPath data model and another that returns
live DOM nodes. The XPath data model processing is, necessarily, a wrapper
around the DOM level stuff. So, all this means is that you expose the
intermediate results.
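To illustrate the two calls, here is a small Java sketch limited to the text children of one element. TextSelection and its method names are made up; only the org.w3c.dom and javax.xml.parsers types are real. The XPath-model view merges each run of adjacent DOM Text nodes into one value, while the live view hands back the Text nodes themselves.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical helper; the class and method names are illustrative only.
final class TextSelection {
    // Live view: the actual DOM Text children, editable in place.
    static List<Text> liveTextNodes(Element parent) {
        List<Text> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Text) out.add((Text) n);
        }
        return out;
    }

    // XPath-model view: each run of adjacent Text nodes becomes one
    // string, returned in a read-only list.
    static List<String> xpathTextValues(Element parent) {
        List<String> out = new ArrayList<>();
        StringBuilder run = null;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Text) {
                if (run == null) run = new StringBuilder();
                run.append(((Text) n).getData());
            } else if (run != null) {
                out.add(run.toString());
                run = null;
            }
        }
        if (run != null) out.add(run.toString());
        return Collections.unmodifiableList(out);
    }

    // Convenience for trying the sketch; wraps the checked JAXP exception.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The merged view is computed from the live one, so exposing both costs little. Only edits made through the live nodes change the tree.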
I don't think any of this matters for the SAX Helper as the data just gets
handed to the app based on registered handlers (or implemented overrides).
>However, for the sake of memory consumption (as well as other
>considerations) one could return a NodeIterator or a TreeWalker as defined
>in DOM2 (http://www.w3.org/TR/DOM-Level-2-Traversal-Range/traversal.html).
Looks right.
>I'm not personally interested in an XML syntax for XPath...
The more I think about it, with decent expression part traversal and
factories, the XML emit/parse should be split into a separate concrete
class. I continue to think it is low priority. Finally, once things are
working, a decent XML representation should be apparent. I'm not getting a
clear picture yet.
>>5) Need XPointer Support
>>
>Yes I think XPointer doesn't need to be built in, but it should be kept in
>mind that the XPath API must play nice with other such needs.
>
>Which reminds me, the API needs a way to add functions (if only for XSLT).
>I think part of the recent discussion about xs:script and notably the
>xbind proposal would be useful in this context (it would be useful to
>be able to add functions easily, and to declare their signatures).
If we define a decent registry interface, it should be possible to read in
xbind data (deployment descriptor?) and register the declared functions.
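As a starting point for discussion, a function registry with declared signatures might look something like the Java sketch below. FunctionRegistry is a hypothetical name, and the "signature" is reduced to an argument count. Argument and return types, and loading the declarations from xbind data, are left out because I don't know yet what that data would look like.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; not an existing API.
final class FunctionRegistry {
    // A registered function: its declared argument count and its body.
    private static final class Entry {
        final int arity;
        final Function<List<Object>, Object> body;

        Entry(int arity, Function<List<Object>, Object> body) {
            this.arity = arity;
            this.body = body;
        }
    }

    private final Map<String, Entry> functions = new HashMap<>();

    // Declare a function by name with a fixed number of arguments.
    void register(String name, int arity, Function<List<Object>, Object> body) {
        functions.put(name, new Entry(arity, body));
    }

    // Look the function up, check the call against the declared
    // signature, then invoke it.
    Object call(String name, List<Object> args) {
        Entry entry = functions.get(name);
        if (entry == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        if (args.size() != entry.arity) {
            throw new IllegalArgumentException(
                name + " expects " + entry.arity + " argument(s), got " + args.size());
        }
        return entry.body.apply(args);
    }
}
```

With the signature declared at registration time, a bad call can be rejected when the expression is parsed rather than when it is evaluated.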