Re: We need an XPath API



Thanks for the detailed response.  Comments below.

take it easy,
Charles Reitzel

P.S. Just one copy now.  I have switched away from the digest subscription, 
under which an occasional direct copy was convenient, and I don't want to 
spam anyone.


At 09:33 PM 3/5/01 +0100, Robin Berjon wrote:
 >First and foremost, Charles, thank you very much for your summary, I
 >believe it to be very useful.

Good to hear.  Thanks for the encouragement.

 >...
 >Note though that I am *not* interested in low-level callbacks but
 >rather in a factory interface so that I could build custom XPath
 >objects.

Thanks for clearing that up.  Then we are agreed: no callbacks are 
necessary for XPath expression parsing.  I like expression factories well 
enough.  In particular, they can let an application build a generic 
expression and emit either the full or abbreviated syntax.


 >There are several things that an XPath interface could provide.
 >Mostly 1) a way to look into and manipulate an XPath expression
 >for various purposes and 2) a way to query a DOM relative to a
 >given XPath.
 >
 >Two interesting methods I can think of for 2) are
 >$xpath->select_nodes($current_node) and $xpath->matches($node)
 >(I'm using Perl style syntax for my examples, but I think it
 >should be understandable).  The above would presumably work on
 >any DOM Node. However, I'd like to see a factory interface in
 >order to be able to create XPath objects optimized for different
 >uses, for different tree models (potentially non-DOM), etc...

I understand the use case.  I agree the expression needs to be separate 
from the DOM-specific handling.  Hence the notion of an XPathDOMHelper 
interface that uses XPathExpr and DOM Document/Element instances to 
locate nodes and/or test for a match.

This division of labor allows the expression level interface to be reused 
in an XPathSAXHelper without imposing DOM dependencies.
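
A minimal Java sketch of that division of labor.  XPathExpr and 
XPathDOMHelper are the names used above, but their shape here is only an 
assumption.  To keep the sketch runnable, evaluation is delegated to the 
JDK's javax.xml.xpath package, which postdates this thread; it is a 
stand-in, not the API under discussion.  The matches() test tries the node 
and each of its ancestors as context, in the style of XSLT pattern matching.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical: holds only the expression, with no knowledge of any tree model.
final class XPathExpr {
    private final String text;
    XPathExpr(String text) { this.text = text; }
    String text() { return text; }
}

// Hypothetical: all DOM-specific work lives here, not in XPathExpr.
final class XPathDOMHelper {
    // Evaluate expr with the given node as context and collect the selected nodes.
    List<Node> selectNodes(XPathExpr expr, Node context) throws XPathExpressionException {
        NodeList found = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr.text(), context, XPathConstants.NODESET);
        List<Node> result = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            result.add(found.item(i));
        }
        return result;
    }

    // True if some context (the candidate or one of its ancestors) selects the candidate.
    boolean matches(XPathExpr expr, Node candidate) throws XPathExpressionException {
        for (Node ctx = candidate; ctx != null; ctx = ctx.getParentNode()) {
            for (Node n : selectNodes(expr, ctx)) {
                if (n.isSameNode(candidate)) {
                    return true;
                }
            }
        }
        return false;
    }
}

final class XPathDomDemo {
    static final String XML = "<doc><para>a</para><sec><para>b</para></sec></doc>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Number of nodes that expr selects from the document node of XML.
    static int count(String expr) {
        try {
            return new XPathDOMHelper().selectNodes(new XPathExpr(expr), parse(XML)).size();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Does expr match the index-th <para> element (in document order)?
    static boolean matchesPara(String expr, int index) {
        try {
            XPathDOMHelper helper = new XPathDOMHelper();
            Node para = helper.selectNodes(new XPathExpr("//para"), parse(XML)).get(index);
            return helper.matches(new XPathExpr(expr), para);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//para"));
        System.out.println(matchesPara("sec/para", 0));
        System.out.println(matchesPara("sec/para", 1));
    }
}
```

Because XPathExpr carries no DOM types, a SAX-oriented helper could accept 
the same object.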

 >... Using XPath for CSS selectors was brought up on www-style
 >and rejected, for (imho) good reasons. The best reason is obviously
 >that it'd break compatibility across CSS versions,
 >but that's of little interest here.   ...
 >
 >The extent of the overlap that I believe would be useful to
 >investigate in SAC is its nice Selector/Condition model. That can
 >imho be directly applied to XPath as Path/Predicate.

Agree.

 >I'd go against using the interfaces directly. I started drafting
 >an XPath mapping of SAC yesterday just to see how far it goes
 >and a number of things came out differently.

You've convinced me.

In any case, the existence of a decent XPath expression interface would be 
useful in building an XPath <-> CSS2 translator.  But we needn't include it 
in our current scope.

 >I'd like to be able to construct a selector object totally
 >abstracted from the selector syntax. I can do that with
 >SAC's factories and I'd like to be able to do the same with
 >the XPath interface, which is why I'm interested
 >in factories.

I wouldn't oversell expression factories.  They can help to bridge syntax 
variations, e.g. abbreviated vs. full XPath, but you are still dependent on 
the underlying features.  A successful example of this "pattern" is Rogue 
Wave's db.h++, which has interfaces for generating SQL expressions and 
concrete implementations for each of the major SQL vendors.  It works well.

I think we need to support non-factory based construction as well:

  XPathParserFactory xpFact = new JoeBobsXPathFactoree();

  a) Parse Text

  XPathExpr xpExpr = xpFact.createExpression( "//para" );

  b) Selector Factory

  XPathExpr xpExpr = xpFact.createExpression();
  xpExpr.append( xpFact.createDescendantOrSelfSelector() );
  xpExpr.append( xpFact.createElementSelector( "para" ) );
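
To make case (b) concrete, here is a rough, self-contained Java sketch.  
The method names follow the fragment above; XPathStep, SimpleXPathFactory, 
toFullSyntax() and toAbbreviatedSyntax() are invented for illustration.  It 
only handles absolute location paths and does no text parsing, so case (a) 
is not covered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: one location step that knows both of its spellings.
interface XPathStep {
    String full();        // unabbreviated form, e.g. "child::para"
    String abbreviated(); // abbreviated form, "" when the step is implied by "//"
}

// Hypothetical: a syntax-neutral expression built from steps.
final class XPathExpr {
    private final List<XPathStep> steps = new ArrayList<>();

    XPathExpr append(XPathStep step) {
        steps.add(step);
        return this;
    }

    String toFullSyntax() { return render(false); }

    String toAbbreviatedSyntax() { return render(true); }

    // Absolute paths only: every step is preceded by "/".
    private String render(boolean abbreviate) {
        StringBuilder sb = new StringBuilder();
        for (XPathStep step : steps) {
            sb.append('/').append(abbreviate ? step.abbreviated() : step.full());
        }
        return sb.toString();
    }
}

// Hypothetical factory; a vendor could substitute its own step classes.
final class SimpleXPathFactory {
    XPathExpr createExpression() {
        return new XPathExpr();
    }

    XPathStep createDescendantOrSelfSelector() {
        return new XPathStep() {
            public String full() { return "descendant-or-self::node()"; }
            public String abbreviated() { return ""; }
        };
    }

    XPathStep createElementSelector(final String name) {
        return new XPathStep() {
            public String full() { return "child::" + name; }
            public String abbreviated() { return name; }
        };
    }
}

final class XPathFactoryDemo {
    static XPathExpr buildParaExpr() {
        SimpleXPathFactory xpFact = new SimpleXPathFactory();
        XPathExpr xpExpr = xpFact.createExpression();
        xpExpr.append(xpFact.createDescendantOrSelfSelector());
        xpExpr.append(xpFact.createElementSelector("para"));
        return xpExpr;
    }

    public static void main(String[] args) {
        XPathExpr xpExpr = buildParaExpr();
        System.out.println(xpExpr.toAbbreviatedSyntax()); // //para
        System.out.println(xpExpr.toFullSyntax());        // /descendant-or-self::node()/child::para
    }
}
```

The same object emits either syntax, which is about as much as I would 
claim for factories.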


 >Also, I think a factory approach would make sense
 >considering that XPath is moving towards being able to
 >use the PSVI. Perhaps other people would also like to see
 >XPath have the capacity to select against all sorts of other
 >infosets. To do so using factories, they would only need
 >to define new XPath tokens that are grammatically valid,
 >a few factory classes, and voila! an extended XPath that
 >can deal specifically with their infoset. I believe that
 >would make XPath interestingly extensible.

Yes, we need to strategize how to make an extensible base.  The approach 
you describe works well enough for *generating* a text representation.  But 
it doesn't help, as is, for parsing an existing text representation.

I think the XPath tokenization rules give all-but-explicit guidance 
here.  We'll need some kind of a registry that allows custom function 
factories.  We also need to support variable declaration.  So, it doesn't 
seem far-fetched to plug in other types of factories as well.  The parser 
can let registered extension factories attempt to instantiate expression 
parts from unknown tokens.  If no one claims a token, a parse error 
results.  This is just off the top of my head, mind you.  But some behavior 
like this seems necessary.
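
A rough Java sketch of that behavior, with every name invented for 
illustration (ExprPart, ExtensionFactory, ExtensionRegistry) and the 
tokenizer itself left out.  Each registered factory gets a chance to claim 
an unknown token, in registration order; if none does, the registry reports 
a parse error.  The "psvi:" prefix is only an example token shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical: whatever the parser builds from a token.
interface ExprPart {
    String describe();
}

// Hypothetical: an extension that may or may not recognize a token.
interface ExtensionFactory {
    Optional<ExprPart> tryCreate(String token);
}

// Hypothetical registry the parser consults for tokens it does not know.
final class ExtensionRegistry {
    private final List<ExtensionFactory> factories = new ArrayList<>();

    void register(ExtensionFactory factory) {
        factories.add(factory);
    }

    // The first factory to claim the token wins; an unclaimed token is a parse error.
    ExprPart resolve(String token) {
        for (ExtensionFactory factory : factories) {
            Optional<ExprPart> part = factory.tryCreate(token);
            if (part.isPresent()) {
                return part.get();
            }
        }
        throw new IllegalArgumentException("XPath parse error: unknown token '" + token + "'");
    }
}

final class ExtensionRegistryDemo {
    // A registry with one sample extension that claims tokens starting with "psvi:".
    static ExtensionRegistry sampleRegistry() {
        ExtensionRegistry registry = new ExtensionRegistry();
        registry.register(new ExtensionFactory() {
            public Optional<ExprPart> tryCreate(final String token) {
                if (!token.startsWith("psvi:")) {
                    return Optional.empty();
                }
                ExprPart part = new ExprPart() {
                    public String describe() { return "extension part for " + token; }
                };
                return Optional.of(part);
            }
        });
        return registry;
    }

    public static void main(String[] args) {
        ExtensionRegistry registry = sampleRegistry();
        System.out.println(registry.resolve("psvi:type").describe());
        try {
            registry.resolve("bogus");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```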



 >>I don't understand yet how an XPath expression can point to something
 >>"not neatly aligned on DOM Node boundaries".
 >
 >For instance, in DOM you can have consecutive text nodes. This usually
 >happens as a result of XML parsers returning text in several chunks. XPath
 >has no notion of that, if it selects the text() inside an element, I guess
 >that what is expected is that it returns a single text node.
 >
 >Chances are, that text node isn't in the DOM (though its content is). For
 >people that want to use XPath to navigate (and edit) DOM documents, that
 >can be a serious caveat, because in effect the returned text node has
 >serious chances of being read only (well, you can write to it, but it
 >won't modify the tree).

Dumb idea: for the DOM helper, we can provide two calls: one that returns 
read-only data according to the XPath data model and another that returns 
live DOM nodes.  The XPath data model processing is, necessarily, a wrapper 
around the DOM level stuff.  So, all this means is that you expose the 
intermediate results.
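
A small Java sketch of the two calls, restricted to the text children of 
one element and using only standard org.w3c.dom calls.  The class and 
method names (TextViews, xpathTextNodes, liveTextNodes) are made up.  The 
first call merges adjacent DOM text nodes into the single text node XPath 
would see and returns it as a read-only string; the second returns the 
underlying live nodes, which can be edited in place.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

final class TextViews {
    // XPath-data-model view: each run of adjacent text children becomes one string.
    static List<String> xpathTextNodes(Element element) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = null;
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Text) {
                if (current == null) {
                    current = new StringBuilder();
                }
                current.append(((Text) c).getData());
            } else if (current != null) {
                runs.add(current.toString());
                current = null;
            }
        }
        if (current != null) {
            runs.add(current.toString());
        }
        return runs;
    }

    // Live view: the actual DOM text nodes, so edits change the tree.
    static List<Text> liveTextNodes(Element element) {
        List<Text> nodes = new ArrayList<>();
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Text) {
                nodes.add((Text) c);
            }
        }
        return nodes;
    }

    // Builds <para> with two adjacent text nodes, as a chunking parser might.
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element para = doc.createElement("para");
            doc.appendChild(para);
            para.appendChild(doc.createTextNode("Hello, "));
            para.appendChild(doc.createTextNode("world"));
            return para;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element para = sample();
        System.out.println(xpathTextNodes(para));       // one merged entry
        System.out.println(liveTextNodes(para).size()); // two live nodes
        liveTextNodes(para).get(1).setData("list");     // edits the tree itself
        System.out.println(xpathTextNodes(para));
    }
}
```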

I don't think any of this matters for the SAX Helper as the data just gets 
handed to the app based on registered handlers (or implemented overrides).

 >However, for the sake of memory consumption (as well as other
 >considerations) one could return a NodeIterator or a TreeWalker as defined
 >in DOM2 (http://www.w3.org/TR/DOM-Level-2-Traversal-Range/traversal.html).

Looks right.
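
For reference, a short Java sketch of handing results back through a DOM2 
NodeIterator instead of a fully built list.  It assumes the DOM 
implementation supports the Traversal module, so that the Document can be 
cast to DocumentTraversal (the JDK's bundled parser does).  The NodeFilter 
here just compares element names; it is a placeholder for a real XPath 
match test, and LazySelection is an invented name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

final class LazySelection {
    // Returns an iterator that visits matching elements one at a time, in document order.
    static NodeIterator elementsNamed(Document doc, final String name) {
        NodeFilter filter = new NodeFilter() {
            public short acceptNode(Node n) {
                // Placeholder for an XPath match test: accept elements with the given name.
                return name.equals(n.getNodeName()) ? FILTER_ACCEPT : FILTER_SKIP;
            }
        };
        // Assumption: this Document implementation supports DOM2 Traversal.
        return ((DocumentTraversal) doc)
                .createNodeIterator(doc, NodeFilter.SHOW_ELEMENT, filter, false);
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Walks the iterator to the end and counts what it yields.
    static int countElements(String xml, String name) {
        NodeIterator it = elementsNamed(parse(xml), name);
        int count = 0;
        while (it.nextNode() != null) {
            count++;
        }
        it.detach();
        return count;
    }

    public static void main(String[] args) {
        String xml = "<doc><para>a</para><sec><para>b</para></sec></doc>";
        System.out.println(countElements(xml, "para"));
    }
}
```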

 >I'm not personally interested in an XML syntax for XPath...

The more I think about it, the more I believe that, given decent expression 
part traversal and factories, the XML emit/parse should be split into a 
separate concrete class.  I continue to think it is low priority.  Once 
things are working, a decent XML representation should become apparent; I'm 
not getting a clear picture yet.


 >>5) Need XPointer Support
 >>
 >Yes I think XPointer doesn't need to be built in, but it should be kept in
 >mind that the XPath API must play nice with other such needs.
 >
 >Which reminds me, the API needs a way to add functions (if only for XSLT).
 >I think part of the recent discussion about xs:script and notably the
 >xbind proposal would be useful in this context (it would be useful to
 >be able to add functions easily, and to declare their signatures).

If we define a decent registry interface, it should be possible to read in 
xbind data (deployment descriptor?) and register the declared functions.
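
One possible shape for such a registry, sketched in Java.  All names here 
are hypothetical (XPathFunctionImpl, FunctionRegistry, the ext:upper 
function), and the "signature" is reduced to a name plus an argument count.  
Reading the declarations from xbind data is not modeled, since I don't want 
to guess at that format.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: the body of an extension function.
interface XPathFunctionImpl {
    Object call(List<Object> args);
}

// Hypothetical registry keyed by declared signature (name and arity only).
final class FunctionRegistry {
    private final Map<String, XPathFunctionImpl> functions = new HashMap<>();

    private static String key(String name, int arity) {
        return name + "/" + arity;
    }

    void register(String name, int arity, XPathFunctionImpl impl) {
        functions.put(key(name, arity), impl);
    }

    // Looks the function up by name and argument count, then invokes it.
    Object call(String name, List<Object> args) {
        XPathFunctionImpl impl = functions.get(key(name, args.size()));
        if (impl == null) {
            throw new IllegalArgumentException(
                    "no function " + name + " taking " + args.size() + " argument(s)");
        }
        return impl.call(args);
    }
}

final class FunctionRegistryDemo {
    // A registry holding one sample function, ext:upper(string).
    static FunctionRegistry sampleRegistry() {
        FunctionRegistry registry = new FunctionRegistry();
        registry.register("ext:upper", 1, new XPathFunctionImpl() {
            public Object call(List<Object> args) {
                return String.valueOf(args.get(0)).toUpperCase();
            }
        });
        return registry;
    }

    public static void main(String[] args) {
        FunctionRegistry registry = sampleRegistry();
        System.out.println(registry.call("ext:upper", Arrays.<Object>asList("para")));
    }
}
```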