OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Are we losing out because of grammars?




Rick Jelliffe wrote:

> P.S. If anyone knows any good website on term rewriting that list rules for
> regular expressions -> paths of some kind, I would be very interested.

I don't know if this is quite what you're looking for,
but if you interpret XPath axes, name tests, and node tests 
as relations and add a relational transitive closure operation,
you end up with a very nice query/pattern matching framework.

For example, if 'x' and 'y' are nodes, then the relational
expression 'x (child) y' holds iff y is a child of x.
The infix '/' operator is the same as relational composition:
x (axis1/axis2) y iff there exists z such that x (axis1) z and
z (axis2) y.

Name tests and node tests are "coreflexive" relations;
for example 'x (element) y' holds iff x = y and x is an
element node.  Then the '::' in 'axis::node-test' can
also be interpreted as relational composition.  Other
XPath predicates -- as long as they don't involve
'position()' or 'last()' -- can be dealt with the same way.

It helps to add a few "single-step" axes 'x (right) y' ===
'y (left) x' === y is the next sibling of x, and
'x (down) y' === 'y (up) x' === y is the first child of x.
('right' and 'down' are the same as 'following-sibling::*[1]'
and 'child::*[1]', but it helps to add them as primitives
since the implied call to 'position()' in the predicate '[1]'
is a real bastard to work with in a relational framework.)

XML DTD content models map directly onto this framework:
content particles are predicates (coreflexives), "|" is relational
union, "x,y" becomes "x/right/y", "x*" is relational transitive
closure, and "x?" becomes "(x | id)" where "id" is the
identity relation.

It's not hard to add an "interleave" operator a la TREX;
this gives you unordered content models.  The TREX "concur"
operator is just relational intersection; this is easy
to deal with mathematically (but can be more difficult to
implement).  And, since you also have the other axes to
work with, this significantly extends the expressive power 
of content models.

I've built a toy XML processing library in Tcl (similar to Cost)
based on this framework, and it works pretty well.


--Joe English

  jenglish@flightlab.com