
Re: (Correction) Re: Are we losing out because of grammars?



(James' comments reordered)

From: James Clark <jjc@jclark.com>

> Grammars are better than rule-based systems for some things. Rule-based
> systems are better than grammars for other things.  If something can be
> expressed simply using a grammar, it's probably a good idea to use a
> grammar, because, amongst other reasons, it can be implemented very
> efficiently.

If the document is already parsed into a DOM, then there is no necessary
efficiency gained from a streamable method.

I specifically added a feature, phases, in Schematron 1.5 which addresses
this issue. The user can select (e.g. on the command line) a particular set
of patterns they want to test.  These can be highly targeted and so be more
efficient than any method that requires a single pass over the whole
document.
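
To make the idea of phases concrete, here is a minimal Python sketch of the selection mechanism only. It is not Schematron syntax or any Schematron implementation's API: the names PATTERNS, PHASES and run_phase, the two checks, and the sample document are all hypothetical, chosen for illustration. The point it shows is that a phase names a subset of patterns, so a run can skip checks the user does not need.

```python
import xml.etree.ElementTree as ET

# Hypothetical checks standing in for Schematron patterns.
def first_child_is_title(root):
    """True if the first child element of the root is <title>."""
    children = list(root)
    return bool(children) and children[0].tag == "title"

def every_para_has_text(root):
    """True if every <para> descendant has non-blank leading text."""
    return all((p.text or "").strip() for p in root.iter("para"))

# Pattern registry and phase table (illustrative names, not a real API).
PATTERNS = {
    "structure": first_child_is_title,
    "content": every_para_has_text,
}
PHASES = {
    "quick": ["structure"],            # targeted: one cheap check
    "full": ["structure", "content"],  # everything
}

def run_phase(root, phase):
    """Run only the patterns active in the chosen phase.

    Returns the names of the patterns that failed.
    """
    return [name for name in PHASES[phase] if not PATTERNS[name](root)]

doc = ET.fromstring("<doc><title>T</title><para/></doc>")
quick_failures = run_phase(doc, "quick")
full_failures = run_phase(doc, "full")
```

With this sample document the "quick" phase never looks at the para elements at all, which is the kind of saving meant above.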

> If it can't be expressed simply using a grammar, then use a rule-based
> system.

But if the paradigms are both good at different things and if they both are
efficient in different circumstances then why recommend grammar *first* and
relegate rules for the fiddly bits?  Why not the other way around?

At the moment I don't see that they are hammers and saws with no overlap in
functionality.

>You've just proved my point.  Your solution doesn't work. Just as
>x[position()=1] selects the first x element, so x[position()=last()]
>selects the last x element; it does not test whether the x element is
>the last child.  Some very simple grammars are awkward and error-prone to
>express using path-based rules (the converse is also true).

A mistake in a late-night, off-the-cuff and untested example to a voluntary
newsgroup proves nothing, except my periodic incompetence, which is no secret.
Fortunately I am not the only schema-language-devisor who has made a mistake
in a public example recently, so I cannot feel embarrassed :-) See the P.S.
for hopefully better versions.

In a previous post, James commented

> I really find it very hard to take
> seriously the idea that the time has come to completely discard grammars
> in favour of path-based rule systems.

Good. I don't think I am demanding that grammars be banned!  In this thread
"Are we losing out because of grammars?"  I  started by commenting
on Kawaguchi-san's ideas  "Of course, I don't mean...it is not important
that we have strong alternative schema languages..." (including
grammar-based ones.)

The point was not "completely discard grammars" but "Might we get to
higher-level schema languages faster"  and "What if, even after figuring out
how to handle ambiguity and unions in grammars, we are still left with a
paradigm that is not expressive enough for implementing the human-oriented,
concept-modeling/data-modeling systems that some people think are
important?"

There was a comment on a list somewhere that the US economy would stall
because of the late delivery of XML Schemas.  Without going into the merits
of that idea, if schemas are important enough for someone to suggest that
view,  then it is responsible and reasonable to discuss the adequacy and
characteristics of the underlying approach (grammars) and to have
alternatives available (whether as supplements, alternatives or
replacements).  Where has this issue been discussed?

Why are there only hammers in our shops?

Cheers
Rick Jelliffe

P.S.

> The complexity is all in the XPath expression.

My example addressed James' complaint that his rule-based attempt looked
inefficient.  Braving typos and mistakes, here are some tests that would be
more efficient (depending on the implementation)
            *[1][self::b][following-sibling::c[not(following-sibling::*)]]  or
            *[1][self::c][not(following-sibling::*)]
  or perhaps even
            *[1][self::b][following-sibling::c[not(following-sibling::*[1])]]  or
            *[1][self::c][not(following-sibling::*[1])]
  or perhaps even
            (*[1][self::b] and *[2][self::c] and not(*[3]))
        or (*[1][self::c] and not(*[2]))
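
For comparison, here is a rough Python sketch that checks the same constraint both ways. It assumes the intended content model is an optional b followed by exactly one c, which is what the first two forms above suggest; the thread does not restate the model here, so treat that as my reading. The function names grammar_ok, rules_ok and nth are hypothetical. The grammar view matches the sequence of child names against a regular expression; the rule view mirrors the positional style of the last test.

```python
import re
import xml.etree.ElementTree as ET

def grammar_ok(parent):
    """Grammar view: the child names must match the content model (b?, c)."""
    names = " ".join(child.tag for child in parent)
    return re.fullmatch(r"(b )?c", names) is not None

def nth(parent, n):
    """Name of the n-th child element (1-based), or None if there is none."""
    kids = list(parent)
    return kids[n - 1].tag if len(kids) >= n else None

def rules_ok(parent):
    """Rule view: positional assertions about individual children."""
    b_then_c = (nth(parent, 1) == "b" and nth(parent, 2) == "c"
                and nth(parent, 3) is None)
    c_alone = nth(parent, 1) == "c" and nth(parent, 2) is None
    return b_then_c or c_alone

# Assumed sample inputs, paired with the verdict expected under (b?, c).
samples = {
    "<a><b/><c/></a>": True,
    "<a><c/></a>": True,
    "<a><b/><b/></a>": False,
    "<a><b/><x/><c/></a>": False,
    "<a><c/><c/></a>": False,
    "<a/>": False,
}
results = {
    xml: (grammar_ok(ET.fromstring(xml)), rules_ok(ET.fromstring(xml)))
    for xml in samples
}
```

Both functions agree on these samples. The grammar version states the model in one expression, while the rule version spells out each position, which is where slips of the kind discussed above tend to creep in.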