Re: Are we losing out because of grammars?

From: Bullard, Claude L (Len) <clbullar@ingr.com>

>Ummm... so far it looks like they have about
>the same expressive power.  Can you show
>examples where they don't?  Appreciated...

Off the top of my head,  grammars like
   ( a, (a |b)*, c, (a, b)+, b )+
are pretty hard.  One can easily infer the rules

  - there can only be a, b, c
        test="count(*) = count(a)+count(b)+count(c)"
  - there must be at least 2 a, at least 2 b, and at least 1 c
       test="count(a) > 1"  test="count(b) > 1"  test="c"
  - it must start with a and end with 2b
       test="*[1][self::a]"   ???
  - a must follow c
       test="count(c[following-sibling::*[1][self::a]]) = count(c)"

but the other information in there gets cumbersome to model with paths.
I think the reason is that there must be some anchor point (element name,
position or count) to hang XPaths from, which is not what is available with
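
To make the gap concrete, here is a minimal Python sketch (not Schematron itself) that compares the content model with path-style checks of the kind listed above. It rests on several assumptions: element names are treated as single letters, the content model is transcribed by hand as a regular expression, and the function names `matches_grammar` and `passes_rules` are hypothetical. The minimum counts are read off the shortest sequence the model accepts (a c a b b), and "a must follow c" is taken to mean that every c is immediately followed by an a.

```python
import re

# The content model ( a, (a|b)*, c, (a, b)+, b )+ over one-letter element names.
CONTENT_MODEL = re.compile(r"(a[ab]*c(ab)+b)+")


def matches_grammar(children: str) -> bool:
    """True if the whole child sequence is accepted by the content model."""
    return CONTENT_MODEL.fullmatch(children) is not None


def passes_rules(children: str) -> bool:
    """Path-style checks: allowed names, minimum counts, start, end, c-then-a."""
    only_abc = all(name in "abc" for name in children)
    enough = (
        children.count("a") >= 2
        and children.count("b") >= 2
        and children.count("c") >= 1
    )
    starts_with_a = children.startswith("a")
    ends_with_bb = children.endswith("bb")
    # Every c must have an a as its immediately following sibling.
    c_then_a = all(
        children[i + 1:i + 2] == "a"
        for i, name in enumerate(children)
        if name == "c"
    )
    return only_abc and enough and starts_with_a and ends_with_bb and c_then_a


if __name__ == "__main__":
    for candidate in ("acabb", "acababb", "acaabb"):
        print(candidate, matches_grammar(candidate), passes_rules(candidate))
```

The sequence a c a a b b passes every one of these checks but is not accepted by the content model, because the a that follows c must itself be followed by b. That is the sort of "other information" the simple rules leave out.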

(Actually, it should be possible to mechanically generate much more complex
XPaths to model much more. But the chances of such a complicated model
corresponding well to any cliche in an inference engine's database would be
slight. So we could get XPaths, but we couldn't generate nice natural-language
explanations for them, which is the name of my game, given that notionally
in Schematron the natural-language assertions come first and the tests are
just there to try to model the statement as best we can.)

But my point has been to look hard at what we are missing out on expressing:
a detail that should have no part in fine schema design. Which is why I say
"who cares" about grammars (in particular, grammars once they lose the
virtue of terseness).

Rick Jelliffe

P.S. The kind of horrible rule to avoid would be

        or ( self::a[following-sibling::*[1][self::b][following-sibling::*[1][self::b
        or  ...
but a rule like this would capture the content model for small models.
Unbounded repetition cannot be captured this way, because the expression has
to stop somewhere; a bounded simulation is possible, though, and can be
mechanically derived from a grammar.
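
As a rough illustration of what "mechanically derived" could look like, the Python sketch below enumerates every child sequence up to a chosen length that the content model accepts and turns each one into an XPath conjunction, joined with "or". This is only one possible approach under stated assumptions, not necessarily the derivation meant above: the content model is hand-transcribed as a regular expression over single-letter names, and the helper names `accepted_sequences`, `sequence_to_xpath` and `bounded_test` are hypothetical.

```python
import re
from itertools import product

# The content model ( a, (a|b)*, c, (a, b)+, b )+ over one-letter element names.
CONTENT_MODEL = re.compile(r"(a[ab]*c(ab)+b)+")


def accepted_sequences(max_len):
    """Yield every child sequence of length 1..max_len the model accepts."""
    for length in range(1, max_len + 1):
        for seq in product("abc", repeat=length):
            if CONTENT_MODEL.fullmatch("".join(seq)):
                yield seq


def sequence_to_xpath(seq):
    """Pin the child count and the element name at each position."""
    positions = " and ".join(
        f"*[{i}][self::{name}]" for i, name in enumerate(seq, start=1)
    )
    return f"(count(*) = {len(seq)} and {positions})"


def bounded_test(max_len):
    """Disjunction over all accepted sequences up to max_len children."""
    return " or ".join(sequence_to_xpath(s) for s in accepted_sequences(max_len))


if __name__ == "__main__":
    print(bounded_test(6))
```

With a bound of 6 this yields three alternatives (a c a b b, a a c a b b, a b c a b b). The expression grows quickly as the bound rises, which is the horribleness in question, and no finite bound covers the unbounded model.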