OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Blowout




Henry S. Thompson wrote:

> I'm curious as to the collective wisdom wrt the usual (in
> computational linguistics) way of checking the worst-case complexity
> of grammar formalisms:  can you encode 3-SAT in TREX, RELAX,
> Schematron?
>
> If you can, worst-case complexity is exponential.

TREX and RELAX have a decision procedure ("does document D
conform to schema S") that is O(N) in the size of the input
and worst-case polynomial in the size of the schema.
I can't think of anything in Schematron that's worse than
polynomial time - it mostly boils down to evaluating a bunch
of XPath expressions, and there's no recursion.

> 3-SAT is: given an expression in conjunctive normal form, where every
> conjunct contains exactly 3 terms consisting of a possibly negated
> variable, is there a binding of all the variables rendering the whole
> true.
>
> For example:
>
>   (a v b v ~c) ^  (~b v d v ~e)  ^ (~a v c v e) ^ (~c v ~d v ~e)
>   a=1 b=0 c=1 d=0 e=1
> satisfies this.
>
> Turn this into a language either explicitly (i.e. with an attribute
> for negation) or implicitly (use absence for negation) and then try to
> write a (T|R|S) schema.  If you can, assertions about parsers are
> beside the point, worst-case complexity is exponential in (n,m) where
> is the number of terms and m is the number of variables.

I'm not sure what this would prove.  First, the encoding procedure
E : "3-SAT instance" -> "Schema" would have to take polynomial
time and space for this to mean much.  Also, checking a document
instance against the schema would only answer the (much easier) question
"does this particular set of bindings satisfy the formula", not
"does _any_ set of bindings satisfy the formula".

For TREX and RELAX, the latter question does have a decision
procedure (not sure about Schematron).  For RELAX, the decision
procedure "does any document match this schema" is O(M) in
the size of the schema.  (In fact, it's probably O(1) -- I
believe the answer is always "yes" in the case of RELAX.)

The satisfiability algorithm for TREX *is* worst-case
exponential, due to <concur>, <interleave>, and <notAllowed>;
however this doesn't say anthing about the time complexity of
the validation algorithm (which is what TREX is _usually_
used for).


--Joe English

  jenglish@flightlab.com