[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Blowout

From: Joe English <jenglish@flightlab.com>
To: xml-dev@lists.xml.org
Date: Sat, 03 Mar 2001 10:05:30 -0800

Henry S. Thompson wrote:

> I'm curious as to the collective wisdom wrt the usual (in
> computational linguistics) way of checking the worst-case complexity
> of grammar formalisms:  can you encode 3-SAT in TREX, RELAX,
> Schematron?
>
> If you can, worst-case complexity is exponential.

TREX and RELAX have a decision procedure ("does document D
conform to schema S") that is O(N) in the size of the input
and worst-case polynomial in the size of the schema.
I can't think of anything in Schematron that's worse than
polynomial time - it mostly boils down to evaluating a bunch
of XPath expressions, and there's no recursion.

> 3-SAT is: given an expression in conjunctive normal form, where every
> conjunct contains exactly 3 terms consisting of a possibly negated
> variable, is there a binding of all the variables rendering the whole
> true.
>
> For example:
>
>   (a v b v ~c) ^  (~b v d v ~e)  ^ (~a v c v e) ^ (~c v ~d v ~e)
>   a=1 b=0 c=1 d=0 e=1
> satisfies this.
>
> Turn this into a language either explicitly (i.e. with an attribute
> for negation) or implicitly (use absence for negation) and then try to
> write a (T|R|S) schema.  If you can, assertions about parsers are
> beside the point, worst-case complexity is exponential in (n,m) where
> is the number of terms and m is the number of variables.

I'm not sure what this would prove.  First, the encoding procedure
E : "3-SAT instance" -> "Schema" would have to take polynomial
time and space for this to mean much.  Also, checking a document
instance against the schema would only answer the (much easier) question
"does this particular set of bindings satisfy the formula", not
"does _any_ set of bindings satisfy the formula".

For TREX and RELAX, the latter question does have a decision
procedure (not sure about Schematron).  For RELAX, the decision
procedure "does any document match this schema" is O(M) in
the size of the schema.  (In fact, it's probably O(1) -- I
believe the answer is always "yes" in the case of RELAX.)

The satisfiability algorithm for TREX *is* worst-case
exponential, due to <concur>, <interleave>, and <notAllowed>;
however this doesn't say anthing about the time complexity of
the validation algorithm (which is what TREX is _usually_
used for).

--Joe English

  jenglish@flightlab.com

Follow-Ups:
- Re: Blowout
  - From: Kohsuke KAWAGUCHI <k-kawa@bigfoot.com>

References:
- Re: Blowout
  - From: ht@cogsci.ed.ac.uk (Henry S. Thompson)

Prev by Date: Re: About Infosets
Next by Date: RDDL & WSDL
Previous by thread: Re: Blowout
Next by thread: Re: Blowout
Index(es):
- Date
- Thread