[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*From*:**Joe English <jenglish@flightlab.com>***To*: xml-dev@lists.xml.org*Date*: Sat, 03 Mar 2001 10:05:30 -0800

Henry S. Thompson wrote: > I'm curious as to the collective wisdom wrt the usual (in > computational linguistics) way of checking the worst-case complexity > of grammar formalisms: can you encode 3-SAT in TREX, RELAX, > Schematron? > > If you can, worst-case complexity is exponential. TREX and RELAX have a decision procedure ("does document D conform to schema S") that is O(N) in the size of the input and worst-case polynomial in the size of the schema. I can't think of anything in Schematron that's worse than polynomial time - it mostly boils down to evaluating a bunch of XPath expressions, and there's no recursion. > 3-SAT is: given an expression in conjunctive normal form, where every > conjunct contains exactly 3 terms consisting of a possibly negated > variable, is there a binding of all the variables rendering the whole > true. > > For example: > > (a v b v ~c) ^ (~b v d v ~e) ^ (~a v c v e) ^ (~c v ~d v ~e) > a=1 b=0 c=1 d=0 e=1 > satisfies this. > > Turn this into a language either explicitly (i.e. with an attribute > for negation) or implicitly (use absence for negation) and then try to > write a (T|R|S) schema. If you can, assertions about parsers are > beside the point, worst-case complexity is exponential in (n,m) where > is the number of terms and m is the number of variables. I'm not sure what this would prove. First, the encoding procedure E : "3-SAT instance" -> "Schema" would have to take polynomial time and space for this to mean much. Also, checking a document instance against the schema would only answer the (much easier) question "does this particular set of bindings satisfy the formula", not "does _any_ set of bindings satisfy the formula". For TREX and RELAX, the latter question does have a decision procedure (not sure about Schematron). For RELAX, the decision procedure "does any document match this schema" is O(M) in the size of the schema. (In fact, it's probably O(1) -- I believe the answer is always "yes" in the case of RELAX.) The satisfiability algorithm for TREX *is* worst-case exponential, due to <concur>, <interleave>, and <notAllowed>; however this doesn't say anthing about the time complexity of the validation algorithm (which is what TREX is _usually_ used for). --Joe English jenglish@flightlab.com

**Follow-Ups**:**Re: Blowout***From:*Kohsuke KAWAGUCHI <k-kawa@bigfoot.com>

**References**:**Re: Blowout***From:*ht@cogsci.ed.ac.uk (Henry S. Thompson)

- Prev by Date:
**Re: About Infosets** - Next by Date:
**RDDL & WSDL** - Previous by thread:
**Re: Blowout** - Next by thread:
**Re: Blowout** - Index(es):