OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

 


 

   ID/IDREF makes XML generation NP-hard

[ Lists Home | Date Index | Thread Index ]
  • To: xml-dev@lists.xml.org
  • Subject: ID/IDREF makes XML generation NP-hard
  • From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
  • Date: 28 Mar 2003 21:59:11 +0000
  • Original-sender: ht@inf.ed.ac.uk
  • Sender: ht@inf.ed.ac.uk
  • User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Portable Code)

Somewhat surprisingly, it turns out that answering the question, for an
arbitrary XML DTD, "Are there any valid instances of the document type
defined by this DTD?", is an NP-hard problem.

This is true, because it is possible to encode a 3SAT problem [1] in a
DTD, so that there is an instance of the DTD's document type iff the
problem can be satisfied.  So finding a valid instance is as hard a
problem as solving the 3SAT problem, and 3SAT is known to be NP-hard.

Here's the DTD for a 4-variable, 3-clause example.  The 'clauses'
element is the encoding of the 3SAT problem

   (v1 | -v2 | v3) & (-v1 | v3 | v4 ) & (v2 | -v3 | -v4)

The complexity arises in finding a set of values for the 'value' IDs
of the children of <bindings> which are consistent with some sequence of
choices from the <clauses>.  

The choices interspersed among the value bindings ensure that every
variable gets exactly one binding, which in turn makes it impossible
to choose <v2n> for one clause and <v2y> for the other, without
causing an IDREF failure.

<!ELEMENT root (bindings, clauses)>

<!ELEMENT bindings (v1, (v1y|v1n) , v2, (v2y|v2n),
                    v3, (v3y|v3n) , v4, (v4y|v4n))>

<!ELEMENT clauses (( v1y | v2n | v3y ) ,
                   ( v1n | v3y | v4y ) ,
                   ( v3n | v4n | v2y ))>

<!ELEMENT v1 EMPTY>
<!ATTLIST v1 value ID #REQUIRED>
<!ELEMENT v2 EMPTY>
<!ATTLIST v2 value ID #REQUIRED>
<!ELEMENT v3 EMPTY>
<!ATTLIST v3 value ID #REQUIRED>
<!ELEMENT v4 EMPTY>
<!ATTLIST v4 value ID #REQUIRED>

<!ELEMENT v1y EMPTY>
<!ATTLIST v1y value IDREF #FIXED "v1true">
<!ELEMENT v1n EMPTY>
<!ATTLIST v1n value IDREF #FIXED "v1false">
<!ELEMENT v2y EMPTY>
<!ATTLIST v2y value IDREF #FIXED "v2true">
<!ELEMENT v2n EMPTY>
<!ATTLIST v2n value IDREF #FIXED "v2false">
<!ELEMENT v3y EMPTY>
<!ATTLIST v3y value IDREF #FIXED "v3true">
<!ELEMENT v3n EMPTY>
<!ATTLIST v3n value IDREF #FIXED "v3false">
<!ELEMENT v4y EMPTY>
<!ATTLIST v4y value IDREF #FIXED "v4true">
<!ELEMENT v4n EMPTY>
<!ATTLIST v4n value IDREF #FIXED "v4false">


There is at least one solution:

<!DOCTYPE root SYSTEM "3sat.dtd">
<root>
 <bindings>
  <v1 value="v1false"/><v1n/>
  <v2 value="v2false"/><v2n/>
  <v3 value="v3false"/><v3n/>
  <v4 value="v4false"/><v4n/>
 </bindings>
 <clauses>
  <v2n/>
  <v1n/>
  <v3n/>
 </clauses>
</root>

Henry S. Thompson
Richard Tobin

[1] http://www.wikipedia.org/siki/Satisfiability_problem
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]




 

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS