[
Lists Home |
Date Index |
Thread Index
]
After reading Henry's posting below, I realized
that the following problem is also NP-hard:
Given an arbitrary RELAX NG schema, which is
allowed to contain XML Schema Datatypes, is
there an XML document that conforms to the
schema?
Here's the proof:
The 3SAT problem is NP-complete. Any 3SAT problem
can be encoded into a RELAX NG schema, such that
an XML document contains a solution to the 3SAT
problem if and only if the document conforms to
the RELAX NG schema. Thus, solving the XML
generation problem (for RELAX NG with XML Schema
Datatypes) is at least as hard as solving 3SAT.
For example, consider the following 3SAT problem:
( x1 OR x2 OR x3 ) AND
( x2 OR NOT(x3) OR NOT(x4) ) AND
( x3 OR x4 OR NOT(x1) )
This 3SAT problem can be encoded as the following
RELAX NG schema:
<?xml version="1.0" encoding="UTF-8"?>
<element name="three_sat_solution"
xmlns="http://relaxng.org/ns/structure/1.0"
datatypeLibrary=
"http://www.w3.org/2001/XMLSchema-datatypes">
<data type="string">
<param name="pattern">x1=[01],x2=[01],x3=[01],x4=[01]</param>
<param name="pattern">(.*x1=1.*)|(.*x2=1.*)|(.*x3=1.*)</param>
<param name="pattern">(.*x2=1.*)|(.*x3=0.*)|(.*x4=0.*)</param>
<param name="pattern">(.*x3=1.*)|(.*x4=1.*)|(.*x1=0.*)</param>
</data>
</element>
The following XML document conforms to this RELAX NG schema
and solves the 3SAT problem:
<?xml version="1.0" encoding="UTF-8"?>
<three_sat_solution>x1=1,x2=0,x3=0,x4=1</three_sat_solution>
The following XML document does not conform to the RELAX NG schema
and does not solve the 3SAT problem:
<?xml version="1.0" encoding="UTF-8"?>
<three_sat_solution>x1=1,x2=1,x3=0,x4=0</three_sat_solution>
For more information on 3SAT (a.k.a., 3-CNF), see
http://www.wikipedia.org/wiki/Boolean_satisfiability_problem.
Since RELAX NG has a solid mathematical foundation, I suspect that the
XML generation problem can be solved in linear time when the problem is
restricted to RELAX NG schemas that do not use W3C XML Schema Datatypes.
Best regards,
Bob
<sig name = 'Bob Lyons'
title = 'XML and Java Consultant'
company = 'Unidex, Inc.'
phone = '+1-732-975-9877'
email = 'boblyons@unidex.com'
url = 'http://www.unidex.com/'
product = 'XML Convert: transforms flat files to XML and vice versa' />
-----Original Message-----
From: Henry S. Thompson [mailto:ht@cogsci.ed.ac.uk]
Sent: Friday, March 28, 2003 4:59 PM
To: xml-dev@lists.xml.org
Subject: ID/IDREF makes XML generation NP-hard
Somewhat surprisingly, it turns out that answering the question, for an
arbitrary XML DTD, "Are there any valid instances of the document type
defined by this DTD?", is an NP-hard problem.
This is true, because it is possible to encode a 3SAT problem [1] in a
DTD, so that there is an instance of the DTD's document type iff the
problem can be satisfied. So finding a valid instance is as hard a
problem as solving the 3SAT problem, and 3SAT is known to be NP-hard.
[snip]
|