[
Lists Home |
Date Index |
Thread Index
]
i am not following this fully -- I am stuck with the previous
question.. however, consider this example also..
schema --
Doc -> doc (a, A, b)
A -> dummy (a, A?, b)
query -- doc//(a | b)
resulting exact type -- (a^n, b^n) -- this cannot be expressed as a
regular tree grammar or XML Schema whatever.
For those following this, where do you think this fits as opposed to
element construction example mentioned in Prof. Suciu's analysis, or
previously by Prof. Vianu?? can we say that the query in the example I
gave is the outlier of the tree??
note: I have not followed these results closely, but if anyone can tell us
something, that will be good.
cheers and regards - murali.
On Thu, 3 Jan 2002, Joe English wrote:
>
> Dare Obasanjo wrote:
>
> > I'm still waiting for someone with a theoretical background to debunk the
> > statements in the paper "On Database Theory and XML"[0] that it is is an
> > unsolvable problem to guarantee that one can create a query that for any
> > given XML input will generate output that conforms to a specified XML schema.
> >
> > [0] http://www.cs.washington.edu/homes/suciu/files/_F2066943700.ps
>
> That paper has an interesting example. Paraphrasing some,
> given the input DTD:
>
> <!ELEMENT root (elm*)>
>
> the query:
>
> element result {
> for $x in //elm RETURN element a { $x/text() },
> for $x in //elm RETURN element b { $x/text() },
> for $x in //elm RETURN element c { $x/text() }
> }
>
> and the output DTD:
>
> <!ELEMENT result (
> ( (a,a)*, (b,b)*, (c,c)* )
> | ( a,(a,a)*, b,(b,b)*, c,(c,c)* ) ) >
>
> (which says that the number of a's, b's, and c's all have
> the same parity, i.e., either an even number of each or an
> odd number of each), then the query will always produce a
> valid output given a valid input. However, no type system
> based on regular grammars will be able to deduce this fact.
> The best a type inferencing algorithm will be able to come
> up with given the input DTD and query is:
>
> <!ELEMENT result (a*, b*, c*)>
>
> which is a proper superset of the output DTD. The *real*
> language generated by the query is
>
> element result { a*n, b*n, c*n : n >= 0 }
>
> (where "a*n" means, roughly, <element name="a" minOccurs="n"
> maxOccurs="n"/>), which is a proper subset of the output DTD
> (so the query is type-safe) but this language is not expressible
> in a regular grammar.
>
> Basically this means that any type inferencing algorithm
> for XQuery (or an XQuery-like language) based on DTDs (or a
> DTD-like type system) cannot be complete: it will always
> reject some type-safe queries.
>
> That doesn't mean that a *sound* type inferencing algorithm
> isn't possible, i.e., one that produces no false positives
> but may answer "no", "maybe", or "I don't know" on some
> correct inputs. Personally I don't think this limitation
> is much of a problem in practice: ML and Haskell have the
> same property and their type systems are still tremendously
> useful.
>
> The astute reader will note that the example output DTD is not
> deterministic. It's not clear whether requiring deterministic
> content models helps the problem any; my guess is that it doesn't.
>
> As for Suciu's claim that "When one adds joins, typechecking
> becomes undecidable" [p. 5]: I'm not quite up to debunking it,
> but it sounds bogus to me; regular languages are closed under
> cartesian product and homomorphisms, which is basically what
> a join gives you.
>
>
> --Joe English
>
> jenglish@flightlab.com
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
>
|