xml-dev - Re: [xml-dev] Relax NG, Thoughts, etc.

Re: [xml-dev] Relax NG, Thoughts, etc.

[ Lists Home | Date Index | Thread Index ]

To: Michael Kay <michael.h.kay@ntlworld.com>
Subject: Re: [xml-dev] Relax NG, Thoughts, etc.
From: Burak Emir <Burak.Emir@epfl.ch>
Date: Mon, 29 Nov 2004 12:21:14 +0100
Cc: xml-dev@lists.xml.org
In-reply-to: <20041129101548.EVEL2076.aamta04-winn.mailhost.ntl.com@Turtle>
Organization: EPFL
References: <20041129101548.EVEL2076.aamta04-winn.mailhost.ntl.com@Turtle>
User-agent: Mozilla Thunderbird 0.8 (X11/20040913)

Michael Kay wrote:

>>Granted, dynamic validation (=at runtime) is always possible, 
>>but it is 
>>certainly not a big help. Errors are discovered after having gone to 
>>production, doesn't sound good to me.
>>    
>>
>
>Actually, it's an enormous help compared with not doing it at all.
>
>I agree that static checking, where possible, is even better. Saxon doesn't
>do any static checking against a schema yet. 
>
>  
>
Not even the payware-version?

>>I am sure you will not try to add exact static typing based 
>>on Relax NG for XSLT in the next version of Saxon? 
>>    
>>
>
>No, but I wouldn't rule it out eventually.
>
>  
>
>>There are papers that analyze 
>>this for subsets of XSLT
>>    
>>
>
>Can you give me a reference? I'm not aware of such papers.
>
>  
>
A recent survey (not XSLT specific) that comes to my mind is "The Design 
Space of Type Checkers for XML Transformation Languages" Anders Møller 
and Michael I. Schwartzbach.

*Then there is "Towards static type checking for XSLT" Akihiko Tozawa, 
which refers to the original work by Milo, Suciu and Vianu.

That one...
 T. Milo, D. Suciu, and V. Vianu. Type-checking for XML
  transformers. In Proceedings of the 19th ACM SIGACT-
  SIGMOD-SIGART Symposium on Principles of Database
  Systems, pages 1122, 2000.

should contain the proof of undecidability.

Most of these say **"type" and mean **regular tree languages. I am not 
aware of many papers on XML Schema.
*

>>but doing it exactly, for the full 
>>language, 
>>is - as far as I remember - undecidable.
>>    
>>
>
>Even with XML Schema, it's unprovable that a given stylesheet is guaranteed
>to create valid output (to do strict static typing, in the style of the
>XQuery static typing option, we would have to throw out
>xsl:apply-templates). 
>
Not entirely sure whether recursive apply is the central problem: One 
might be able to infer what type of the input is from the type of the 
<xsl:template match="..."> and the input schema.

However, simple conditionals "if(condition) returnThis else returnThat" 
will give the same limitations to type-checking as known from other 
Turing complete languages.

Since PSVI infosets will always be in a subclass of the regular tree 
languages (i.e. top-down deterministic regular tree language), I believe 
it should be slightly easier to type-check against schemata than to 
type-check against Relax NG. Of course, the treatment of substitution 
groups might balance this out again...

>But pragmatically, I think it's possible to give
>static warnings for a good proportion of user errors. And I suspect that
>most of the heuristics to do this are equally applicable to DTDs or Relax
>NG.
>
>  
>
The mentioned papers talk about "backwards type inference" and also 
about approximating types conservatively(which might reject well-typed 
stylesheets).  The user then has to annotate (like with casts) 
stylesheet fragments with type information when she is sure that her 
stylesheet works out, and willing to "disable" type-checking in this way.

Hope this helps,
Burak

-- 
Burak Emir

http://lamp.epfl.ch/~buraq

Follow-Ups:
- RE: [xml-dev] Relax NG, Thoughts, etc.
  - From: "Michael Kay" <michael.h.kay@ntlworld.com>

References:
- RE: [xml-dev] Relax NG, Thoughts, etc.
  - From: "Michael Kay" <michael.h.kay@ntlworld.com>

Prev by Date: Re: [xml-dev] Is an XQuery Document an XML Document?
Next by Date: RE: [xml-dev] Relax NG, Thoughts, etc.
Previous by thread: RE: [xml-dev] Relax NG, Thoughts, etc.
Next by thread: RE: [xml-dev] Relax NG, Thoughts, etc.
Index(es):
- Date
- Thread