
   Re: [xml-dev] XQuery and DTD/Schema?

At 02:47 PM 7/3/2002 -0600, Uche Ogbuji wrote:
> > Let me try to explain why I think named typing is good. Here's a function:
> >
> > define function get-total( element invoice $i )
> >    returns xs:decimal
> > {
> >          sum( $i//item/price )
> > }
> >
> > This function assumes that the invoices it takes have been validated as
> > invoice elements according to some schema, and are not merely well-formed
> > elements whose name happens to be 'invoice'. At run-time, you don't want to
> > have to test every function parameter to see if it corresponds to a schema,
> > you simply want to ensure that the validator has said this corresponds to
> > the appropriate definition.
>
>You keep saying this sort of thing, and it baffles me.  It's a solution in
>search of a problem if I ever saw one.  If the substrate data is *XML* (this
>is what we're here to discuss, right?) then why do these declarations have to be
>
>a) static

Static typing is not required in XQuery. If you choose to use static 
typing, it can guarantee that the above function will only be applied to 
invoices that have been validated according to the schema used for static 
typing. If you do not choose to use static typing, then dynamic typing is 
used.

Here are some quotes from the XQuery spec:

         At user option, static typing can be disabled. A query that
         passes type checking will return the same result regardless of
         whether type checking is enabled or disabled. [Ed. Note: See
         Issue 41 for a further discussion of static vs. dynamic
         semantics.]

         XQuery is likely to have multiple conformance levels. There
         may be a conformance level that does not include static type
         checking. There may be a conformance level that does not
         support Schema import, so that only built-in types and node
         types may be used in declarations. [Ed. Note: See Issue 42 for
         a further discussion of conformance levels.]
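
As a rough illustration of the first quoted passage, here is a toy sketch in
Python (not XQuery). Every name in it (PARAM_TYPES, static_check, evaluate,
run_query, the "InvoiceType" label) is hypothetical and is not taken from the
XQuery drafts. It only shows the shape of the guarantee: a query that passes
the static check gives the same result with the check on or off, and an
ill-typed query is rejected either before or during evaluation.

```python
from decimal import Decimal
from typing import Dict, List, Tuple

# Hypothetical signature table: function name -> required parameter type name.
PARAM_TYPES: Dict[str, str] = {"get-total": "InvoiceType"}

# A runtime value: (type name attached by a validator, list of item prices).
Value = Tuple[str, List[Decimal]]


def static_check(func: str, declared_type: str) -> None:
    """Static phase: inspect only the declared type, never the data."""
    if declared_type != PARAM_TYPES[func]:
        raise TypeError(f"static error: {func} expects {PARAM_TYPES[func]}")


def evaluate(func: str, value: Value) -> Decimal:
    """Dynamic phase: check the type name carried by the value, then run."""
    type_name, prices = value
    if type_name != PARAM_TYPES[func]:
        raise TypeError(f"dynamic error: {func} expects {PARAM_TYPES[func]}")
    return sum(prices, Decimal(0))


def run_query(func: str, declared_type: str, value: Value,
              static_typing: bool = True) -> Decimal:
    """Run one call, with the static phase enabled or disabled."""
    if static_typing:
        static_check(func, declared_type)
    return evaluate(func, value)
```

With static typing enabled, the error for an ill-typed call is raised before
any data is touched; with it disabled, the same call still fails, but only
when the value reaches the function.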

>b) based on a named type system

It has to be based on some type system that satisfies some constraints:

1. Supports optional static analysis
2. Does not require revalidation at runtime
3. Supports views of non-XML data without forcing a physical mapping into XML
4. Reasonably simple and straightforward to implement in many environments

Named typing seems to meet these criteria.
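
To make constraint 2 concrete, here is a minimal sketch in Python (not
XQuery), under stated assumptions: the Element class, validate_invoice, the
type_annotation field and the "InvoiceType" label are all hypothetical
stand-ins for a validator and a data model instance. The validator walks the
structure once and records a type name; the function then trusts that name
instead of re-examining the content on every call.

```python
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation
from typing import List, Optional


@dataclass
class Element:
    name: str
    text: str = ""
    children: List["Element"] = field(default_factory=list)
    # Hypothetical: type name recorded by a validator, None if unvalidated.
    type_annotation: Optional[str] = None


def validate_invoice(elem: Element) -> Element:
    """Hypothetical validator: check the structure once, then annotate."""
    if elem.name != "invoice":
        raise ValueError("not an invoice element")
    for item in elem.children:
        prices = [c for c in item.children if c.name == "price"]
        if item.name != "item" or len(prices) != 1:
            raise ValueError("each child must be an item with one price")
        try:
            Decimal(prices[0].text)
        except InvalidOperation:
            raise ValueError("price is not a decimal") from None
    elem.type_annotation = "InvoiceType"
    return elem


def get_total(invoice: Element) -> Decimal:
    """Check only the type name; do not revalidate the content."""
    if invoice.type_annotation != "InvoiceType":
        raise TypeError("expected an element validated as InvoiceType")
    # Simplification: items are direct children here, not any descendant.
    return sum(
        (Decimal(p.text)
         for item in invoice.children
         for p in item.children if p.name == "price"),
        Decimal(0),
    )
```

A merely well-formed element named 'invoice' is rejected by get_total until
validate_invoice has annotated it, and the per-call cost is a single
comparison rather than a walk of the whole tree.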

>c) a and/or b only as provided within a schema

This is not true - any process can create an instance of the data 
model.  We take responsibility for showing how to do this with DTDs and 
merely well formed documents, as well as for XML Schema. I don't think we 
have the time or the responsibility to do this for every possible data source.

>a) There is no reason it has to be static.  The semantics of sum can be
>"accept or convert to an integer".  This is how XPath 1.0 does things.  And
>I'll make the technical point that you know very well Simon has made enough
>times: this is the way to do it that is more cleanly layered for XML
>processing.  I know you'll respond "but that's inefficient".

Efficiency matters, but type safety is at least as important. I don't think 
that type safety is something that should be left to the programmer.

>This is an
>*implementation* matter, and not a matter that should impose itself in the
>formal semantics.  There are many possible strategies for dynamic
>optimization.  And if these do not prove sufficient, static constraints can be
>*added on*.

You really do have to design a type system for static analysis if you want 
it to work well in this way. A type system is not something you can add on 
in a casual manner.

>b) There is no reason it has to be named types.  The constraints that are used
>to improve the design or implementation of the function, whether static or
>dynamic can be "ad-hoc", declared in some constraint language (REGEXen are one
>example, RELAX NG is another) and made to be first-class constructs in the
>language, i.e. they can themselves be passed to functions and such.

My reasons for named typing are listed above. I don't think it would be 
particularly hard to extend RELAX NG with a way to create type annotations.

>c) Even if one were to say "XQuery requires static, named types" (I personally
>could readily accept named constraints, but static types are beyond the pale
>of evil to me)

XQuery permits, but does not require, static typing (see the quotes above). 
Static typing is quite useful in environments that require type safety, and 
it can also be much more efficient. I find it extremely useful in mapping 
views of typed information.

>then the question remains, why do these *have* to be provided
>in a schema?  Why not divorce XQuery's typing semantics from the underlying
>parsing and pre-query processing?  Why not just say "the XQuery implementation
>must provide so and so static type information and so and so type inferencing
>functions" and leave it at that?  Why mandate that these come from the PSVI?

This is precisely what we have done, in fact. We do provide the PSVI 
mappings to the data model. After that, everything is defined in terms of 
our type declarations and the data model representation. To support a 
different system, you have to define your own mappings.

Jonathan






 
