OASIS Mailing List Archives

   Re: [xml-dev] XQuery and DTD/Schema?

At 04:20 PM 7/3/2002 -0700, Tim Bray wrote:
>Jonathan Robie wrote:
>
>>Let me try to explain why I think named typing is good. Here's a function:
>>define function get-total( element invoice $i )
>>   returns xs:decimal
>>{
>>         sum( $i//item/price )
>>}
>>This function assumes that the invoices it takes have been validated as 
>>invoice elements according to some schema
>
>Wrong.  This function assumes that <price> has some numeric type and can 
>meaningfully be summed.

And it can only assume this if the data model instance contains information 
that tells me the type of price - this is why type annotations are part of 
the XQuery data model. They tell the query processor which schema 
definitions were used for validation.

Without this information, I can't safely apply the above function to XML 
instances. For instance, the above function would not work correctly if 
someone had a variant on the invoice schema that used the name 'sale-price' 
instead of 'price', because it would not add such elements to the total. It 
also relies on the data type, which must be xs:decimal.
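
A minimal sketch of the 'sale-price' problem, written in Python rather than 
XQuery so it can be run as-is. The two sample invoices and the get_total 
helper are made up for illustration; the path expression plays the role of 
$i//item/price in the function above.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

def get_total(invoice):
    # Like sum($i//item/price): only elements named 'price' are visited.
    prices = (Decimal(p.text) for p in invoice.iterfind(".//item/price"))
    return sum(prices, Decimal(0))

# Invoice that follows the schema the function was written against.
standard = ET.fromstring(
    "<invoice><item><price>10.50</price></item>"
    "<item><price>4.25</price></item></invoice>"
)

# Hypothetical variant schema that renames one element to 'sale-price'.
variant = ET.fromstring(
    "<invoice><item><sale-price>10.50</sale-price></item>"
    "<item><price>4.25</price></item></invoice>"
)

standard_total = get_total(standard)
variant_total = get_total(variant)  # silently skips the 'sale-price' item
```

Nothing fails on the variant invoice; the total is simply too small, which 
is why the function needs to know which definition its argument conforms to.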

This goes back to the basic notion of validation as a contract between the 
producer and consumer of data, extending the concept with datatypes, and 
with the notion of type annotations.

>The type has presumably been identified to the xquery engine using XSchema 
>vocabulary, presumably as xs:decimal.  The presumption that a schema 
>validation operation has actually taken place is without evidence - in a 
>large proportion of cases the data has probably been generated 
>programmatically and flowed straight into a database, no angle-brackets in 
>evidence anywhere.

You don't need to do schema validation; all you need to do is create an 
instance of the XML Query data model, and this need not be done 
physically. For instance, many people are working on XQuery mappings to 
relational data, where the only physical realization of the type 
information is in the relational data dictionary, and where the actual 
processing is often done in SQL, without anything remotely resembling XML 
Schema processing ever occurring in any physical sense.
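
As a rough illustration, here is a Python sketch of building a typed 
instance directly from relational rows. The Node class, the 
DATA_DICTIONARY mapping, and the (sku, price) row layout are all 
assumptions made up for this example; they are not part of any XQuery 
specification or product.

```python
from dataclasses import dataclass, field
from decimal import Decimal

@dataclass
class Node:
    name: str
    type_name: str                      # type annotation, e.g. "xs:decimal"
    value: object = None
    children: list = field(default_factory=list)

# Hypothetical column-to-type mapping standing in for a data dictionary.
DATA_DICTIONARY = {"sku": "xs:string", "price": "xs:decimal"}

def invoice_from_rows(rows):
    """Build an annotated invoice node from (sku, price) rows.

    No XML text is parsed and no schema validator runs; the type
    annotations come straight from the data dictionary.
    """
    items = []
    for sku, price in rows:
        items.append(Node("item", "item", children=[
            Node("sku", DATA_DICTIONARY["sku"], sku),
            Node("price", DATA_DICTIONARY["price"], price),
        ]))
    return Node("invoice", "invoice", children=items)

invoice = invoice_from_rows([("A-1", Decimal("10.50")), ("B-2", Decimal("4.25"))])
price_types = [c.type_name
               for item in invoice.children
               for c in item.children
               if c.name == "price"]
```

Every price node carries its type annotation even though no angle brackets 
were ever produced.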

>I'm really feeling uneasy - a lot of people whom I consider to be smart 
>seem to be participating consensually in the belief that data types are 
>organically tied to the validation process, which to me seems empirically 
>just nutty.

What we provide is a typed data model. We define how to map from the PSVI 
to instances of the data model, because we have to do this for XML. We 
don't define how to do this for relational data, but the ISO SQL/XML 
committee is defining the XML Schema equivalents for relational data. Other 
mappings will be done for various data sources by various parties.

>>At run-time, you don't want to have to test every function parameter to 
>>see if it corresponds to a schema, you simply want to ensure that the 
>>validator has said this corresponds to the appropriate definition.
>
>It depends; if the data being queried is actually XML, when you encounter 
>the string of characters that ostensibly represent <price> you're going to 
>have to convert them to a number to do arithmetic, and if you don't have 
>exception handling logic surrounding this process you're just being lazy 
>and stupid - so it's not clear that you ever escape the process of 
>"validation".

But the information need not be validated during query processing if it has 
already been validated and stored in some database - this is important to 
get any kind of efficiency. And it need not be validated if it is known to 
correspond to some data dictionary or set of class definitions. Named 
typing is basically about the notion that some information is already known 
by the system to conform to a particular definition.
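
A small Python sketch of that idea, assuming a hypothetical Invoice class 
whose type_name was assigned when the data was stored: the function checks 
the name of the type and does not re-validate the content.

```python
from decimal import Decimal

class Invoice:
    """Stands in for a node annotated as 'invoice' when it was stored."""
    type_name = "invoice"

    def __init__(self, prices):
        self.prices = prices

def get_total(node):
    # Named typing: trust the annotation instead of re-validating the data.
    if getattr(node, "type_name", None) != "invoice":
        raise TypeError("expected a node annotated as 'invoice'")
    return sum(node.prices, Decimal(0))

total = get_total(Invoice([Decimal("10.50"), Decimal("4.25")]))

# An argument without the annotation is rejected up front.
try:
    get_total(object())
    rejected = False
except TypeError:
    rejected = True
```

The check costs one comparison per call, regardless of how large the 
invoice is.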

>  If on the other hand this is actually something that is known to be an 
> integer and thus stored in a C or Java "int", you couldn't test it 
> against a schema anyhow because it's no longer XML.  So arguments 
> claiming that static typing is good because  it bypasses runtime 
> validation are basically without merit.

Ah, but one of the big reasons for named typing is the notion of creating 
views, particularly of relational data, but also for objects and other 
typed data sources.

>And named types do seem awfully convenient, so I'm really not disagreeing 
>with Jonathan's main point at all. -Tim

Yes, they really are convenient...

Jonathan






 
