xml-dev - Re: [xml-dev] W3C's five new XQuery/Xpath2 working drafts

To: "Champion, Mike" <Mike.Champion@SoftwareAG-USA.com>,<xml-dev@lists.xml.org>,"Jonathan Robie" <jonathan.robie@softwareag.com>
Subject: Re: [xml-dev] W3C's five new XQuery/Xpath2 working drafts - Still missing Updates
From: "Dare Obasanjo" <kpako@yahoo.com>
Date: Thu, 27 Dec 2001 12:59:59 -0800
References: <5.1.0.14.0.20011227082828.02cef1a0@softwareag.com>

----- Original Message -----
From: "Jonathan Robie" <jonathan.robie@softwareag.com>
To: "Champion, Mike" <Mike.Champion@SoftwareAG-USA.com>;
<xml-dev@lists.xml.org>
Sent: Thursday, December 27, 2001 6:52 AM
Subject: RE: [xml-dev] W3C's five new XQuery/Xpath2 working drafts - Still
missing Updates

> At 05:20 PM 12/26/2001 -0700, Champion, Mike wrote:
> >Ahhh ... I see the argument now.  But this assumes not only a strongly-typed
> >programming language (and the last I checked, a very substantial percentage
> >of Web programming is done in Javascript, Perl, Python, etc.), but also a
> >schema-centric conception of the role of XML in web software.
>
> XQuery is a strongly typed query language, and its type system is based on
> XML Schema. Even if you use a strongly typed programming language like
> Java, its type system is *not* based on the XML schema, and a lot of errors
> can occur when converting between the type systems. The idea is to do your
> XML to XML transformations - and eventually updates - directly in XML,
> using only one type system.
>

This is a laudable goal, but it will always be hindered by some real-world
considerations. It seems the W3C is adding a third layer of types to a class
of applications, which may actually introduce more errors. Now we have
applications that have to worry about type mismatches not only between the
DBMS and the programming language but among the programming language, XML,
and the DBMS. For instance, let's say I have purchase order information in an
RDBMS that I'd like to share with a business partner, so we agree on some XML
schema with which to exchange the data or pick a publicly available one.
Then what if my business partner decides to process the XML by converting it
to a class in his target programming language (using a tool like XSD.exe that
comes with .NET), performing some operations on it, and then storing it in an
NXDB? Has the introduction and enforcement of XML types really benefited both
parties, or has it added an extra layer of complexity? I am unsure of what the
answer should be.
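
To make the conversion worry concrete, here is a minimal Python sketch of one
way a value can change meaning when it leaves the XML type system. It assumes
two line-item amounts that the schema declares as xs:decimal; the variable
names and amounts are made up for illustration and come from no real schema.

```python
from decimal import Decimal

# Lexical forms as they might appear in the exchanged XML (illustrative values).
xml_amounts = ["0.10", "0.20"]

# Layer 1: the host language maps xs:decimal onto a binary float.
float_total = sum(float(text) for text in xml_amounts)

# Layer 2: a mapping that keeps decimal semantics all the way through.
decimal_total = sum(Decimal(text) for text in xml_amounts)

# The float total no longer equals the value the XML author meant,
# while the decimal total does.
float_matches = float_total == 0.3
decimal_matches = decimal_total == Decimal("0.30")
```

Whether such a mismatch is caught depends on which layer does the checking,
which is the question I am raising above.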

Now, in areas where XML-based technologies are used end to end, from the
database to the middle tier to the presentation layer, a unified type system
would be of immense benefit. But I am not certain that, in the general case of
how XML is used and will be used, the W3C's assumptions about how XML usage
will evolve are correct.

> I am very much afraid that I could ruin my vacation by getting too deeply
> involved in this debate, but let me try a quick stab. I reserve the right
> to duck out if it gets too much traffic ;->

:)

>
> XPath 2.0 already contains the vast majority of XQuery. The main
> differences involve element construction and strong typing.
>
> But queries also need to be able to construct instances, especially if you
> have updates. Suppose I want to add a new book to a bibliography, before an
> existing book. I would like to be able to do an update like this:
>
> update
>    let $b := document("data/xmp-data.xml")//book[title="TCP/IP Illustrated"]
>    insert
>          <book year="1997">
>              <title>Java in a Nutshell</title>
>              <author><last>Flanagan</last><first>David</first></author>
>              <publisher>O'Reilly</publisher>
>              <price>29.95</price>
>          </book>
>    before $b
>
> Without element construction, we can't do that. Also, I need element
> constructors when I replace existing data:
>
> update
>    let $b := document("data/xmp-data.xml")//book[title="TCP/IP Illustrated"]
>    replace $b
>    with
>          <book year="1997">
>              <title>Java in a Nutshell</title>
>              <author><last>Flanagan</last><first>David</first></author>
>              <publisher>O'Reilly</publisher>
>              <price>29.95</price>
>          </book>
>
> So we need element constructors, which are the major dividing line between
> XQuery and XPath 2.0.
>
> Now suppose this document is governed by a schema. Should this update be
> allowed if the new element does not conform to the schema? If the schema
> specifies default attributes, should the new instance contain those
> attributes? What is the type information associated with the new instance?
>
> If this update is really modifying your mission-critical data, I think you
> probably want to ensure that updates are not creating invalid data. If you
> are just using XML as a transport, then you can use XPath 2.0 to identify
> nodes, and modify them with whatever tools you prefer.

I see two issues here: validation and type checking. The examples you list can
be handled through validation by the underlying data store being queried, with
XQuery being none the wiser. Now, I am unsure of the performance issues, but if
the XML repository were schema-aware and maintained PSVI information, then some
sort of partial validation of the XML being inserted could be done and an
error returned on an invalid insert without XQuery having to be involved.
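
As a rough sketch of what I mean by store-side partial validation, here it is
in Python with only the standard library. The REQUIRED_CHILDREN table is a
hypothetical stand-in for whatever schema or PSVI knowledge a real repository
would keep, and the sample book data is illustrative. It is not a real XML
Schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the store's schema knowledge:
# required child elements, in order, for each element name.
REQUIRED_CHILDREN = {"book": ["title", "author", "publisher", "price"]}


class InvalidInsert(Exception):
    """Raised by the store when a fragment does not match its rules."""


def validate_fragment(elem):
    """Partial validation: check only the element being inserted."""
    expected = REQUIRED_CHILDREN.get(elem.tag)
    if expected is None:
        return  # no rule recorded for this element name
    actual = [child.tag for child in elem]
    if actual != expected:
        raise InvalidInsert(
            f"<{elem.tag}> has children {actual}, expected {expected}"
        )


def insert_before(parent, target, new_elem):
    """Store-side insert; the check happens here, not in the query layer."""
    validate_fragment(new_elem)
    parent.insert(list(parent).index(target), new_elem)


# Illustrative document and fragments.
bib = ET.fromstring(
    "<bib><book year='1994'><title>TCP/IP Illustrated</title>"
    "<author>Stevens</author><publisher>Addison-Wesley</publisher>"
    "<price>65.95</price></book></bib>"
)
target = bib.find("book")

good = ET.fromstring(
    "<book year='1997'><title>Java in a Nutshell</title>"
    "<author>Flanagan</author><publisher>O'Reilly</publisher>"
    "<price>29.95</price></book>"
)
insert_before(bib, target, good)  # accepted by the store

bad = ET.fromstring("<book year='1997'><title>No price</title></book>")
try:
    insert_before(bib, target, bad)
    rejected = False
except InvalidInsert:
    rejected = True  # the store refused; the query layer never looked
```

The caller only ever sees an accepted insert or an InvalidInsert error, so the
query language sitting on top needs no type information of its own for this
case.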

However I consider type checking to be necessary in situations like

  update
   let $uri := document("data/xmp-data.xml")//book[title="TCP/IP Illustrated"]/HomePageURI
  replace $uri
   with
     $uri + 57

where the language has to figure out whether to throw an error or pass a
garbage value to the underlying repository and hope it catches the error.
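
A minimal Python sketch of the "throw an error" option, done as a dynamic check
in the query layer. TypedValue, NUMERIC_TYPES and add are names I made up for
illustration, the type names are only loosely modelled on XML Schema, and the
URI is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedValue:
    """A value tagged with a (hypothetical) schema type name."""
    type_name: str
    value: object


NUMERIC_TYPES = {"integer", "decimal"}


def add(left, right):
    """Addition that checks operand types before computing anything."""
    for operand in (left, right):
        if operand.type_name not in NUMERIC_TYPES:
            raise TypeError(f"cannot add a value of type {operand.type_name}")
    both_integer = left.type_name == "integer" and right.type_name == "integer"
    result_type = "integer" if both_integer else "decimal"
    return TypedValue(result_type, left.value + right.value)


# The equivalent of "$uri + 57" from the update above.
uri = TypedValue("anyURI", "http://www.example.com/tcpip")
try:
    add(uri, TypedValue("integer", 57))
    outcome = "passed to repository"
except TypeError as exc:
    outcome = f"rejected by query layer: {exc}"
```

Without the check inside add, the only remaining line of defence would be
whatever validation the repository happens to perform on the replaced value.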

So it seems that the XQuery WG has decided that the query language should
handle type issues instead of leaving them to the underlying XML repository.
I admit this would lead to more consistent behavior across applications that
use XQuery and would avoid problems when moving an application from one XML
repository to another.

However, my point is that with validation by the underlying data store,
(dynamic) type safety can still be largely guaranteed even if XQuery had
updates. So it isn't necessarily true that a strong, static type system is
needed before updates can be implemented, especially since XML documents are
not required to have schemas in the first place.

> >I'd be very interested in a reality check -- Am I the only XML developer
> >still living in the loosely-typed or non-typed Dark Ages?  Does anyone else
> >see XPath 2.0 as meeting the most pressing real-world business requirements
> >that the XQuery folks have been working on?
>
> Since XPath 2.0 is largely the same language, containing most of XQuery,
> with significant overlap among the editors and generated from the same
> grammar, I don't think we should view this as competition to XQuery. So
> your question is basically whether updates require element construction or
> a type system. I have addressed this above.

I agree that element construction is necessary for updates, but I'm
unconvinced that *strong* typing is necessary. After all, Perl is as weakly
typed as they come, but this hasn't precluded its use for programming against
RDBMSs, which have a rather strong notion of types.

> >What percentage of real-world
> >XML programming errors can be caught by the XQuery type system?
>
> Having written quite a few queries, and several function libraries, I would
> say that a significant number of errors can be caught by the type system.
> An XML Schema defines structure and data types at roughly the same level as
> the data dictionary of a relational database. I think that most people who
> have programmed relational databases find that the errors caught by the
> database management system, based on the data dictionary, are significant.
>

The most significant errors that DBMSs usually catch, in my experience, have
been constraint related (foreign keys, uniqueness, non-null, etc.) and not
really type related. My experience may be atypical, but I've never really had
problems with type errors in DBMS programming, only with constraints.
Thus I'd be more interested in how relationships, dependencies, triggers, etc.
can be handled with the current crop of XML technologies (including those in
the pipeline), which so far look like they are lacking. Then again, these are
problems I feel should be tackled by the database folks and not the W3C, which
still seems primarily focussed on document-centric uses of XML to the
detriment of data-centric uses.
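
For anyone who hasn't run into them, here is a small Python sketch of the kind
of constraint errors I mean, using the standard-library sqlite3 module. The
customer and purchase_order tables and the try_sql helper are hypothetical and
exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE purchase_order ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer (id, email) VALUES (1, 'a@example.com')")


def try_sql(statement):
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(statement)
        return "ok"
    except sqlite3.IntegrityError:
        return "IntegrityError"


results = {
    # Uniqueness: the email is already taken by customer 1.
    "duplicate email": try_sql(
        "INSERT INTO customer (id, email) VALUES (2, 'a@example.com')"
    ),
    # Non-null: email may not be missing.
    "missing email": try_sql(
        "INSERT INTO customer (id, email) VALUES (3, NULL)"
    ),
    # Foreign key: there is no customer 99 to own this order.
    "orphan order": try_sql(
        "INSERT INTO purchase_order (id, customer_id) VALUES (10, 99)"
    ),
    # A well-formed order for an existing customer goes through.
    "valid order": try_sql(
        "INSERT INTO purchase_order (id, customer_id) VALUES (11, 1)"
    ),
}
```

Every value in the three rejected statements has a perfectly acceptable type;
what the DBMS objects to is the relationship between rows.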

Speaking of which, anyone know of any formal normalization processes for XML
data similar to those that exist for relational data? Specifically have there
been discussions or papers on ways to create an XML equivalent to 3NF?

--
THINGS TO DO IF I BECOME AN EVIL OVERLORD #231
Mythical guardians will be instructed to ask visitors name, purpose of visit,
and whether they have an appointment instead of ancient riddles.
