   Re: [xml-dev] Early Draft Review: XQuery for Java (JSR 225)

  • To: Jim Melton <jim.melton@acm.org>
  • Subject: Re: [xml-dev] Early Draft Review: XQuery for Java (JSR 225)
  • From: Burak Emir <Burak.Emir@epfl.ch>
  • Date: Mon, 14 Jun 2004 19:05:19 +0200
  • Cc: xml-dev@lists.xml.org, jsr-225-comments@jcp.org
  • In-reply-to: <6.0.0.22.2.20040608162857.05bbf708@gmstimap.oraclecorp.com>
  • References: <20040527220043.IFPQ20971.mta09-svc.ntlworld.com@Turtle> <40B6F9F1.1080104@epfl.ch> <6.0.0.22.2.20040608162857.05bbf708@gmstimap.oraclecorp.com>
  • User-agent: Mozilla Thunderbird 0.5 (X11/20040208)

Jim,

Sorry for the late post. Things can go unnoticed pretty easily in my 
xml-dev folder ...

Jim Melton wrote:

>>>> I just cannot believe that one can seriously want to repeat the IMHO
>>>> biggest design error of JDBC, namely to pass *strings* to the library.
>>>>
>>>> This effectively kills all possibilities of static (compile-time)
>>>> verification of queries, like syntax checking (let alone types).
>
> Well, that is the result of very serious discussions and technical 
> considerations.  It is obviously not perfect, but it is a compromise 
> that has worked 

Oh I do not doubt that, but since these technical considerations are 
limiting choices, their design value is questionable.

> very well for much of the community.  As you may be aware, SQLJ 
> provided an SQL-embedded-in-Java technology that provided a great deal 
> more static query verification.  However, that approach does not 
> address all of the requirements of real applications, many of which 
> need to generate queries on the fly in response to unpredictable user 
> instructions.
>
Tell me about it. I must admit I have never seen SQLJ, but it is not 
hard to improve on JDBC. It just involves using the most basic standard 
technique from compiler construction: fixing an abstract syntax tree 
(AST) representation of a query.
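
To make this concrete, here is a minimal sketch of such a representation in plain Java. All type names (XqExpr, XqVar, XqStep, XqInt, XqChild, XqGreater, XqFilter) are hypothetical and are not part of XQJ or JDBC. It covers only a tiny XQuery-like fragment and does not validate names.

```java
import java.util.Objects;

// Hypothetical AST for a tiny XQuery-like fragment (not an XQJ API).
interface XqExpr {
    // Concrete syntax is produced only here, from an already structured tree.
    String render();
}

// A variable reference such as $doc.
final class XqVar implements XqExpr {
    private final String name;
    XqVar(String name) { this.name = Objects.requireNonNull(name); }
    public String render() { return "$" + name; }
}

// A relative child step such as price (names are not validated in this sketch).
final class XqStep implements XqExpr {
    private final String name;
    XqStep(String name) { this.name = Objects.requireNonNull(name); }
    public String render() { return name; }
}

// An integer literal.
final class XqInt implements XqExpr {
    private final long value;
    XqInt(long value) { this.value = value; }
    public String render() { return Long.toString(value); }
}

// base/step
final class XqChild implements XqExpr {
    private final XqExpr base;
    private final String step;
    XqChild(XqExpr base, String step) {
        this.base = Objects.requireNonNull(base);
        this.step = Objects.requireNonNull(step);
    }
    public String render() { return base.render() + "/" + step; }
}

// (left > right), parenthesised so that nesting stays unambiguous.
final class XqGreater implements XqExpr {
    private final XqExpr left;
    private final XqExpr right;
    XqGreater(XqExpr left, XqExpr right) {
        this.left = Objects.requireNonNull(left);
        this.right = Objects.requireNonNull(right);
    }
    public String render() { return "(" + left.render() + " > " + right.render() + ")"; }
}

// base[predicate]
final class XqFilter implements XqExpr {
    private final XqExpr base;
    private final XqExpr predicate;
    XqFilter(XqExpr base, XqExpr predicate) {
        this.base = Objects.requireNonNull(base);
        this.predicate = Objects.requireNonNull(predicate);
    }
    public String render() { return base.render() + "[" + predicate.render() + "]"; }
}
```

With these classes, new XqFilter(new XqChild(new XqVar("doc"), "book"), new XqGreater(new XqStep("price"), new XqInt(30))).render() yields $doc/book[(price > 30)]. A missing bracket or operand cannot be expressed at all, because the Java compiler rejects an incomplete constructor call.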

>>
>>> There were many systems that used that kind of language binding in 
>>> the 1970s
>>> and 1980s, typically with COBOL and PL/I: the Codasyl DML was based 
>>> entirely
>>> on this model. When relational systems became popular in the mid 80s
>>> embedded SQL was used with C, but it was overtaken in the market by
>>> "call-level interfaces" that supplied DML statements as strings.
>>
>
> This movement was done for very good reasons; see below for more on 
> this topic.
>
>
Times have changed. For instance, in the 70s and 80s memory was so limited 
as to pose a real problem to compiler writers. Research was done on compact 
symbol table representations. Nowadays, we eschew cryptic encodings in 
favour of readable and maintainable code.

>
> The popularity of call-level interfaces (such as SQL/CLI, the most 
> important implementation of which is ODBC) has little or nothing to do 
> with the "inconvenience of preprocessing".  In fact, most vendors of 
> embedded SQL systems did (do!) not require a preprocessing step at 
> all, but compile the embedded programs directly into invocations of 
> --- guess what --- proprietary call-level interface operations.
>
> I can assure you that there were no "great political battles" in the 
> SQL standards community when the embedded Ada/SQL bindings were being 
> standardized.  In fact, the closest thing to such a battle was the 
> debate regarding the importance of even having such a binding.
>
I cannot comment on this response to Michael Kay's account of the 
history of SQL bindings.

> Call-level interfaces such as SQL/CLI improve the *dynamic* behavior 
> of applications.  It's not merely the ability to use a variety of 
> programming languages, nor the desire to avoid committing to a 
> specific database product, although both of these played a part.  
> Instead, it is the necessity of having the ability to dynamically 
> formulate database queries (including both retrieval and 
> modifications) in response to users' actions in some GUI tool.  This 
> *can* be done in embedded SQL through the use of the dynamic SQL 
> facilities (PREPARE and EXECUTE), but that struck too many application 
> implementors as a bit awkward compared to pure functional interfaces.  
> The bottom line, as you said, is that there are many benefits in late 
> binding.
>
We agree that building queries at runtime is useful (if not *the* main 
requirement).

But representing queries as strings is just not a high-level interface. 
Half of JDBC is filling in the prepared statements with values. If 
prepared statements notified the programmer of syntax or type 
errors, development times would shrink tremendously.

You are not going to tell me that building a fixed query, with parameters 
to be filled in at runtime, is far from reality?

>>>
>> With algebraic types, it would be possible to build up an abstract 
>> syntax tree, which would be passed to the driver before shipping to 
>> the database.
>>
>> In Pizza (or Haskell or ocaml or Scala) one would do like this:
>
>
> Gee, I'm sure glad that people continue to invent new languages faster 
> than I can learn how to spell their names ;^)
>
>
I am happy that people are free to ignore even the most simple ways to 
help programmers, so that language developers will never run out of work :)

Maybe on a more serious note I should remind you that Java has 
everything it takes to design an object-oriented interface worthy of 
this name.

>
> In fact, the designers of SQL/CLI were vividly aware of the problems 
> of "syntax errors popping up at runtime".  Nonetheless, the risks were 
> felt to be worth the benefits, if for no other reason than we expected 
> the SQL statements being submitted through such call-level interfaces 
> to be computer-generated, not hand-coded by some user sitting at a 
> screen.
>
But at some point, the SQL statement has to be written. Clearly, a nice 
way would be to build a library which just cannot fail to produce 
syntactically correct SQL code. But why is this library not part of the 
standard? Why is every user forced to write something like this on their 
own? Maybe with SQL there were problems concerning the choice of 
datatypes and vendor neutrality.

With XQuery, datatype choice and vendor neutrality cannot be much of a 
problem, because the types in XQuery are specified.
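
As an illustration of the kind of library meant above, here is a small sketch of a builder that can only emit one well-formed shape of SELECT statement. The class SelectQuery and its methods are hypothetical. The sketch accepts only plain identifiers, ignores reserved words and vendor dialects, and hands values over as parameters instead of splicing them into the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder, not part of JDBC or any standard library.
final class SelectQuery {
    private final String table;
    private final List<String> columns = new ArrayList<>();
    private final List<String> conditions = new ArrayList<>();
    private final List<Object> params = new ArrayList<>();

    private SelectQuery(String table) { this.table = ident(table); }

    static SelectQuery from(String table) { return new SelectQuery(table); }

    // Adds a column to the select list; with no columns, "*" is used.
    SelectQuery column(String name) {
        columns.add(ident(name));
        return this;
    }

    // Adds "column = ?" and remembers the value as a positional parameter.
    SelectQuery whereEquals(String column, Object value) {
        conditions.add(ident(column) + " = ?");
        params.add(value);
        return this;
    }

    // The only place where concrete SQL text is assembled.
    String toSql() {
        String cols = columns.isEmpty() ? "*" : String.join(", ", columns);
        StringBuilder sql = new StringBuilder("SELECT ")
            .append(cols).append(" FROM ").append(table);
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return sql.toString();
    }

    // Values in the same order as the "?" placeholders in toSql().
    List<Object> parameters() { return new ArrayList<>(params); }

    // Simplification: only plain identifiers; reserved words are not checked.
    private static String ident(String s) {
        if (s == null || !s.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("not a plain identifier: " + s);
        }
        return s;
    }
}
```

For example, SelectQuery.from("books").column("title").whereEquals("author", "Knuth").toSql() gives SELECT title FROM books WHERE author = ?, and the value travels separately via parameters(), ready to be bound to a prepared statement.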

> I don't understand your comment about "depriving them of a real 
> interface (like JDBC)" in relation to XQJ.  We (the WG defining XQJ) 
> are very consciously modeling relevant aspects of XQJ after JDBC.  The 
> fact that the name of this call-level interface doesn't start with the 
> letter "J" has nothing to do with the nature of the interface, but to 
> do with Sun's rules about the rules that JSRs must follow if they are 
> to be permitted to start with the word "Java".
>
Well you sure do not have to repeat the mistakes of JDBC. Just allow a 
way to pass in syntax trees to the XQJ driver. You will realize that 
driver implementors and users will quickly neglect your string-based 
"interface" and prefer the real one.

> I wonder, perhaps, if you are thinking in terms of SQLJ (a/k/a 
> SQL/OLB), which is a standard defining how to embed SQL statements in 
> Java programs.  However, XQJ has nothing to do with embedding XQuery 
> expressions into any programming language, not even Java.  Therefore, 
> I must conclude that XQJ *is* a "real interface (like JDBC)" with 
> respect to this discussion.
>
Yes, I am talking about embedding a domain-specific language in Java, 
but purely by means that are offered by Java. No syntax changes, no type 
system changes, just an alternative way to hand over queries.

Strings can never be *structured* queries in the Java language.

>> and that higher-level languages and tools can find more elegant 
>> solutions.
>>
>> <advertising href="http://scala.epfl.ch">Luckily with Scala there is 
>> at least one higher-level language that can interface with Java 
>> code.</advertising>
>>
>> But using only strings is clearly the worst one can do. If an 
>> application is dealing with XML, it can *at the very least* pass 
>> trees to the database interface. XQuery even has an XML syntax.
>
>
> It is true that there is an XML syntax for XQuery; it's known as 
> XQueryX.  The published requirements for XQJ consider it important to 
> process XQueryX syntax, too.  But I can assure you that it's pretty 
> unlikely that XQueryX will be passed as any sort of pre-parsed tree 
> structure; it will most likely be passed as serialized XML.
>
I am about to get lost here. Maybe this summary clears things up:

facts:
- XQuery (like most other domain-specific languages) has a context-free 
syntax.
- Doing anything with a string containing a query requires parsing.
- Parsing a query string yields an abstract syntax tree (AST).
- If, out of laziness, one does not want to agree on what the best 
AST for XQuery would be, one can take XQueryX.
- XML hackers are used to working with trees, be they XML or syntax tree 
objects.

opinions:
- The early draft on XQJ says strings are cool, syntactically incorrect 
strings will lead to runtime errors, and there is nothing better.
- Experience in compiler writing says: get away from concrete syntax as 
quickly as possible and work only on abstract syntax.
Or put another way: if you have the choice between AST and string, go 
for the AST.

You do not lose anything (since a convenience library method can parse 
strings and give ASTs).
The difficulty of the task is minor given that XQuery is a clean, 
declarative language.
Users and implementors gain the possibility of working with 
syntactically correct queries.
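
To illustrate the convenience method mentioned above, here is a sketch of a parser for a deliberately tiny path language (a variable followed by child steps, such as $doc/book/title). PathQuery and PathQueryParser are hypothetical names, and the grammar is far smaller than XQuery. The point is only that strings remain usable while everything after the parse works on a tree.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical AST node: a variable followed by child steps.
final class PathQuery {
    final String variable;
    final List<String> steps;

    PathQuery(String variable, List<String> steps) {
        this.variable = variable;
        this.steps = Collections.unmodifiableList(new ArrayList<>(steps));
    }
}

// Hypothetical recursive-descent parser for: '$' name ('/' name)*
final class PathQueryParser {
    private final String src;
    private int pos = 0;

    private PathQueryParser(String src) { this.src = src; }

    // Convenience entry point: the only place that touches concrete syntax.
    static PathQuery parse(String text) {
        PathQueryParser p = new PathQueryParser(text);
        p.expect('$');
        String variable = p.name();
        List<String> steps = new ArrayList<>();
        while (p.pos < p.src.length()) {
            p.expect('/');
            steps.add(p.name());
        }
        return new PathQuery(variable, steps);
    }

    // Consumes the given character or reports a syntax error.
    private void expect(char c) {
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw new IllegalArgumentException("expected '" + c + "' at offset " + pos);
        }
        pos++;
    }

    // Simplified names: one or more letters or digits.
    private String name() {
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected a name at offset " + pos);
        }
        return src.substring(start, pos);
    }
}
```

A caller who prefers strings writes PathQueryParser.parse("$doc/book/title") and gets a tree back. A malformed string such as "$doc//title" is rejected at that single point with an IllegalArgumentException, before anything is handed on to a driver.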

cheers,
Burak




 
