- To: Jim Melton <jim.melton@acm.org>
- Subject: Re: [xml-dev] Early Draft Review: XQuery for Java (JSR 225)
- From: Burak Emir <Burak.Emir@epfl.ch>
- Date: Mon, 14 Jun 2004 19:05:19 +0200
- Cc: xml-dev@lists.xml.org, jsr-225-comments@jcp.org
- In-reply-to: <6.0.0.22.2.20040608162857.05bbf708@gmstimap.oraclecorp.com>
- References: <20040527220043.IFPQ20971.mta09-svc.ntlworld.com@Turtle> <40B6F9F1.1080104@epfl.ch> <6.0.0.22.2.20040608162857.05bbf708@gmstimap.oraclecorp.com>
- User-agent: Mozilla Thunderbird 0.5 (X11/20040208)
Jim,
Sorry for the late post. Things can go unnoticed pretty easily in my
xml-dev folder ...
Jim Melton wrote:
>>>> I just cannot believe that one can seriously want to repeat
>>>> the IMHO biggest design error of JDBC, namely to pass *strings* to
>>>> the library.
>>>>
>>>> This effectively kills all possibilities of static (compile-time)
>>>> verification of queries, like syntax checking (let alone types).
>
> Well, that is the result of very serious discussions and technical
> considerations. It is obviously not perfect, but it is a compromise
> that has worked
Oh I do not doubt that, but since these technical considerations are
limiting choices, their design value is questionable.
> very well for much of the community. As you may be aware, SQLJ
> provided an SQL-embedded-in-Java technology that provided a great deal
> more static query verification. However, that approach does not
> address all of the requirements of real applications, many of which
> need to generate queries on the fly in response to unpredictable user
> instructions.
>
Tell me about it. I must admit I have never seen SQLJ, but it is not
hard to improve on JDBC. It just takes the most basic standard
technique from compiler construction: fixing an abstract syntax
tree representation for queries.
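To make this concrete, here is a minimal sketch in plain Java of what
such a representation could look like. All class names (Expr, Var, Str,
Doc, Child, Eq, For, AstDemo) are invented for illustration and are not
part of JDBC, SQLJ or the XQJ draft, and the subset covered is tiny.

```java
// Hypothetical AST for a tiny XQuery-like subset; none of these names exist in JDBC or XQJ.
interface Expr {
    String render(); // concrete syntax, produced only at the driver boundary
}

final class Var implements Expr {
    final String name;
    Var(String name) { this.name = name; }
    public String render() { return "$" + name; }
}

final class Str implements Expr {
    final String value;
    Str(String value) { this.value = value; }
    // Doubles embedded quotes; a complete renderer would also have to escape '&'.
    public String render() { return "\"" + value.replace("\"", "\"\"") + "\""; }
}

final class Doc implements Expr {
    final Str uri;
    Doc(Str uri) { this.uri = uri; }
    public String render() { return "doc(" + uri.render() + ")"; }
}

final class Child implements Expr {
    final Expr base;
    final String name;
    Child(Expr base, String name) { this.base = base; this.name = name; }
    public String render() { return base.render() + "/" + name; }
}

final class Eq implements Expr {
    final Expr left, right;
    Eq(Expr left, Expr right) { this.left = left; this.right = right; }
    public String render() { return left.render() + " = " + right.render(); }
}

final class For implements Expr {
    final Var variable;
    final Expr source, condition, result;
    For(Var variable, Expr source, Expr condition, Expr result) {
        this.variable = variable; this.source = source;
        this.condition = condition; this.result = result;
    }
    public String render() {
        return "for " + variable.render() + " in " + source.render()
             + " where " + condition.render() + " return " + result.render();
    }
}

class AstDemo {
    // The title comes from the user at runtime; the query shape is fixed by the tree.
    static Expr booksByTitle(String title) {
        Var b = new Var("b");
        return new For(b,
            new Child(new Child(new Doc(new Str("bib.xml")), "bib"), "book"),
            new Eq(new Child(b, "title"), new Str(title)),
            new Child(b, "author"));
    }

    public static void main(String[] args) {
        System.out.println(booksByTitle("TCP/IP").render());
    }
}
```

The query is still assembled at runtime, but an unbalanced quote or a
missing "return" simply cannot be written down, because the constructors
do not allow it.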
>>
>>> There were many systems that used that kind of language binding in
>>> the 1970s and 1980s, typically with COBOL and PL/I: the Codasyl DML
>>> was based entirely on this model. When relational systems became
>>> popular in the mid 80s embedded SQL was used with C, but it was
>>> overtaken in the market by "call-level interfaces" that supplied DML
>>> statements as strings.
>>
>
> This movement was done for very good reasons; see below for more on
> this topic.
>
>
Times have changed. For instance, in the 70s and 80s memory was so limited as
to pose a real problem to compiler writers. Research was done on compact
symbol table representations. Nowadays, we eschew cryptic encodings in
favour of readable and maintainable code.
>
> The popularity of call-level interfaces (such as SQL/CLI, the most
> important implementation of which is ODBC) has little or nothing to do
> with the "inconvenience of preprocessing". In fact, most vendors of
> embedded SQL systems did (do!) not require a preprocessing step at
> all, but compile the embedded programs directly into invocations of
> --- guess what --- proprietary call-level interface operations.
>
> I can assure you that there were no "great political battles" in the
> SQL standards community when the embedded Ada/SQL bindings were being
> standardized. In fact, the closest thing to such a battle was the
> debate regarding the importance of even having such a binding.
>
I cannot comment on this response to Michael Kay's account of the
history of SQL bindings.
> Call-level interfaces such as SQL/CLI improve the *dynamic* behavior
> of applications. It's not merely the ability to use a variety of
> programming languages, nor the desire to avoid committing to a
> specific database product, although both of these played a part.
> Instead, it is the necessity of having the ability to dynamically
> formulate database queries (including both retrieval and
> modifications) in response to users' actions in some GUI tool. This
> *can* be done in embedded SQL through the use of the dynamic SQL
> facilities (PREPARE and EXECUTE), but that struck too many application
> implementors as a bit awkward compared to pure functional interfaces.
> The bottom line, as you said, is that there are many benefits in late
> binding.
>
We agree that building queries at runtime is useful (if not *the* main
requirement).
But representing queries as strings is just not a high-level interface.
Half of JDBC is filling in the prepared statements with values. If
prepared statements notified the programmer of syntax or type
errors at compile time, development times would shrink tremendously.
Surely you are not telling me that a fixed query with parameters to
be filled in at runtime is far from reality?
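As a rough sketch of the kind of checking I have in mind: if each
parameter carries its type, the Java compiler can reject a wrong
binding before the program ever runs. The names Param, BoundQuery and
ParamDemo are hypothetical, the query text is only a stand-in for a
real syntax tree, and the sketch assumes a compiler with generics.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical typed placeholder; the type parameter records what may be bound to it.
final class Param<T> {
    final String name;
    Param(String name) { this.name = name; }
}

// A fixed query whose holes are filled in at runtime.
final class BoundQuery {
    private final String text; // stands in for a real AST in this sketch
    private final Map<String, Object> values = new HashMap<>();

    BoundQuery(String text) { this.text = text; }

    // Param<T> is invariant, so the value must match the declared type exactly.
    <T> BoundQuery bind(Param<T> param, T value) {
        values.put(param.name, value);
        return this;
    }

    Object valueOf(Param<?> param) { return values.get(param.name); }

    String text() { return text; }
}

class ParamDemo {
    static final Param<Integer> MIN_PRICE = new Param<>("minPrice");
    static final Param<String> AUTHOR = new Param<>("author");

    public static void main(String[] args) {
        BoundQuery q = new BoundQuery(
                "for $b in //book where $b/price >= $minPrice and $b/author = $author return $b")
            .bind(MIN_PRICE, 30)
            .bind(AUTHOR, "Stevens");
        // q.bind(MIN_PRICE, "cheap"); // does not compile: a String is not an Integer
        System.out.println(q.valueOf(MIN_PRICE) + " " + q.valueOf(AUTHOR));
    }
}
```

Compare this with setting parameters by position on a JDBC prepared
statement, where a wrong index or a wrong type only shows up when the
statement is executed.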
>>>
>> With algebraic types, it would be possible to build up an abstract
>> syntax tree, which would be passed to the driver before shipping to
>> the database.
>>
>> In Pizza (or Haskell or ocaml or Scala) one would do like this:
>
>
> Gee, I'm sure glad that people continue to invent new languages faster
> than I can learn how to spell their names ;^)
>
>
I am happy that people are free to ignore even the most simple ways to
help programmers, so that language developers will never run out of work :)
On a more serious note, maybe I should remind you that Java has
everything it takes to design an object-oriented interface worthy of
the name.
>
> In fact, the designers of SQL/CLI were vividly aware of the problems
> of "syntax errors popping up at runtime". Nonetheless, the risks were
> felt to be worth the benefits, if for no other reason than we expected
> the SQL statements being submitted through such call-level interfaces
> to be computer-generated, not hand-coded by some user sitting at a
> screen.
>
But at some point, the SQL statement has to be written. It is clear that
a nice way would be to build a library which just cannot fail to produce
syntactically correct SQL code. But why is this library not part of the
standard? Why is every user forced to write something like this on his
own? Maybe with SQL, there were problems concerning the choice of
datatypes or vendor-neutrality.
With XQuery, neither the choice of datatypes nor vendor-neutrality can
be much of a problem, because the types in XQuery are specified.
> I don't understand your comment about "depriving them of a real
> interface (like JDBC)" in relation to XQJ. We (the WG defining XQJ)
> are very consciously modeling relevant aspects of XQJ after JDBC. The
> fact that the name of this call-level interface doesn't start with the
> letter "J" has nothing to do with the nature of the interface, but to
> do with Sun's rules about the rules that JSRs must follow if they are
> to be permitted to start with the word "Java".
>
Well, you certainly do not have to repeat the mistakes of JDBC. Just
allow a way to pass syntax trees in to the XQJ driver. You will find that
driver implementors and users will quickly abandon your string-based
"interface" and prefer the real one.
> I wonder, perhaps, if you are thinking in terms of SQLJ (a/k/a
> SQL/OLB), which is a standard defining how to embed SQL statements in
> Java programs. However, XQJ has nothing to do with embedding XQuery
> expressions into any programming language, not even Java. Therefore,
> I must conclude that XQJ *is* a "real interface (like JDBC)" with
> respect to this discussion.
>
Yes, I am talking about embedding a domain-specific language in Java,
but purely by means that are offered by Java. No syntax changes, no type
system changes, just an alternative way to hand over queries.
Strings can never be *structured* queries in the Java language.
>> and that higher-level languages and tools can find more elegant
>> solutions.
>>
>> <advertising href="http://scala.epfl.ch">Luckily with Scala there is
>> at least one higher-level language that can interface with Java
>> code.</advertising>
>>
>> But using only strings is clearly the worst one can do. If an
>> application is dealing with XML, it can *at the very least* pass
>> trees to the database interface. XQuery even has an XML syntax.
>
>
> It is true that there is an XML syntax for XQuery; it's known as
> XQueryX. The published requirements for XQJ consider it important to
> process XQueryX syntax, too. But I can assure you that it's pretty
> unlikely that XQueryX will be passed as any sort of pre-parsed tree
> structure; it will most likely be passed as serialized XML.
>
I am about to get lost here. Maybe this summary clears things up:
facts:
- XQuery (like most other domain-specific languages) has a context-free syntax
- doing anything with a string containing a query requires parsing
- parsing a query string yields an abstract syntax tree (AST).
- if, out of laziness, one does not want to agree on what would be the best
AST for XQuery, one can take XQueryX
- XML hackers are used to working with trees, be they XML or syntax tree
objects.
opinions:
- The early draft on XQJ says strings are cool, syntactically incorrect
strings will lead to runtime errors, there is nothing better.
- Experience in compiler writing says: get away from concrete syntax as
quickly as possible and work only on abstract syntax.
Or, put another way: if you have the choice between an AST and a string,
go for the AST.
You do not lose anything, since a convenience library method can parse
strings and give ASTs.
The difficulty of the task is minor, given that XQuery is a clean,
declarative language.
Users and implementors gain the possibility of working with
syntactically correct queries.
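For what it is worth, such a convenience method is not much work. The
following is only a sketch for a toy grammar of absolute paths like
/bib/book/title, far smaller than XQuery; PathExpr and PathParser are
made-up names, not anything from the XQJ draft.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical AST for an absolute path such as /bib/book/title.
final class PathExpr {
    final List<String> steps;
    PathExpr(List<String> steps) { this.steps = Collections.unmodifiableList(steps); }

    String render() {
        StringBuilder sb = new StringBuilder();
        for (String step : steps) sb.append('/').append(step);
        return sb.toString();
    }
}

final class PathParser {
    // Toy grammar: path ::= ('/' name)+ ; name ::= (letter | digit)+
    static PathExpr parse(String src) {
        if (src.isEmpty()) throw new IllegalArgumentException("empty path");
        List<String> steps = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            if (src.charAt(i) != '/')
                throw new IllegalArgumentException("expected '/' at position " + i);
            i++;
            int start = i;
            while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
            if (i == start)
                throw new IllegalArgumentException("expected a name at position " + i);
            steps.add(src.substring(start, i));
        }
        return new PathExpr(steps);
    }

    public static void main(String[] args) {
        PathExpr p = parse("/bib/book/title");
        System.out.println(p.steps.size() + " steps: " + p.render());
    }
}
```

Whoever likes strings calls parse once, at the edge, and gets the
syntax error there; everything behind that call works on the tree.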
cheers,
Burak