At 08:29 PM 2/16/2003 -0500, Mike Champion wrote:
>Adam Bosworth says:
>"It is the thesis of this series of articles that [DOM/JDOM,
>SAX/databinding, and XSLT] are only rarely suitable for either waypoint
>[lossless transformation] or endpoint [lossy data extraction] processing."
>- Anyone want to speculate on why one might think that XQuery will be
>vastly better than these other technologies for XML transformations, lossy
>or otherwise? Technically, they are all more or less equivalent (assuming
>one has an XPath library to find patterns in the XML).
I guess another way of phrasing your question would be: Is a language
specifically designed for processing XML going to be better for that task
than languages designed for other purposes? XQuery was designed for
processing XML from day one.
XQuery is all about locating things in XML with path expressions,
constructing arbitrary XML with a syntax that looks just like XML, and
transforming XML to XML using FLWOR and other expressions. The things that
matter in XML matter in XQuery.
Lisp was designed for list processing, FORTRAN was designed for crunching
numbers, Perl was designed for munging text, SQL was designed for
transformations on tables, Java and C++ were designed for managing objects.
These are all Turing complete languages, but they are each best for the
tasks they were designed for. I once wrote a multitasking kernel in FORTRAN
because I had no choice, but it wasn't fun, and it was not a productive
language for the task.
XML isn't objects, and objects are not really a good way to represent an
XML document. But neither is XML just text - elements define nested
structures that need to be managed gracefully. Of course, all these
languages are Turing complete, and you can use any of them to do what the
other languages do, but it isn't efficient, and it doesn't feel good. DOM,
JDOM, and SAX are all libraries used in languages that were not based on XML.
One way to get a feel for this is to work through the XML Query Use Cases
 with Java and the API of your choice, and compare the results to the
I also can't resist illustrating this with an example taken from one of
Sean's emails:
> What is the easiest way to divide a stock price by revenue minus
> expenses? Obviously it's something like this:
> Stock = LoadStockFromXML("stock.xml")
> return Stock.price / (Stock.revenues - Stock.expenses)
> Any attempt at doing that in DOM/SAX/XSLT/XQuery is always going to
> come a poor second compared to the native language expression of the
Let me show how I would express this in XQuery. I will assume we want to
compute this for the stock of the "ASDF" company:
let $stock := document("stock.xml")//stock[ticker="ASDF"]
return $stock/price div ($stock/revenues - $stock/expenses)
In XQuery, there is no need to write code to parse the XML - that happens
implicitly. There is also no need to retrieve the bits of data by walking
around a tree or intercepting the events, then performing the right casts
to stuff the data into the right variables in the programming language.
XQuery understands XML - in fact, it doesn't really understand anything
else. This query will work equally well for transient XML stored in a
buffer, an XML file stored on disk, persistent XML stored in a database, or
an XML view of some non-XML resource. You just process things as XML,
instead of taking your XML and changing it into something your programming
language knows how to deal with. There's no need for object libraries to
manage the XML.
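For comparison, here is a rough sketch of the same calculation in Java with the DOM API. The element names (stock, ticker, price, revenues, expenses) come from the query above; the surrounding document layout, the class name StockRatioDom, and the helper childText are assumptions made only for illustration. The explicit parse, the tree walk, and the string-to-number conversions are the steps the XQuery version leaves implicit.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name; a sketch, not a reference implementation.
class StockRatioDom {

    // Text of the first descendant element with the given name.
    // Assumes the element exists (no error handling for missing data).
    static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent().trim();
    }

    // price / (revenues - expenses) for the stock with the given ticker.
    static double ratio(String xml, String ticker) {
        Document doc;
        try {
            // Step 1: parse the XML explicitly.
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }

        // Step 2: walk the tree to find the matching <stock> element.
        NodeList stocks = doc.getElementsByTagName("stock");
        for (int i = 0; i < stocks.getLength(); i++) {
            Element stock = (Element) stocks.item(i);
            if (ticker.equals(childText(stock, "ticker"))) {
                // Step 3: convert element text to numbers by hand.
                double price = Double.parseDouble(childText(stock, "price"));
                double revenues = Double.parseDouble(childText(stock, "revenues"));
                double expenses = Double.parseDouble(childText(stock, "expenses"));
                return price / (revenues - expenses);
            }
        }
        throw new IllegalArgumentException("no stock with ticker " + ticker);
    }

    public static void main(String[] args) {
        // Assumed sample data; the real stock.xml is not shown in this thread.
        String xml =
            "<stocks><stock><ticker>ASDF</ticker><price>30</price>"
            + "<revenues>100</revenues><expenses>40</expenses></stock></stocks>";
        System.out.println(ratio(xml, "ASDF")); // prints 0.5
    }
}
```

Nothing here is difficult, but most of the lines deal with getting data out of the XML and into Java variables rather than with the calculation itself.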
And if the structure of your XML changes, you don't have to change the code
that walks a tree or processes events, you just change a path expression
here or there.
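To make the point about path expressions concrete, here is a small sketch in Java using the standard javax.xml.xpath API. That API evaluates XPath 1.0, not XQuery, so it only approximates the query above. Both sample documents, the class name StockRatioXPath, and the stockPath parameter are assumptions for illustration. When the layout changes, only the path string passed in changes; the code stays the same.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch, not a reference implementation.
class StockRatioXPath {

    // stockPath selects the <stock> element; the arithmetic is evaluated
    // relative to that element.
    static double ratio(String xml, String stockPath) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Node stock = (Node) xpath.evaluate(
                stockPath, new InputSource(new StringReader(xml)),
                XPathConstants.NODE);
            if (stock == null) {
                throw new IllegalArgumentException("no match for " + stockPath);
            }
            return (Double) xpath.evaluate(
                "price div (revenues - expenses)", stock, XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad path expression", e);
        }
    }

    public static void main(String[] args) {
        // Assumed original layout: the ticker is a child element.
        String before =
            "<stocks><stock><ticker>ASDF</ticker><price>30</price>"
            + "<revenues>100</revenues><expenses>40</expenses></stock></stocks>";
        // Assumed changed layout: deeper nesting, ticker is now an attribute.
        String after =
            "<market><listing><stock ticker='ASDF'><price>30</price>"
            + "<revenues>100</revenues><expenses>40</expenses></stock>"
            + "</listing></market>";

        // Only the path expression differs between the two calls.
        System.out.println(ratio(before, "//stock[ticker='ASDF']"));
        System.out.println(ratio(after, "/market/listing/stock[@ticker='ASDF']"));
    }
}
```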
XSLT is very close to XQuery in many ways, but it is more verbose, less
readable, and relies heavily on template processing, which is not a
familiar approach for many programmers. I don't think it is likely to
become a mainstream programming language. For most people, XQuery is just a
lot easier to read and write. And it certainly is easier to optimize, make
typesafe, or use as the basis for XML views on non-XML systems.