
 



Re: [xml-dev] storing XML files



Chris Parkerson wrote:

> ... and being able to
> do fast queries across large document sets is NOT a requirement (i.e.
> you've got 1000+ stock quote documents, but you do not need to query
> over them as an aggregate), then the XML capabilities of the RDBMS
> vendors should be sufficient.

I don't understand this statement. Why does the number of documents
matter? If each stock quote document holds a single quote -- that is, a
single row of data -- then a relational query should be very fast. As
Soumitra Sengupta pointed out in his reply, I think the issues are
nesting depth and how semi-structured the data is, not size.
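
To make the single-row case concrete, here is a minimal sketch. It
assumes a hypothetical flat <quote> document format and uses Python's
built-in sqlite3 module as a stand-in for any relational database. The
element names, table name, and sample values are invented for
illustration and do not come from any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical flat stock quote documents: one quote per document.
DOCS = [
    "<quote><symbol>IBM</symbol><price>101.50</price></quote>",
    "<quote><symbol>ORCL</symbol><price>15.25</price></quote>",
    "<quote><symbol>SUNW</symbol><price>9.75</price></quote>",
]


def load_quotes(conn, docs):
    """Store each flat quote document as exactly one row."""
    conn.execute("CREATE TABLE quotes (symbol TEXT PRIMARY KEY, price REAL)")
    for doc in docs:
        root = ET.fromstring(doc)
        conn.execute(
            "INSERT INTO quotes (symbol, price) VALUES (?, ?)",
            (root.findtext("symbol"), float(root.findtext("price"))),
        )


def price_of(conn, symbol):
    """Look up one quote by symbol; returns None if there is no such row."""
    row = conn.execute(
        "SELECT price FROM quotes WHERE symbol = ?", (symbol,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
load_quotes(conn, DOCS)
print(price_of(conn, "ORCL"))
```

Because each document becomes one row keyed on the symbol, the lookup is
an ordinary indexed query whether the table holds three rows or many
thousands. Deeply nested or irregular documents are the harder case.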

> Oracle
> 9i's XML support is really limited with their new vaunted "XMLType"
> column data type being nothing more than a convenience wrapper around
> the CLOB type: XML still gets stored as a blob.

This is only part of Oracle 9i's support. They have two other ways to
map XML to the database. The first uses SQL3 object views to perform the
obvious object-relational mapping between XML documents and the
database. The second is the Internet File System (iFS), which uses
mapping files.
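
As a rough illustration of what an object-relational mapping of XML
looks like, the sketch below maps a nested document onto a parent table
and a child table and then rebuilds the document from the rows. This is
not Oracle's object view syntax or the iFS mapping format; it is a
generic, hand-written mapping using Python's sqlite3 module, and the
<order>/<item> vocabulary and table layout are assumptions made up for
the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical nested document: an order containing line items.
ORDER_XML = """
<order id="1001" customer="ACME">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="5"/>
</order>
"""


def store_order(conn, xml_text):
    """Map <order> to a parent row and each <item> to a child row."""
    order = ET.fromstring(xml_text)
    order_id = int(order.get("id"))
    conn.execute(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        (order_id, order.get("customer")),
    )
    conn.executemany(
        "INSERT INTO items (order_id, sku, qty) VALUES (?, ?, ?)",
        [
            (order_id, item.get("sku"), int(item.get("qty")))
            for item in order.findall("item")
        ],
    )


def load_order(conn, order_id):
    """Rebuild the XML document from the parent and child rows."""
    customer = conn.execute(
        "SELECT customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()[0]
    order = ET.Element("order", id=str(order_id), customer=customer)
    rows = conn.execute(
        "SELECT sku, qty FROM items WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    )
    for sku, qty in rows:
        ET.SubElement(order, "item", sku=sku, qty=str(qty))
    return ET.tostring(order, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute(
    "CREATE TABLE items (order_id INTEGER REFERENCES orders(id),"
    " sku TEXT, qty INTEGER)"
)
store_order(conn, ORDER_XML)
print(load_order(conn, 1001))
```

The round trip keeps the data but not the exact text of the document:
whitespace, comments, and similar lexical details are lost, which is
typical of this style of mapping.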

>  Keep in mind, however,
> that each RDBMS vendors approach to handling XML is going to be
> proprietary: you will have little to no code [portability]

Let's take a closer look at this. An XML/database application usually
consists of a number of parts:

1) APIs. These are proprietary in RDBMSs, but in native XML
databases as well. That is, there is currently no way to write an XML
application that is portable with respect to database access. This is
because there is no widely supported standard API. (There is a good
start in this direction -- the XML:DB API -- but it is not yet widely
enough implemented by database vendors to make truly portable
applications a reality.)

In fact, the only current way to write an XML application that is
portable across databases is to use object-relational middleware against
a relational database. This is because a number of middleware vendors
use ODBC, OLE DB, or JDBC for database access. So while you will be
locked into a single vendor's API, the application will be portable
across databases.

2) Query language. This is also not standardized across databases. While
the most popular query language is probably XPath (usually with
extensions for multi-document queries), numerous other query languages
-- all of them proprietary -- are supported as well. This is true of
XML-enabled relational databases as well as native XML databases.

There is good reason for this, as XPath is not rich enough to perform
many of the queries needed by users and XQuery is not yet finished. I
suspect that when XQuery is done, you will see many implementations of
it.

3) Update language. Where these are supported, they are all
non-standard. This is because there is no standard XML update language
in existence. (Again, there is an attempt to standardize this with the
XUpdate language, but there are not enough implementations to make it a
reality.)

4) DOM, SAX, XSLT, namespaces, etc. These are standard across all XML
database products that I have seen -- native XML and XML-enabled.
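
To illustrate the limitation mentioned under point 2, here is a small
sketch using the XPath subset built into Python's
xml.etree.ElementTree. The <quotes> document is invented for the
example. Selecting nodes is a natural fit for a path expression, but
XPath 1.0 has no constructs for grouping or sorting results, so that
part has to be written in the host language (or in a vendor's
proprietary query language).

```python
import xml.etree.ElementTree as ET

# Hypothetical document holding several quotes per symbol.
DOC = ET.fromstring("""
<quotes>
  <quote symbol="IBM" price="101.50"/>
  <quote symbol="ORCL" price="15.25"/>
  <quote symbol="IBM" price="102.00"/>
</quotes>
""")

# Selection: a path expression with an attribute predicate is enough.
ibm_quotes = DOC.findall("./quote[@symbol='IBM']")


def highest_price_by_symbol(root):
    """Group quotes by symbol and keep the highest price for each.

    The grouping and the sorting are done in Python, not in the path
    expression, which only selects the <quote> elements.
    """
    best = {}
    for quote in root.findall("./quote"):
        symbol = quote.get("symbol")
        price = float(quote.get("price"))
        if symbol not in best or price > best[symbol]:
            best[symbol] = price
    return sorted(best.items())


print(len(ibm_quotes))
print(highest_price_by_symbol(DOC))
```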

I therefore think it's fair to say that you'll have roughly the same
amount of code portability with XML-enabled relational databases that
you'll have with native XML databases.
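
The middleware argument from point 1 can be sketched as follows. Python
has no ODBC or JDBC, so this uses Python's DB-API 2.0 as an analogous
vendor-neutral access layer; that substitution is mine. The function
name rows_to_xml and its parameters are hypothetical. It touches the
database only through cursor(), execute(), description, and fetchall(),
so in principle any DB-API 2.0 driver could supply the connection;
sqlite3 is used here only because it ships with Python.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, row_tag="row", root_tag="rows"):
    """Run a query on any DB-API 2.0 connection and wrap the result in XML.

    Assumes the column names returned by the query are valid XML element
    names; a real middleware product would have to handle the other cases.
    """
    cur = conn.cursor()
    cur.execute(query)
    columns = [col[0] for col in cur.description]
    root = ET.Element(root_tag)
    for record in cur.fetchall():
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Only this setup is specific to one database; rows_to_xml is not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (symbol TEXT, price REAL)")
conn.execute("INSERT INTO quotes VALUES ('IBM', 101.5)")
print(rows_to_xml(conn, "SELECT symbol, price FROM quotes"))
```

The application is tied to the rows_to_xml interface (the "single
vendor's API" in the terms used above) but not to the database behind
it. SQL dialect differences in the query string itself are a separate
portability problem that this sketch does not address.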

> or knowledge portability.

Again, this is about the same for native XML databases and XML-enabled
relational databases. Both types of databases are based on fairly
consistent models (object-relational mappings for XML-enabled relational
databases and XML document structure for native XML databases) but the
actual implementations are different for every product.

In short, moving code from one XML-enabled relational database to
another means learning new mapping syntax, a new API, and possibly a new
query language. Moving code from one native XML database to another
means learning a new API and possibly a new query language.

> A major advantage of native XML DB systems (save for Software AG's
> Tamino which still uses a proprietary query language and schema dialect,
> among other non-standard things) is their adherence to standards.
> Queries are XPath, transformations are XSLT, granular access to document
> data is via the standard DOM API, validation can be against DTD or W3C
> XML Schema, etc.

This is a rather inflammatory statement, as it seems to imply Tamino and
XML-enabled relational databases do not adhere to standards. With the
exception of query languages, most XML database products that I have
seen (native XML, XML-enabled databases, middleware) *do* adhere to
standards, Tamino included. (With respect to the non-standard schema
language in Tamino, Tamino was released *long* before XML Schemas were
completed, so it hardly seems fair to complain. A year from now, yes.
Today, no.)

>  You gain a bit more in terms of code portability as
> well as knowledge portability between different vendors.

I think the operative term is "a bit more".

>  You also gain
> a database system inherently designed to deal with the extensibility of
> XML data

True.

> and capable of removing the scalability, performance, ease of
> data update, and cross-document limitations of the RDBMS approach.

Actually, this depends a lot on the application. In some cases, this is
definitely true. In others, it is not.

> The reason most XML applications are converting to native
> XML databases is that they've tried to make their applications work
> with RDBMS systems and they failed

Is there any hard data to back this up? Numbers of customers, number of
transactions per day, etc.? While there are certainly applications that
can be built with native XML databases that can't be built with
XML-enabled relational databases, the reverse is also true, so I have a hard
time with the word "most".

-- Ron