OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

 


 

   RE: Mechanics of mapping relational data into XML

[ Lists Home | Date Index | Thread Index ]
  • From: Ronald Bourret <rpbourret@hotmail.com>
  • To: xml-dev@lists.xml.org
  • Date: Wed, 12 Jul 2000 20:52:32 +0000 (CEST)

Joshua Allen wrote:

>IBM have an XML Extender for DB2 (free download on alphaworks or
>developerworks),
>and MS SQL Server allows selecting to XML using a specified schema or
>automatic
>(e.g. SELECT * FROM authors FOR XML AUTO).  I am sure other vendors can do
>this now, too.  Besides persisting the recordset object to XML (as in ADO
>mentioned
>below) and querying to get XML back, there are plenty of other ways to get
>XML
>from a relational database.  I can think of 5 or 10 at least..

At least. Because Jerry only wants to get data from the database to XML (and 
not the other direction), he has a lot of choices. Most of the 
"template-driven" middleware listed in "XML Database Products" [1] can do 
this, although to get the "grouping" ("nesting" in XML terms) he wants, he 
will probably have to code the joins himself. (Template-driven middleware 
generally processes an XML document with embedded SQL statements, which are 
replaced in the output document by their results.)

The alternative to template-driven software is "model-driven" software, in 
which the software is based on modeling the XML document in some manner and 
then mapping that model to the database. The advantage of model-driven 
software is that it can also transfer data from the XML document to the 
database. It also often saves the user from having to write the SQL 
statements to do the joins as well.

(Transferring data from the XML document to the database is generally not 
supported by template-driven software because of the complexity involved -- 
a template allows the user to place data from the database almost anywhere 
in the resulting XML document, and reversing this action is non-trivial, to 
say the least.)

Note that some model-driven software models the XML document as a set of 
tables:

  <tables>
    <table>
      <row>
        <column>...</column>
        etc.
      </row>
      etc.
    </table>
    etc.
  <tables>

Clearly, this is not what Jerry needs, since it does not perform the nesting 
he needs. Instead, he needs what I call a "tree-of-objects" model, in which 
the XML document is modelled as data-specific objects. In Jerry's case, 
these objects would be:

     Product
   /    |     \
Comp.  Comp.  Comp. etc.

These are then mapped to the database with an object-relational mapping.

Note that two things about Jerry's relational schema (see below) are likely 
to pose problems for model-driven software, including ADO(?). First, the 
data for the Product object is non-normal: it comes from two different 
tables. In a traditional object-relational mapping, a single object has all 
its property values in a single table. It is not clear to me that any of the 
model-driven products on the market will be able to do this in the XML => 
database direction and many (most?) won't be able to do this in the database 
=> XML direction.

The second problem is that table z contains columns called 
component_attribute and component_value. The problem here is that the value 
of the component_attribute column is being used to name elements/attributes 
in the XML document -- again something that model-driven software probably 
can't do but (some) template-driven software can do.

Jerry's database schema
=======================
Table x: one row for each product
  product-id  |  product_name  |  product_manufacturer

Table y: one row for each product
  product-id  |  product_type  |  product_class

Table z: multiple rows for each product
  product-id  |  product_component |  component_attribute | component_value

Desired XML
===========
  <product
        product_id="456"
        product_name="widget2"
        product_manufacturer="manufacturerY"
        product_type="type2"
        product_class="classY">
     <product_components>
        <cover material=plastic color="blue" weight_oz="4"/>
        <base material=metal color="gray" weight_oz="7" />
     </product_components>
  </product>

>In general, it is still quite fast to let the relational
>engine do all of your work, then convert to XML after the fact.

Agreed. It's also a lot easier from the perspective of how much work you 
have to do.

>Another thing to note about the technique Kar Yan mentioned
>below is that you can be fairly database-agnostic.  All of the
>major databases out there seem to have different syntax for
>returning XML directly, but ADO always returns a recordset, no
>matter what database you use.  If you convert from ADO to
>recordset, you can be programming-language agnostic and
>database agnostic.  (If you need your code to run on platforms
>other than Windows and can use pure Java, I am sure that
>Java recordsets now have [or coming soon] ability to persist
>to XML generically).

Note that database-independence is true for virtually all of the middleware 
listed in the product list -- these are almost all based on JDBC, ODBC, or 
OLE DB. Only the database-specific products (such as the DB2 XML Extender or 
SQL Server) don't have this capability.

-- Ron Bourret

[1] 
http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/XMLDatabaseProds.htm
________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com





 

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS