OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help



   RE: Mechanics of mapping relational data into XML

[ Lists Home | Date Index | Thread Index ]
  • From: Charles Reitzel <creitzel@mediaone.net>
  • To: xml-dev@lists.xml.org
  • Date: Fri, 14 Jul 2000 09:05:25 -0400 (EDT)

On Wed, 12 Jul 2000 23:03:17 -0700, Michael Rys wrote:
>Please note that SQLServer 2000 (and the SQLXML tech preview)
>infer nesting information from the SQL query based on the 
>join/Cartesian product in the FOR XML auto mode mentioned 
>below). One limitation of this (and the approach mentioned in 
>the WebTechnique article) is that you only get single type
>hierarchies (i.e. all children are of the same type, e.g. 
>Order contains Customer) and there is no way to identify 
>sibling instance data of a different type solely based on 
>the resulting rowset of a simple join query.

If I understand you right, what you are saying is that it is difficult to
infer the E-R model from the text of a SQL query.  This is absolutely true
and is a side-effect of the SQL declarative model.  Typically, however, the
code generating XML from query results may use additional meta-data to drive
the process.  

All decent RDBMSes have primary and foreign key information in the data
dictionary.  Decent ODBC and JDBC drivers will expose this information.
Analysis of these data will yield the name of the "detail" table, i.e. the
table which will govern the number of rows returned from the RDBMS.  If SQL
meta-data is not available, perhaps the XML schema or DTD content model
could be put to similar use.  

Presumably, there is a mapping of SQL table and column names to XML element
and/or attribute names. SQL syntax has some flexibility here.  Ex:

SELECT Order.ORD_ID          AS OrderId, 
       Order.ORD_DT          AS OrderDate,
       Item.ORD_ITEM_NUM     AS ItemNumber, 
       Item.ORD_ITEM_PROD_ID AS ItemProductId
       Customer.CUST_NAM     AS CustomerName
       Customer.CUST_SHIP    AS ShippingAddress
 WHERE Order.ORD_ID = Item.ORD_ID     /* Join condition */
   AND Order.ORD_CUST_NUM = Customer.CUST_NUM  /* "" */
   AND Order.ORD_DT > '2000-04-01'

Using ODBC and, I believe, JDBC, only the internal (all caps) names will be
visible to the application.  Simple queries like this are not difficult to
parse.  Complex queries may not be mappable to XML at all.

So, in a nutshell the question is, should the Customer element be included
in the result set.  I think you answer this question  by first, according to
the foreign key relationships, determining that the E-R hierarchy goes from
Customer to Order to Item.  Second, look at the ORDER BY clause.  Since this
query is grouping orders for each customer together, it would make sense for
the XML document to look something like this:

    <CustomerName>Joe Bob</CustomerName>
    <ShippingAddress>Black Hills, South Dakota

If the CUSTOMER tables was *not* included in the ORDER BY clause, then there
are several options: 1) make the <Customer> element a child of each order,
2) simply fold the CUSTOMER columns into the <Order> element or 3) include a
section of <Customer> elements at the top of the result set and include
links from <Order> elements to the <Customer> elements.

This doesn't look like a one-size-fits-all issue.  This type of issue is one
reason why OODBMS'es haven't taken the world by storm.  The SQL database (or
something like it) contains the static model. Different applications make
use different, dynamic "projections" of the shared data.  Given the
complexity of SQL SELECT statements.  The 3 options above don't give the
entire picture.   However, the result is always tabular, so you can always
revert to a flat <resultset> structure.

Charles Reitzel


News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS