OASIS Mailing List Archives

 



Re: storing xml files into database



Bill Lindsey wrote:
> 
> Frank Richards wrote:
> > XML is a tree of elements. Naively mapping that tree onto a table causes the
> > RDBMS to thrash its guts out doing joins to go down the tree --
> [ ... ]
> > XML in an
> > RDBMS can easily hit six or seven joins per query.
> 
> A typical, naive definition of a "nodes" table does lead to unacceptable
> performance due to the necessity of many self-joins.  It is possible,
> however, to devise a scheme for encoding nodes' context in a compact
> form, optimized for an RDBMS' indexing facility, and build a
> generic table structure that is capable of storing any well-formed
> XML, yet does not exhibit the self-join problem.

I think you guys are talking apples and oranges. The typical mapping
solution Frank is talking about is (I believe) one that maps the data in
XML documents to tables. For example, if I have a <SalesOrder> element,
I map it to a SalesOrder table. An <Item> element maps to an Items
table, a <Part> element maps to a Parts table, and so on. This obviously
leads to a lot of joins when retrieving data.
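
To make the data-mapping case concrete, here is a minimal sketch using
Python's built-in sqlite3 module. The column names (OrderID, Customer,
PartID, Quantity, and so on) and the sample rows are assumptions made up
for illustration; only the table names come from the example above.
Rebuilding one <SalesOrder> with its items and parts already takes two
joins, and deeper documents need more.

```python
import sqlite3

# Hypothetical schema: one table per element type, linked by foreign keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesOrder (OrderID INTEGER PRIMARY KEY, Customer TEXT);
CREATE TABLE Parts      (PartID  INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Items      (ItemID  INTEGER PRIMARY KEY,
                         OrderID INTEGER REFERENCES SalesOrder(OrderID),
                         PartID  INTEGER REFERENCES Parts(PartID),
                         Quantity INTEGER);
INSERT INTO SalesOrder VALUES (1, 'ACME');
INSERT INTO Parts VALUES (100, 'bolt'), (101, 'nut');
INSERT INTO Items VALUES (1, 1, 100, 10), (2, 1, 101, 20);
""")

# One join per level of nesting in the original document.
rows = conn.execute("""
    SELECT so.OrderID, so.Customer, p.Name, i.Quantity
    FROM SalesOrder so
    JOIN Items i ON i.OrderID = so.OrderID
    JOIN Parts p ON p.PartID  = i.PartID
    WHERE so.OrderID = 1
    ORDER BY i.ItemID
""").fetchall()

for row in rows:
    print(row)
```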

For a very nice paper discussing different strategies for minimizing the
cost of these joins, see:

   http://www.cs.wisc.edu/~jai/papers/RelationsToXMLJournal.pdf

As I recall, the most effective strategy was the one that UNIONs SELECTs
on different tables. This is the strategy used by SQL Server's SELECT
... FOR XML EXPLICIT syntax.
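
A simplified sketch of the UNION idea, run through Python's sqlite3
module. This is not SQL Server's FOR XML syntax and not necessarily the
exact formulation in the paper; the Tag column, the other column names,
and the sample rows are assumptions for illustration. Each SELECT
contributes the rows for one element type, padded with NULLs, and a
single sort interleaves children with their parents so that one pass
over the result can emit the nested XML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesOrder (OrderID INTEGER PRIMARY KEY, Customer TEXT);
CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, OrderID INTEGER, Quantity INTEGER);
INSERT INTO SalesOrder VALUES (1, 'ACME'), (2, 'Globex');
INSERT INTO Items VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);
""")

# Tag 1 = <SalesOrder> row, Tag 2 = <Item> row. No join between the tables:
# the ORDER BY places each order's items directly after the order itself.
rows = conn.execute("""
    SELECT 1 AS Tag, OrderID, Customer, NULL AS ItemID, NULL AS Quantity
    FROM SalesOrder
    UNION ALL
    SELECT 2, OrderID, NULL, ItemID, Quantity
    FROM Items
    ORDER BY OrderID, Tag, ItemID
""").fetchall()

for row in rows:
    print(row)
```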

The solution Bill is talking about maps the document structure itself
(e.g. a DOM tree) to the database. Thus, a <SalesOrder> element is
mapped to an Elements table (an actual row of data contains the value
"SalesOrder" in the ElementName column). Similarly, all other elements
are mapped to this table as well.
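
As a sketch of the structure-mapping case, the snippet below walks a
small document and stores every element as a row in one Elements table.
Only the Elements table and the ElementName column come from the
description above; NodeID, ParentID, Text, and the sample document are
assumptions. With a plain parent pointer like this, following the path
SalesOrder/Item/Part costs one self-join per step, which is the naive
"nodes table" problem discussed earlier in the thread.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = ("<SalesOrder><Item><Part>bolt</Part></Item>"
       "<Item><Part>nut</Part></Item></SalesOrder>")

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Elements (
    NodeID INTEGER PRIMARY KEY, ParentID INTEGER,
    ElementName TEXT, Text TEXT)""")

def store(elem, parent_id=None):
    """Insert one element row, then recurse into its children."""
    cur = conn.execute(
        "INSERT INTO Elements (ParentID, ElementName, Text) VALUES (?, ?, ?)",
        (parent_id, elem.tag, (elem.text or "").strip()))
    for child in elem:
        store(child, cur.lastrowid)

store(ET.fromstring(doc))

names = [r[0] for r in conn.execute(
    "SELECT ElementName FROM Elements ORDER BY NodeID")]

# Two self-joins to walk SalesOrder -> Item -> Part.
parts = [r[0] for r in conn.execute("""
    SELECT p.Text
    FROM Elements so
    JOIN Elements i ON i.ParentID = so.NodeID AND i.ElementName = 'Item'
    JOIN Elements p ON p.ParentID = i.NodeID  AND p.ElementName = 'Part'
    WHERE so.ElementName = 'SalesOrder' AND so.ParentID IS NULL
    ORDER BY p.NodeID""")]
print(names, parts)
```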

For a nice paper explaining one way to retrieve data from such a table
without any self-joins, see:

   http://www.sees.bangor.ac.uk/~rich/research/papers/uwb_rge_IDEAL2000.pdf
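
One simple way to avoid the self-joins, sketched below, is to store each
element's full path from the root in an indexed column, so that a path
lookup becomes a single equality test. This is only an illustration of
the general idea of encoding a node's context in the row; it is not
necessarily the method described in the paper above, and the Path,
NodeID, and Text columns and the sample document are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = ("<SalesOrder><Item><Part>bolt</Part></Item>"
       "<Item><Part>nut</Part></Item></SalesOrder>")

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Elements (
    NodeID INTEGER PRIMARY KEY, Path TEXT,
    ElementName TEXT, Text TEXT)""")
conn.execute("CREATE INDEX idx_elements_path ON Elements (Path)")

def store(elem, parent_path=""):
    """Insert one element row carrying its root-to-node path, then recurse."""
    path = f"{parent_path}/{elem.tag}"
    conn.execute(
        "INSERT INTO Elements (Path, ElementName, Text) VALUES (?, ?, ?)",
        (path, elem.tag, (elem.text or "").strip()))
    for child in elem:
        store(child, path)

store(ET.fromstring(doc))

# No self-joins: the node's context is already encoded in the row.
parts = [r[0] for r in conn.execute(
    "SELECT Text FROM Elements WHERE Path = ? ORDER BY NodeID",
    ("/SalesOrder/Item/Part",))]
print(parts)
```

Note that a bare path like this does not record which <Item> a given
<Part> belongs to; a real scheme would also need to encode sibling
position or ancestor identity.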

-- 
Ronald Bourret
XML, Databases, and Schemas
http://www.rpbourret.com
Speaker, Geek Cruises' XML Excursion '02