RE: [xml-dev] Shredding XML

Choice - BLOB: Use a CLOB or BLOB column for the entire XML document.  MySQL
has a maximum of 1 or 2 GB there.  Then process in memory using XSLT,
XQuery or what have you (e.g. Saxon 9).  Extract index values to other
columns as necessary to make selective loading faster.
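
A minimal sketch of that BLOB-plus-index-column idea, in Python with the
standard-library sqlite3 module standing in for whatever database you really
use.  The table name, the column names and the header/partner element path
are all made up for illustration; nothing here comes from a real schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical staging table: the whole document in one BLOB column,
# plus one value copied out into an ordinary indexed column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message (id INTEGER PRIMARY KEY, partner TEXT, doc BLOB)"
)
conn.execute("CREATE INDEX message_partner ON message (partner)")


def store(xml_bytes):
    """Store the full document and copy one value into the indexed column."""
    root = ET.fromstring(xml_bytes)
    partner = root.findtext("header/partner")  # assumed element path
    cur = conn.execute(
        "INSERT INTO message (partner, doc) VALUES (?, ?)",
        (partner, xml_bytes),
    )
    return cur.lastrowid


def load_for_partner(partner):
    """Select by the indexed column, then parse only the matching documents."""
    rows = conn.execute(
        "SELECT doc FROM message WHERE partner = ?", (partner,)
    )
    return [ET.fromstring(doc) for (doc,) in rows]
```

The point of the extra column is that the WHERE clause never has to look
inside the XML; only the rows that match get parsed in memory.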

Choice - Package: Buy something that does this slice and dice for you.
Perhaps Progress/DataDirect has something.

Choice - XML column: Create a column of XML type in DB2 or Oracle and do an
XQuery on that column.

Choice - get MS SQL Server.  I think their first approach to supporting XML
was to slice and dice - may still be that way.  From what I can tell
Microsoft's approach was clumsy for a lot of uses.

Choice - Native XML database.

You will have to decide which of these to do given your requirements. 

-----Original Message-----
From: Michael Sokolov [mailto:sokolov@ifactory.com] 
Sent: Thursday, October 29, 2009 7:42 PM
To: 'Fraser Goffin'; xml-dev@lists.xml.org
Subject: RE: [xml-dev] Shredding XML

I spent a little while evaluating the DB2 and Oracle XQuery implementations -
didn't go so far as to implement a full-blown system though, I guess because
nobody was holding a gun to my head.

The whole automated shredding approach strikes me as totally unworkable for
data with any complexity (think about David Lee's 80 table joins), and
unnecessary for simple data, where you might as well map by hand.  One
complication is that a schema is absolutely required, and if the schema
changes, you need to re-run the entire table generation process.  When I
checked, it was looking like it could be quite complex to retain data in
such a case: there didn't seem to be the ability to generate incremental
schema change operations, so probably it would be necessary to migrate data
from an old set of tables to a new set (with the same names!).
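
For the simple case, mapping by hand can be as small as the sketch below.  It
uses Python with the standard-library sqlite3 module; the order/line document
shape and the two table definitions are invented for illustration and are not
taken from anyone's real data.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical target tables for a hypothetical <order> document.
SCHEMA = """
CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT);
CREATE TABLE order_line (
    order_id TEXT REFERENCES orders(order_id),
    line_no INTEGER,
    sku TEXT,
    quantity INTEGER
);
"""


def shred(conn, xml_text):
    """Map one <order> document onto the two tables by hand."""
    order = ET.fromstring(xml_text)
    order_id = order.get("id")
    with conn:  # one transaction per document: all rows or none
        conn.execute(
            "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
            (order_id, order.findtext("customer")),
        )
        conn.executemany(
            "INSERT INTO order_line (order_id, line_no, sku, quantity) "
            "VALUES (?, ?, ?, ?)",
            [
                (order_id, n, line.get("sku"), int(line.findtext("quantity")))
                for n, line in enumerate(order.findall("lines/line"), start=1)
            ],
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
shred(conn, """<order id="A1"><customer>Acme</customer><lines>
  <line sku="X"><quantity>2</quantity></line>
  <line sku="Y"><quantity>5</quantity></line>
</lines></order>""")
```

Every repeating element becomes its own table with a foreign key back to the
parent, which is also why deeply nested documents end up needing so many
tables and joins.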

Then I considered the approach of storing an XML blob or two attached to a
metadata record.  I could tell it would have been possible to implement,
and we probably could have gotten it working with some reasonable
computational efficiency in the end system.  However, the programming
environment was looking very hostile: there are uncomfortable lexical issues
that arise when embedding XQuery in SQL or vice versa, and the idea of
passing values back and forth between the two different type systems was
making me uneasy.  I also found that the full text support (which for me is
absolutely critical) in DB2 was lacking - when I checked they were in the
midst of a transition from an older, imperfect but functioning system to a
newer, but less functional one.  The situation with full text is probably
better in Oracle; I didn't dig deep enough to find out details.

It does sound as if your data may be more record-oriented than mine, which
is almost always documents written in English or some natural language, with
tagging to make it at least somewhat machine-friendly.  So, as Michael Kay
said, if it's *already* record-oriented data that has just been wrapped up
in angle brackets, you might not run into these problems.

-Mike


> -----Original Message-----
> From: Fraser Goffin [mailto:goffinf@googlemail.com] 
> Sent: Thursday, October 29, 2009 5:20 PM
> To: xml-dev@lists.xml.org
> Subject: [xml-dev] Shredding XML
> 
> This list has been unusually quiet of late so I thought it 
> might be an opportune moment to ask for opinions on the 
> subject of decomposing XML into relational databases, often 
> referred to as 'shredding'.
> 
> My particular interest is related to some work I'm currently 
> engaged in. The basics are we receive XML messages from an 
> external trading partner and process those messages, 
> enriching and routing to a number of internal subscriber 
> applications. One of these applications is MI and the deal 
> here is that they want the data to be put into a relational 
> database so that they can create a number of interface 
> 'files' which are sent to still more applications.
> 
> Whilst I would like to consider a pure XML database or even 
> use some of the XML features that are increasingly prevalent 
> in mainstream DB vendor products, clearly putting data into a 
> 'staging' database is one thing, but the capabilities and 
> competences of the applications and application programmers 
> who want to retrieve it is a key factor. So, for the 
> immediate term I might be stuck (if that's fair - probably 
> not) with relational.
> 
> So to better inform myself and maybe help the debate along 
> internally, I am interested in anyone else's experience, good 
> and bad, of shredding XML data: pitfalls, things to be aware 
> of, good approaches, when to really not do it. All thoughts 
> are welcome.
> 
> I find it interesting that some of the 'big boys' are at least 
> giving the appearance of providing first-class support for 
> XML both in terms of storage options and manipulation 
> capability. IBM for example has pureXML. I haven't used these 
> enough to know if they're just a thin veneer or whether they 
> have real substance and depth, so again your experiences are welcome.
> 
> Regards
> 
> Fraser,
> 
> ______________________________________________________________
> _________
> 
> XML-DEV is a publicly archived, unmoderated list hosted by 
> OASIS to support XML implementation and development. To 
> minimize spam in the archives, you must subscribe before posting.
> 
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org List archive: 
> http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
> 
> 



Copyright 1993-2007 XML.org. This site is hosted by OASIS