XML.org
Re: [xml-dev] Shredding XML

Yes Jim, that is spot on.

Whilst there has been much discussion thus far on the technologies and
techniques of getting data out of the database (and that has been
interesting), the programming for doing so is 'bread and butter' to
our mainframe Cobol and Sapiens guys, so that's not really my problem.

Mine is the task of getting the data from a fairly complex XML content
model into an appropriately factored relational database. The design
of that database is 'green field' but (and thanks to many on this
thread who have posted related papers) this may not be as easy as it
might at first appear, what with impedance mismatches here, there and
everywhere ;-)

It's also the case that the XML data doesn't inherently contain enough
data to represent primary or foreign key values for all of the
relationships that are likely to arise. In some cases I MAY be
permitted to generate them myself (say using a UUID) as I 'walk' the
XML; in other cases I MAY be required to get the database to provide
the value(s), not sure yet. The latter may increase the complexity
somewhat (sidenote: our DBAs don't allow stored procs (don't ask), so
I'm going to be doing whole bunches of INSERTs as part of the
tree-walk, I suspect).
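
A minimal sketch of the 'generate the keys myself' variant, purely
illustrative. It assumes Python with the standard library's sqlite3
standing in for the real database, and the `orders` / `order_lines`
tables and the element names are hypothetical, not taken from the actual
content model. Each parent row gets a UUID as the walk reaches it, and
that value is carried down as the foreign key of its children:

```python
import sqlite3
import uuid
import xml.etree.ElementTree as ET

# Hypothetical message; element names are illustrative only.
SAMPLE = """
<order>
  <customer>ACME</customer>
  <line><sku>A1</sku><qty>2</qty></line>
  <line><sku>B7</sku><qty>1</qty></line>
</order>
"""

def make_db():
    """Create an in-memory database with one parent and one child table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT)")
    conn.execute(
        """CREATE TABLE order_lines (
               line_id  TEXT PRIMARY KEY,
               order_id TEXT NOT NULL REFERENCES orders(order_id),
               position INTEGER NOT NULL,
               sku      TEXT,
               qty      INTEGER)"""
    )
    return conn

def shred_order(xml_text, conn):
    """Walk one <order> and INSERT parent and child rows with generated keys."""
    root = ET.fromstring(xml_text)
    order_id = str(uuid.uuid4())  # key generated by the loader, not the database
    conn.execute(
        "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
        (order_id, root.findtext("customer")),
    )
    # Document order is not implied by a table, so record it explicitly.
    for position, line in enumerate(root.findall("line"), start=1):
        conn.execute(
            "INSERT INTO order_lines (line_id, order_id, position, sku, qty) "
            "VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), order_id, position,
             line.findtext("sku"), int(line.findtext("qty"))),
        )
    conn.commit()
    return order_id
```

If the database has to supply the value instead, the same walk would
need to read the key back after each parent INSERT before writing the
children (in sqlite3 that is `cursor.lastrowid`; how that is done on the
real platform is not shown here), which is where the extra round trips
come from.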

I'm really interested in the gotchas and best practices. Some have
already been mentioned, like the fact that the XML schema may define
optional items, unrestricted length facets and suchlike. Others
I've seen in my reading concern the mismatch of identity approaches
(although this was primarily about OO/relational mapping, the
idea is similar, I suspect). This could be important, since some
messages received may 'relate' to others already loaded and, given
what I said about not having all of the data in the XML to form all of
the keys, this might be a significant problem.
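
One possible way to handle the 'relates to something already loaded'
case, sketched under a big assumption: that related messages share at
least some reference value (here a hypothetical `source_ref` string)
even when they don't carry a full key. A small lookup table then maps
that reference to the surrogate key issued the first time it was seen.
The `key_map` table and the function names are made up for
illustration, and sqlite3 again stands in for the real database:

```python
import sqlite3
import uuid

def make_key_map(conn):
    """Create the lookup table from external reference to surrogate key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS key_map ("
        "source_ref TEXT PRIMARY KEY, surrogate_id TEXT NOT NULL)"
    )

def resolve_key(conn, source_ref):
    """Return the surrogate key already issued for source_ref, or issue a new one."""
    row = conn.execute(
        "SELECT surrogate_id FROM key_map WHERE source_ref = ?", (source_ref,)
    ).fetchone()
    if row is not None:
        return row[0]  # seen in an earlier message: reuse its key
    surrogate_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO key_map (source_ref, surrogate_id) VALUES (?, ?)",
        (source_ref, surrogate_id),
    )
    return surrogate_id
```

This does nothing for relationships where the messages share no
identifying value at all; those would still need a rule agreed with the
sender or with the consuming applications.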

It is my intention to look into other options (we have recently
acquired DB2 v9 which includes pureXML) but as is so often the case,
the immediate project delivery pressures won't allow it. The PM is
very nervous about using any new tech, perhaps justifiably, but my
sense of unease is more to do with the perhaps misplaced assumption
that 'tried and tested' tech like relational databases will always
provide a workable solution; imho sometimes they actually represent
the most significant constraint.

So yes, back to the actual problem: how to come up with a database
design that provides the capability of staging the shredded XML in a
reasonably efficient manner and enables it to be loaded from XML
instances received, again efficiently (ideally without hundreds of
tables and joins to negotiate). As far as efficiency of storage goes,
well, that MAY be a concern, although perhaps not a huge one so long as
the Db doesn't bloat up too much if normalisation is preferred over
extra tables.
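
For comparison, one generic staging layout that avoids a table per
element type is a single 'edge' style table with one row per element.
This is only a sketch of that option, not a recommendation: the `node`
table and its columns are hypothetical, sqlite3 stands in for the real
database, and attributes, namespaces and mixed content are ignored to
keep it short.

```python
import sqlite3
import xml.etree.ElementTree as ET

def make_staging(conn):
    """Create a single generic staging table: one row per XML element."""
    conn.execute(
        """CREATE TABLE node (
               node_id   INTEGER PRIMARY KEY,
               doc_id    TEXT NOT NULL,
               parent_id INTEGER REFERENCES node(node_id),
               ordinal   INTEGER NOT NULL,
               path      TEXT NOT NULL,
               name      TEXT NOT NULL,
               value     TEXT)"""
    )

def stage(conn, doc_id, xml_text):
    """Walk the tree depth-first and INSERT one row per element."""
    def walk(elem, parent_id, ordinal, parent_path):
        path = parent_path + "/" + elem.tag
        # Whitespace-only or missing text is stored as NULL.
        value = (elem.text or "").strip() or None
        cur = conn.execute(
            "INSERT INTO node (doc_id, parent_id, ordinal, path, name, value) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (doc_id, parent_id, ordinal, path, elem.tag, value),
        )
        node_id = cur.lastrowid  # key assigned by the database for this row
        for i, child in enumerate(elem, start=1):
            walk(child, node_id, i, path)

    walk(ET.fromstring(xml_text), None, 1, "")
    conn.commit()
```

The trade-off is the usual one: loading is trivial and the schema never
changes, but every value is untyped text and the downstream extracts
have to rebuild records by filtering on `path` and joining on
`parent_id`.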

Please add your thoughts and suggestions and experiences as you are
able. Nothing is too trivial (or rude) to mention (i.e. if you want to
say 'don't do this if you want to keep your sanity', that's OK).

regards

Fraser.



2009/11/1 Jim Tivy <jimt@bluestream.com>:
> Interesting post, but I am not sure that "now is the time to talk of many
> things".
>
> Let me try to focus:
>
> Proper software execution comes from the choice of appropriate
> actions/technologies to match the driving requirements.  But more
> importantly, the greatest Wisdom is to frame the driving requirements
> correctly before "going off half cocked" or doing something that is
> unnecessary and unwarranted.
>
> So let's start by framing the requirements again:
>
> Fraser Goffin wrote:
>
> "
> The basics are we receive XML messages from an external trading partner and
> process those messages, enriching and routing to a number of internal
> subscriber applications. One of these applications is MI and the deal here
> is that they want the data to be put into a relational database so that
> they can create a number of interface 'files' which are sent to still more
> applications.
> "
>
> OR
>
> "
> I am mainly interested in the process of LOADING XML data to a database
> rather than extracting (at least for the purposes of this discussion).
> "
>
> It is possible that the "mother persistent application datamodel" is
> contained in the relational database in all its normalized glory.  If so,
> then, "processing the messages" is simply a "data import" operation.  So the
> question is, how do I get XML X* to tables T*.  It would strike me that lots
> of people are doing this.  Are there common techniques and technologies for
> doing this import?
>
> Fraser, is that a proper framing of the question/requirements?
>
> Jim
>
>
> -----Original Message-----
> From: Petite Abeille [mailto:petite.abeille@gmail.com]
> Sent: Sunday, November 01, 2009 9:56 AM
> To: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] Shredding XML
>
>
> On Oct 29, 2009, at 10:20 PM, Fraser Goffin wrote:
>
>> opinions on the subject of decomposing XML into relational databases
>
> Outside of the most trivial case, this is a major PITA of the same
> epic proportion as the object-relational one:
>
> http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
>
> Good luck.
>
>
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>


Copyright 1993-2007 XML.org. This site is hosted by OASIS