> Do your applications that use the relational database need to update it, and
> do they need to produce reports that present the information in multiple
> hierarchical ways? The answers may suggest whether your database needs to be
> normalized (approaching third normal form).
>
> -----Original Message-----
> From: Fraser Goffin [mailto:goffinf@googlemail.com]
> Sent: Monday, November 02, 2009 3:31 PM
> To: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] Shredding XML
>
> Hi Jim,
>
> that's interesting ... which should be the 'driving' schema, XML or Db ?
>
> I guess I've been somewhat tiptoe'ing around this one.
>
> I should admit my bias if it's not already apparent. I work mainly in
> the SOA integration space and since XML is the primary exchange format
> and XML schema does a reasonable job as the type system, I favour
> processing XML as .. well XML ... whilst I understand the argument
> around leveraging existing technologies and skillset .. often-times
> this is little more than protectionism and continually [de]composing
> from XML to objects then to CopyLibs and then to relational just seems
> unnecessary a lot of the time (sorry - soap box over). But of course
> the whole world isn't XML and just like most other large organisations
> the vast majority of our processing capability and data isn't and
> probably never will be ... I have no issue with that.
>
> On the one hand it is the end product that drives the design (even if
> that design has a relatively short shelf-life ... but hey, we all do
> agile right). In that case it is definitely the Database schema that
> prevails from the pure delivery point of view, since this is the
> desired source for the staging area from which to produce interface
> files for upstream applications. At present there appears to be no
> possibility of revisiting that choice. At the same time, I don't want
> to 'paint myself into a corner' or promote this as an exemplar for all
> future approaches (unless it turns out that way :-)
>
> My unease is around the brittleness of the database schema in the face
> of change, but I suppose that situation is almost inevitable since I
> can't crystal-ball what changes might be coming along next week and
> it's probably folly to try. XML changes that dynamic, but not
> completely.
>
> I have been having this internal debate about, .. if I concede I'm
> going to have a relational database then should its design be derived
> from the XML schema or should the XML schema change to accommodate the
> database, indeed one of the Solution Designers on this has already
> indicated a desire to 'flatten' the XML schema (although I have to say
> I disagree that it is necessary). I have some degree of opportunity to
> change the XML schema (although messages are received from external
> sources, within reason, I can transform them into any 'shape' I like so
> long as that's a loss-less exchange).
> The database is green field so it can be any shape, but clearly some
> designs are going to lend themselves better than others to XML mapping
> I would have thought ?
>
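To make the 'flatten' question concrete, here is a minimal Python sketch of what flattening tends to look like. The sample document and the flatten() helper are hypothetical, not taken from the thread, and attributes and mixed content are deliberately ignored. Singly nested elements collapse into one path each, which maps naturally onto a column, while repeated elements end up with indexed paths that have no fixed column set.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical message; element names are illustrative only.
SAMPLE = """<policy>
  <id>P1</id>
  <holder><name>Ann</name></holder>
  <driver><name>Ann</name></driver>
  <driver><name>Bob</name></driver>
</policy>"""

def flatten(elem, path=None):
    """Map each leaf element to a path string; index repeated siblings."""
    path = path or elem.tag
    children = list(elem)
    if not children:
        return {path: (elem.text or "").strip()}
    totals = Counter(child.tag for child in children)
    seen = Counter()
    flat = {}
    for child in children:
        seen[child.tag] += 1
        step = child.tag
        if totals[child.tag] > 1:
            # Repeated siblings cannot share one column, so number them.
            step += f"[{seen[child.tag]}]"
        flat.update(flatten(child, f"{path}/{step}"))
    return flat

flat = flatten(ET.fromstring(SAMPLE))
for key, value in flat.items():
    print(key, "=", value)
```

The indexed paths (driver[1], driver[2]) are the part that does not flatten into a single row. In a relational design they would usually become rows in a child table instead.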
> Surely there are some structures in XML that don't map
> straight-forwardly. Ted Neward called this the 'last mile' (a familiar
> term to us all I'm sure), where the illusion of a high fidelity
> solution draws us in, and indeed 80%+ appears to go quite well, but
> those last few % hold a disproportionate cost and increasing complexity
> (but you don't realise that until late on at which point some are
> going to object to a rethink). I want to know where that 'last mile'
> lives so I can try and avoid it !
>
> Fraser.
>
> 2009/11/2 Jim Tivy <jimt@bluestream.com>:
>> Fraser
>>
>> I am not entirely hearing firm commitment that you plan to establish an RDB
>> schema and make it the driving schema. In other words, what this would mean
>> is that data elements cannot be put into the RDB unless they exist in the
>> RDB schema. For example, if some new data elements show up in some external
>> XML to be imported then the DBA decides whether to allow them into the
>> appropriate RDB column or not, or drop them for the time being.
>>
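One way to picture the RDB schema as the driving schema is an import step that keeps only elements with a known column and reports the rest. This is a rough sketch under assumptions: CLAIM_COLUMNS, the claim message and to_row() are all made up for illustration, and a real import would take the column list from the database catalogue.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the columns the DBA has approved.
CLAIM_COLUMNS = {"claim_id", "amount", "currency"}

def to_row(record):
    """Split child elements into a row for known columns and a list of dropped tags."""
    row, dropped = {}, []
    for child in record:
        if child.tag in CLAIM_COLUMNS:
            row[child.tag] = child.text
        else:
            # No column for this element yet: drop it, but remember that we did.
            dropped.append(child.tag)
    return row, dropped

record = ET.fromstring(
    "<claim><claim_id>C7</claim_id><amount>120.50</amount>"
    "<channel>web</channel></claim>"
)
row, dropped = to_row(record)
print(row)
print("dropped:", dropped)
```

Logging the dropped tags instead of discarding them silently gives the DBA something to review when deciding whether a new column is warranted.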
>> Another option (from the infinite number) would be to let the XML schema
>> generate the RDB schema and the mapping code. For your application
>> programmers using SQL on the RDB this would likely lead to gagging and
>> hacking and an "out of body experience". This is not something I would
>> recommend and if this is what you want then get a database that supports
>> XQuery and retrain your developers.
>>
>> But I think you have to choose between these two - the first being what it
>> sounds like you want - then work backwards from that decision.
>>
>> Jim
>>
>> -----Original Message-----
>> From: Fraser Goffin [mailto:goffinf@googlemail.com]
>> Sent: Monday, November 02, 2009 12:22 PM
>> To: xml-dev@lists.xml.org
>> Subject: Re: [xml-dev] Shredding XML
>>
>> Yes Jim, that is spot on.
>>
>> Whilst there has been much discussion thus far on the technologies and
>> techniques of getting data out of the database (and that has been
>> interesting), the programming for doing so is 'bread and butter' to
>> our mainframe Cobol and Sapiens guys, so that's not really my problem.
>>
>> Mine is the task of getting the data from a fairly complex XML content
>> model into an appropriately factored relational database. The design
>> of that database is 'green field' but (and thanks to many on this
>> thread who have posted related papers) this may not be as easy as it
>> might at first appear, what with impedance mismatches here there and
>> everywhere ;-)
>>
>> It's also the case that the XML data doesn't contain enough data
>> inherently to represent primary or foreign key values for all of the
>> relationships that are likely to arise. In some cases I MAY be
>> permitted to generate them myself (say using a UUID) as I 'walk' the
>> XML, in other cases I MAY be required to get the database to provide
>> the value(s), not sure yet. The latter may increase the complexity
>> somewhat (sidenote: our DBAs don't allow stored procs (don't ask) so
>> I'm going to be doing whole bunches of INSERTs as part of the
>> tree-walk I suspect)
>>
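A minimal sketch of the tree-walk with self-generated keys might look like the following. It uses Python's sqlite3 and uuid modules purely for illustration, with no stored procedures involved. The orders/order_lines tables and the sample message are hypothetical, and the real target would be whatever RDBMS the project uses.

```python
import sqlite3
import uuid
import xml.etree.ElementTree as ET

# Hypothetical message: a parent with repeating children and no keys of its own.
SAMPLE = """<order ref="A-100">
  <line><sku>X1</sku><qty>2</qty></line>
  <line><sku>Y9</sku><qty>1</qty></line>
</order>"""

def shred(conn, xml_text):
    """Walk the XML and INSERT parent and child rows, minting UUID keys on the way."""
    order = ET.fromstring(xml_text)
    order_id = str(uuid.uuid4())  # primary key generated by the loader
    conn.execute(
        "INSERT INTO orders (id, ref) VALUES (?, ?)",
        (order_id, order.get("ref")),
    )
    for line in order.findall("line"):
        conn.execute(
            "INSERT INTO order_lines (id, order_id, sku, qty) VALUES (?, ?, ?, ?)",
            (
                str(uuid.uuid4()),
                order_id,  # foreign key carried down from the parent
                line.findtext("sku"),
                int(line.findtext("qty")),
            ),
        )
    conn.commit()
    return order_id

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, ref TEXT)")
conn.execute(
    "CREATE TABLE order_lines (id TEXT PRIMARY KEY, "
    "order_id TEXT REFERENCES orders(id), sku TEXT, qty INTEGER)"
)
order_id = shred(conn, SAMPLE)
rows = conn.execute(
    "SELECT sku, qty FROM order_lines WHERE order_id = ? ORDER BY sku",
    (order_id,),
).fetchall()
print(rows)
```

Because the key is minted before the parent INSERT, the child rows need no round trip to learn it. If the database has to supply the value instead, each parent INSERT must be followed by reading the generated key back (in sqlite3, for example, via the cursor's lastrowid attribute) before any child can be written.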
>> I'm really interested in the gotchas and best practices. Some have
>> already been mentioned like the fact that the XML schema may define
>> optional items and unrestricted length facets and such like. Others
>> I've seen in reading talk about the mis-match of identity approaches
>> (although this was talking primarily about OO/Relational mapping but
>> the idea is similar I suspect). This could be important, since some
>> messages received may 'relate' to others already loaded and, given
>> what I said about not having all of the data in the XML to form all of
>> the keys, this might be a significant problem.
>>
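Two of the gotchas mentioned above, optional items and unrestricted lengths, can be sketched as a small extraction step. The person record, the MAX_NAME_LEN limit and extract() are assumptions made up for this example. The idea is that an absent optional element becomes None (a NULL on insert) and an over-long value is rejected before it reaches a fixed-width column.

```python
import xml.etree.ElementTree as ET

MAX_NAME_LEN = 10  # hypothetical width of the target column

def extract(record):
    """Return column values; absent optional elements come back as None."""
    name = record.findtext("name")
    if name is not None and len(name) > MAX_NAME_LEN:
        # The XML schema allows any length, the column does not.
        raise ValueError(f"name exceeds {MAX_NAME_LEN} characters")
    return {
        "name": name,
        "nickname": record.findtext("nickname"),  # optional in the XML
    }

row = extract(ET.fromstring("<person><name>Ann</name></person>"))
print(row)
```

Note that findtext() returns an empty string, not None, for an element that is present but empty, so "missing" and "empty" stay distinguishable if the database design needs that.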
>> It is my intention to look into other options (we have recently
>> acquired DB2 v9 which includes pureXML) but as is so often the case,
>> the immediate project delivery pressures won't allow it. The PM is
>> very nervous about using any new tech, perhaps justifiably, but my
>> sense of unease is more to do with the perhaps misplaced assumption
>> that 'tried and tested' tech like relational databases will always
>> provide a workable solution, imho sometimes they actually represent
>> the most significant constraint.
>>
>> So yes, back to the actual problem. How to come up with a database
>> design that provides the capability of staging the shredded XML in a
>> reasonably efficient manner and enables it to be loaded from XML
>> instances received, again efficiently (ideally without 100's of tables
>> and joins to negotiate). As far as efficiency of storage, well that
>> MAY be a concern although perhaps not a huge one so long as the Db
>> doesn't bloat up too much if normalisation is preferred over extra
>> tables.
>>
>> Please add your thoughts and suggestions and experiences as you are
>> able. Nothing is too trivial (or rude) to mention (i.e. if you want to
>> say don't do this if you want to keep your sanity, that's ok).
>>
>> regards
>>
>> Fraser.
>>
>> I'm
>>
>>
>>> 2009/11/1 Jim Tivy <jimt@bluestream.com>:
>>> Interesting post, but I am not sure that "now is the time to talk of many
>>> things".
>>>
>>> Let me try to focus:
>>>
>>> Proper software execution comes from the choice of appropriate
>>> actions/technologies to match the driving requirements. But more
>>> importantly, the greatest Wisdom, is to frame the driving requirements
>>> correctly before "going off half cocked" or doing something that is
>>> unnecessary and unwarranted.
>>>
>>> So let's start by framing the requirements again:
>>>
>>> Fraser Goffin wrote:
>>>
>>> "
>>> The basics are we receive XML messages from an external trading partner and
>>> process those messages, enriching and routing to a number of internal
>>> subscriber applications. One of these applications is MI and the deal here
>>> is that they want the data to be put into a relational database so that
>>> they can create a number of interface 'files' which are sent to still more
>>> applications.
>>> "
>>>
>>> OR
>>>
>>> "
>>> I am mainly interested in the process of LOADING XML data to a database
>>> rather than extracting (at least for the purposes of this discussion).
>>> "
>>>
>>> It is possible that the "mother persistent application datamodel" is
>>> contained in the relational database in all its normalized glory. If so,
>>> then, "processing the messages" is simply a "data import" operation. So the
>>> question is, how do I get XML X* to tables T*. It would strike me that lots
>>> of people are doing this. Are there common techniques and technologies for
>>> doing this import?
>>>
>>> Fraser, is that a proper framing of the question/requirements?
>>>
>>> Jim
>>>
>>>
>>> -----Original Message-----
>>> From: Petite Abeille [mailto:petite.abeille@gmail.com]
>>> Sent: Sunday, November 01, 2009 9:56 AM
>>> To: xml-dev@lists.xml.org
>>> Subject: Re: [xml-dev] Shredding XML
>>>
>>>
>>> On Oct 29, 2009, at 10:20 PM, Fraser Goffin wrote:
>>>
>>>> opinions on the subject of decomposing XML into relational databases
>>>
>>> Outside of the most trivial case, this is a major PITA of the same
>>> epic proportion as the object-relational one:
>>>
>>>
>>> http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
>>>
>>> Good luck.
>>>
>>>
>>>
>>> _______________________________________________________________________
>>>
>>> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
>>> to support XML implementation and development. To minimize
>>> spam in the archives, you must subscribe before posting.
>>>
>>> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
>>> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
>>> subscribe: xml-dev-subscribe@lists.xml.org
>>> List archive: http://lists.xml.org/archives/xml-dev/
>>> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>>>
>>>
>>>