RE: [xml-dev] Caught napping!

> XML is a carrier of data plus context, data plus metadata.  This is
> obvious, but is often lost in document design.  When I look at a lot of
> schemas out there these days, they are often blatantly influenced by an
> underlying relational mentality.  This is not a bad thing if your
> eventual target is an existing relational database.  My question is
> this: If the persistence mechanism used were not relational, but if it
> treated XML as XML in a very fundamental way - symmetrically handling
> data and metadata, for example - would schema design change in any
> fundamental ways, particularly for "new" data?

If you start with the concept of a single centralized store of information
that can be queried in many different ways by many different users, then the
discipline of normalizing the data has some merit - though not necessarily
normalizing it quite as far as normal-form theory (3NF, 4NF, 5NF, ...) says you should.
But if you start with the idea of defining "messages" or "documents" -
chunks of information transmitted between applications or individuals - then
you naturally end up with a hierarchic model. In my view, the concept of a
native XML database is most valuable when the requirement is of the form
"I've got all this information flying around in messages and documents, how
am I going to capture them, manage them, and query them?"; there is less
added-value when you are dealing with what I call the "ledger-book"
operational data.
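
To make the contrast concrete, here is a minimal Python sketch. The order data, the table layout, and the name order_as_document are all invented for illustration; they do not come from any particular product or schema. The first half holds the facts in normalized, key-linked form; the function then assembles one self-contained hierarchic message from them.

```python
import xml.etree.ElementTree as ET

# Normalized, "ledger-book" view: each fact is stored once and linked by keys.
# (All of this data is hypothetical.)
customers = {1: {"name": "Acme Ltd"}}
orders = {100: {"customer_id": 1}}
order_lines = [
    {"order_id": 100, "sku": "A-1", "qty": 2},
    {"order_id": 100, "sku": "B-7", "qty": 1},
]

def order_as_document(order_id):
    """Assemble one self-contained, hierarchic message from the normalized rows."""
    order = orders[order_id]
    root = ET.Element("order", id=str(order_id))
    # The customer name is copied into the message rather than referenced by key.
    ET.SubElement(root, "customer").text = customers[order["customer_id"]]["name"]
    for line in order_lines:
        if line["order_id"] == order_id:
            ET.SubElement(root, "line", sku=line["sku"], qty=str(line["qty"]))
    return root

doc = order_as_document(100)
print(ET.tostring(doc, encoding="unicode"))
```

The message deliberately duplicates the customer name: that is acceptable for something that is transmitted and archived as a unit, but it is exactly what normalization tries to avoid in a shared central store.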

So I think the answer to the question is that schema design for a system
based primarily on information interchange is very different from schema
design for a system based primarily on information storage and query.

Yes, there's a difference in metadata too, primarily in what happens when it
changes. Generally, a relational database system is intended to act as a
snapshot of the state of the world at a given time, so if the model changes,
the data is updated to conform to the new model. In a document or message
store, the information is much more likely to be a historical record of
activity over an extended period of time, and it's much more likely that
when the model changes, you want to keep old documents unchanged, conforming
to the schema that was current at the time the documents were created. (Of
course, these are wild generalizations!)
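
The difference in how a model change is handled might be sketched as follows. This is only an illustration under assumed names: the "name" to "given"/"family" change, the schema version numbers, and the functions migrate_rows and family_name are all hypothetical.

```python
# Snapshot style: when the model changes, existing rows are rewritten to match.
rows = [{"name": "Ann Lee"}, {"name": "Bob Ray"}]

def migrate_rows(rows):
    """Hypothetical model change: split 'name' into 'given' and 'family'."""
    migrated = []
    for row in rows:
        given, family = row["name"].split(" ", 1)
        migrated.append({"given": given, "family": family})
    return migrated

# Historical-record style: documents stay as written, each tagged with the
# schema version that was current when it was created.
documents = [
    {"schema": 1, "body": {"name": "Ann Lee"}},
    {"schema": 2, "body": {"given": "Bob", "family": "Ray"}},
]

def family_name(doc):
    """Read a value by dispatching on the document's own schema version."""
    if doc["schema"] == 1:
        return doc["body"]["name"].split(" ", 1)[1]
    return doc["body"]["family"]

print(migrate_rows(rows))
print([family_name(d) for d in documents])
```

In the first style the old shape of the data disappears after migration; in the second, the cost of a model change moves to the reader, which has to understand every schema version still present in the store.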

Mike Kay
Software AG  
