
Re: SV: [xml-dev] Caught napping!

To: jens.jakob.andersen@post.dk
Subject: Re: SV: [xml-dev] Caught napping!
From: Dan Weinreb <dlw@exceloncorp.com>
Date: Thu, 8 Nov 2001 11:07:38 -0500 (EST)
Cc: lgrimaldi@neocore.com, xml-dev@lists.xml.org
In-reply-to: <E6CDEF4DFA31524D83641C6B9550AF77BFDF1F@exmbxa501.postdk.net>(jens.jakob.andersen@post.dk)
Reply-to: "Dan Weinreb" <dlw@exceloncorp.com>

   Date: Thu, 8 Nov 2001 09:20:10 +0100
   From: "Jens Jakob Andersen, PDI" <jens.jakob.andersen@post.dk>

There isn't a single goal of "data modelling".  There are
different ways to model data, with different goals, for different
circumstances.  The relational model is intended to address one
certain set of needs and make certain tradeoffs.  Not all
circumstances have these same needs and call for the same tradeoffs.
I have already sent extensive mail on this topic to this mailing list,
so I'll refrain from repeating myself.

					  Stuff that the SQL crowd has been
   doing for decades, and learned to love and live with (as well as argue
   about, even mathematically), we just simply plain forgot.

But if our goals and purposes were the ones that you were tacitly
assuming, we would probably just use the relational model rather than
XML in the first place.  The goal of XML databases is not to displace
and replace relational databases.

   We need to realize that all the SQL guys have gotten used to nice,
   helpful features in the RDBMS, such as triggers, rules, joins, views,
   batch updates, import/export tools, high performance, stored procedures
   etc., that make life so much easier for you when you're working with
   data. Remember that the MS in RDBMS stands for "Management System".

You are mixing together things that are inherent in the relational
model (joins and relational views) with things that have nothing to do
with the relational model and can equally well apply to alternative
data models (triggers, rules, batch updates, import/export tools, high
performance, and stored procedures).  XML database systems can
certainly have everything in the latter category, and some of them
already have many of these things.

   Back to the data modelling issue:
   I can easily design a relational schema, and I can even validate it
   using mathematical formulas and algorithms. This is great. I can
   optimize the data structures so that, no matter what new needs my
   clients make up, I can easily accommodate them, since I didn't get
   bogged down in a document structure. And I can rebuild any structure
   from my data (if I model it correctly).

   With XML, I can make up a schema/DTD, but unfortunately I haven't got
   any ways to validate this schema. (Hey, where is the first really great
   book on XML Data/Document modelling?)

When you say you can "validate it", what exactly do you mean?  Do you
mean that you can, e.g., determine whether it's in third normal form?

   We need to develop a process for doing XML data modelling, so that it
   can become a validated science and not just a magical art. Then they
   might even begin to teach this methodology at the universities, just as
   they teach relational algebra and modelling.

It's actually not so hard to apply to XML the fundamental ideas behind
third normal form.  My favorite example is the Collaboration Partner
Agreement XML schema, from the ebXML standard.  See http://www.ebxml.org/specs/ebCCP.pdf.

   Ouch!!!!! A fact about XML structures today, is that they promote data
   redundancy. And unfortunately, storing the customer together with all
   PO's will not guarantee RI as I see it. How do you then ensure, with a
   customer with 100 PO's, that when you change the phone no. of the
   customer, it is guaranteed to be updated on all 100 PO's ?

(That's not a question of "referential integrity" at all.  It's a
question of normalization.)  It's not clear exactly what
representation you are arguing against.  If the functional
dependencies say that each customer can have many PO's but each PO is
from only one customer, you could (but don't have to!) use XML
nesting, along the lines of:

<customers>
  <customer>
    <name>Staples Inc.</name>
    <phone>1-800-555-1234</phone>
    <po> ... </po>
    <po> ... </po>
    ...
  </customer>
</customers>

So the phone number is stored only in one place, as normalization
demands.  If the relationship were many-to-many, you would not use
nesting, but some other mechanism such as ID/IDREF to represent the
relationship, and of course you'd store the customer's phone number
along with the customer information.
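
As a rough illustration of the many-to-many case, here is a minimal
Python sketch.  The document shape, the "id" and "customer-ref"
attribute names, and the second customer are all made up for the
example; a DTD would declare them as ID and IDREFS.  The standard
library's ElementTree does not validate against a DTD, so the
reference lookup is done by hand.  The point is only that the phone
number lives in one place and every PO reaches it through a reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each PO refers to one or more customers by ID
# instead of nesting inside a single customer.
DOC = """
<orders>
  <customer id="c1"><name>Staples Inc.</name><phone>1-800-555-1234</phone></customer>
  <customer id="c2"><name>Acme Corp.</name><phone>1-800-555-9999</phone></customer>
  <po number="1001" customer-ref="c1 c2"/>
  <po number="1002" customer-ref="c1"/>
</orders>
"""

root = ET.fromstring(DOC)

# Index customers by their ID attribute (what an ID declaration would give us).
customers = {c.get("id"): c for c in root.iter("customer")}

def phones_for_po(po):
    """Resolve the IDREFS-style attribute to the referenced customers' phones."""
    return [customers[ref].findtext("phone") for ref in po.get("customer-ref").split()]

# Change the phone number once, on the customer element.
customers["c1"].find("phone").text = "1-800-555-0000"

# Every PO that references c1 now sees the new number.
result = {po.get("number"): phones_for_po(po) for po in root.iter("po")}
print(result)
```

With nesting, as in the example above, the same holds without any
lookup: the <po> elements simply sit under the one <customer> that
carries the phone number.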

   > satisfactory and a real pain to maintain.)  The ability to persist and
   > manage XML "natively" may now give us the choice.  The analogy with

   Which is what I'd still like to see some real-life, industrial-strength
   examples of, that are not just XML "stores", but real XML DBMS, I mean
   XML Database Management Systems, with all the nice bells and whistles
   that the SQL crowd has gotten used to: joins, views, triggers (insert,
   update, delete etc.), rules, stored procedures and high performance.

Well, as I said above, you're not going to see "joins" because "join"
is an operation on relations, and XML isn't relations.  The concept of
"view" might have a corresponding concept in the XML world depending
on exactly what aspects of a "view" feature you want.  As for the
other things, sure, and some of it's here now and some of it's coming.
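
To make one possible reading of "view" concrete: if what you want is a
derived document that is computed from the stored one on demand, so
that it never holds its own copy of the data, a minimal Python sketch
could look like the following.  This is not how any particular XML
database does it.  The <number> child of <po>, the <purchase-orders>
and <customer-phone> element names, and the po_view function are all
hypothetical.

```python
import xml.etree.ElementTree as ET

# Stored document: nested, with the phone number in exactly one place.
STORED = """
<customers>
  <customer>
    <name>Staples Inc.</name>
    <phone>1-800-555-1234</phone>
    <po><number>1001</number></po>
    <po><number>1002</number></po>
  </customer>
</customers>
"""

def po_view(root):
    """Build a flat, per-PO document from the nested stored document."""
    view = ET.Element("purchase-orders")
    for customer in root.iter("customer"):
        for po in customer.iter("po"):
            row = ET.SubElement(view, "po")
            ET.SubElement(row, "number").text = po.findtext("number")
            # Copied into the view at query time, not stored per PO.
            ET.SubElement(row, "customer-phone").text = customer.findtext("phone")
    return view

root = ET.fromstring(STORED)
before = [row.findtext("customer-phone") for row in po_view(root)]

# Update the single stored phone number, then recompute the view.
root.find("customer/phone").text = "1-800-555-0000"
after = [row.findtext("customer-phone") for row in po_view(root)]

print(before)
print(after)
```

The view shows the phone number on every PO, but since it is
recomputed from the stored document, one update is enough.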
