> > But all this presupposes that we are designing XML documents for storage and
> > query. Most XML documents are designed for messaging of some kind (between
> > humans or between software components). Within the context of a message,
> > duplication is far less of a problem; for example, it doesn't matter if I
> > hold product code, description, and price as part of each order-line in an
> > order. Many XML databases are actually archives of such messages, so
> > duplication of data is a fact of life; and since it's an archive, the update
> > problem doesn't arise.
>
> This is the conclusion I came to.
With all due respect, I couldn't disagree more. In the case of
round-tripping, there is the classic problem of update anomalies. Since the
receiving application may not know which values are duplicates (a schema
certainly won't give a clue), there's the risk that some duplicates will be
changed by the receiving application but not others. That means some logic
has to be embedded somewhere (probably in the originating application) that
performs a validation check that all duplicate values have been modified;
otherwise, you have to pick and choose which values to ignore. That's not
easy, because one of the values may have been changed back to its original
state: was that intended as the 'final' value? Or was it just a sloppy
partial update of the other duplicates?
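To make the anomaly concrete, here is a hypothetical order message (element
and attribute names invented for illustration) in which the product data is
duplicated on each line; a receiving application that updates one copy but
not the other leaves the document internally inconsistent, and nothing in
the schema flags it:

```xml
<!-- Hypothetical order message: product 42's data appears on every line. -->
<order id="1001">
  <order-line product-code="42" description="Widget" price="9.99" qty="3"/>
  <!-- A partial update changed the price here but not above.
       Which value is the intended 'final' one? -->
  <order-line product-code="42" description="Widget" price="10.99" qty="1"/>
</order>
```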
I think if you're exchanging data documents, especially between third-party
applications, duplicates should be avoided. You can't denote them in a
schema (unless you key/keyref them all), so the result is that you have an
implicit (or narrative) understanding of what is duplicate. Bad, bad, bad.
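For what it's worth, the key/keyref parenthetical would look something like
this sketch (element and attribute names are invented for illustration):
factor the product data into one keyed list and have each order-line refer
to it, so at least a validator can check the relationship instead of leaving
it to narrative understanding:

```xml
<!-- Sketch only: a keyed product list referenced by order-lines. -->
<xs:element name="order" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="product" maxOccurs="unbounded"/>
      <xs:element name="order-line" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <!-- Each product's code must be unique within the order... -->
  <xs:key name="productKey">
    <xs:selector xpath="product"/>
    <xs:field xpath="@code"/>
  </xs:key>
  <!-- ...and each order-line must point at one of those codes. -->
  <xs:keyref name="lineProduct" refer="productKey">
    <xs:selector xpath="order-line"/>
    <xs:field xpath="@product-code"/>
  </xs:keyref>
</xs:element>
```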
For views, though, I agree: normalization isn't necessary.