Hi Folks,
Many thanks for your excellent comments!
First, I will attempt to summarize the points that were raised in your
comments. Then, in the spirit of the philosopher Karl Popper, I will boldly
propose an answer to the question: "How should XML documents be designed?"
Points Raised in Your Comments:
(1) "Design for today's applications. The future is unknown."
Implication: let your applications dictate how your XML is designed.
(2) When designing your XML, ask whether your applications operate:
- directly on the data in an XML document, or
- on the data after it has been loaded into a (relational) database.
(3) If the data is not in a form that is well-suited to processing by your
applications, then change the form of the data. From point (2), there are
two cases to consider when changing the form of the data:
Case 1: Applications operate directly on the data in an XML document
Implication: write an XSLT stylesheet to transform the (XML) data into a
form that is well-suited to processing by the applications.
Case 2: Applications operate on the data after it has been placed into a
(relational) database
Implication: modify the database (tables, primary keys, foreign keys, etc.)
Cost of changing the form of the data: Is it cheaper to change a stylesheet
or a (relational) database?
(4) "What application uses *that* markup? If there isn't one that *needs*
it, today, then get rid of it."
Implication: keep your XML design simple and free of nonessential tags.
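
To make point (3), Case 1 concrete, here is a minimal sketch of changing the form of XML data so that it suits an application. The point above calls for an XSLT stylesheet; Python's standard library has no XSLT processor, so this sketch swaps in `xml.etree.ElementTree` to do the same kind of reshaping. The element names (`orders`, `order`, `customer`, `item`, `rows`, `row`) and the function `flatten_orders` are all hypothetical, invented only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested input: each <order> wraps a <customer> and several <item>s.
NESTED = """\
<orders>
  <order id="1">
    <customer>Ann</customer>
    <item sku="A1" qty="2"/>
    <item sku="B7" qty="1"/>
  </order>
  <order id="2">
    <customer>Bob</customer>
    <item sku="A1" qty="5"/>
  </order>
</orders>
"""


def flatten_orders(xml_text):
    """Return a new <rows> element holding one flat <row> per <item>."""
    source = ET.fromstring(xml_text)
    rows = ET.Element("rows")
    for order in source.findall("order"):
        customer = order.findtext("customer", default="")
        for item in order.findall("item"):
            # Copy the parent's data down so that each row stands alone.
            ET.SubElement(rows, "row", {
                "order_id": order.get("id", ""),
                "customer": customer,
                "sku": item.get("sku", ""),
                "qty": item.get("qty", ""),
            })
    return rows


flat = flatten_orders(NESTED)
for row in flat:
    print(row.attrib)
```

The reshaping lives in one small function, which is the sense in which changing a transformation may be cheaper than changing a database schema; whether that holds in practice depends on the application.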
[Tangential Remark: The philosopher Karl Popper is well known for the idea
that a key characteristic of science is that all of its hypotheses are
testable, and that science progresses most quickly when a hypothesis is
submitted to a large audience, which then scrutinizes (tests) it. Thus, in
the spirit of Karl Popper, I propose the following hypothesis.]
Hypothesis - How to Design XML Documents
I am thoroughly persuaded by the argument that the future is much too
uncertain to try to anticipate or design for. Thus I put this down as the
first part of this hypothesis:
Part 1: Design your XML documents so that they are
well-suited for processing by your applications *today*.
In other words, how your data is going to be processed tells you how to
design your XML.
A large percentage (majority?) of applications today operate on the data
only after it is placed into a (relational) database. A smaller percentage
(minority?) of applications operate directly on the data in an XML document.
So, applying an 80-20 rule, I make this the second part of the hypothesis:
Part 2: Design your XML to be flat, with direct mappings from
XML to (relational) database tables.
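
As a rough illustration of Part 2, the sketch below loads a flat XML document into a relational table, with one element per row and one child element per column. Everything specific here is an assumption made for the example: the `books` document, the column names in `COLUMNS`, the function `load_books`, and the choice of an in-memory SQLite database via Python's standard `sqlite3` module.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical flat document: one <book> per table row, one child per column.
FLAT = """\
<books>
  <book><isbn>111</isbn><title>First</title><year>1999</year></book>
  <book><isbn>222</isbn><title>Second</title><year>2001</year></book>
</books>
"""

# Assumed column names; they are deliberately identical to the tag names.
COLUMNS = ("isbn", "title", "year")


def load_books(xml_text, conn):
    """Insert every <book> element as one row of the books table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books "
        "(isbn TEXT PRIMARY KEY, title TEXT, year INTEGER)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        tuple(book.findtext(col) for col in COLUMNS)
        for book in root.findall("book")
    ]
    conn.executemany(
        "INSERT INTO books (isbn, title, year) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_books(FLAT, conn)
titles = [title for (title,) in conn.execute("SELECT title FROM books ORDER BY isbn")]
print(count, titles)
```

Because the XML is flat, the mapping is a single loop with no joins or parent lookups; a deeply nested design would need extra code to decide which table each level belongs to.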
I am equally persuaded by the argument to keep the markup (tags) to a
minimum. So here's the third part of this hypothesis:
Part 3: Eliminate nonessential markup (tags).
Only use tags that are actually used by your applications *today*.
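
One possible way to apply Part 3 to an existing document is to list the tags your applications read today and drop everything else. The sketch below assumes such a list can be written down; the set `USED_TAGS`, the sample document, and the function `prune` are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of tags that some application actually reads today.
USED_TAGS = {"catalog", "book", "title", "price"}

DOC = """\
<catalog>
  <book>
    <title>First</title>
    <price>9.99</price>
    <futureUse>reserved</futureUse>
    <notes><internal>draft</internal></notes>
  </book>
</catalog>
"""


def prune(element, used):
    """Remove, in place, every descendant whose tag is not in `used`."""
    for child in list(element):  # iterate over a copy because we mutate
        if child.tag in used:
            prune(child, used)
        else:
            element.remove(child)  # drops the child and its whole subtree


root = ET.fromstring(DOC)
prune(root, USED_TAGS)
remaining = sorted({el.tag for el in root.iter()})
print(remaining)
```

Note that an unused element is removed together with its subtree, so a used tag nested inside an unused one disappears as well.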
To recap - when designing XML:
- be practical;
- be simple;
- don't use unnecessary tags;
- design your XML to work well with your applications *today*;
- most likely, "flatter is better".
Comments?
/Roger