Hi Folks,
Below I propose a few XML design principles. I am interested in hearing your thoughts on them: do you agree or disagree?
Which is Better Design?
Suppose that I have data about a grape vineyard. Below I show two lots on the vineyard, and a picker on one of the lots. I show two ways of designing the data. Which design is better?
Version #1
<Lot id="1">
  <ripe-grapes>4</ripe-grapes>
  <Picker id="John">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Lot>
<Lot id="2">
  <ripe-grapes>3</ripe-grapes>
</Lot>
Version #2
<Lot id="1">
  <ripe-grapes>4</ripe-grapes>
</Lot>
<Lot id="2">
  <ripe-grapes>3</ripe-grapes>
</Lot>
<Picker id="John" locatedOn="1">
  <metabolism>2</metabolism>
  <grape-wealth>20</grape-wealth>
</Picker>
Suppose that I want an application to move the Picker to Lot 2. With which version would it be easier to do this?
The above two versions both have two components:
1. A Lot component
2. A Picker component
Both versions are modular, and both represent the Picker as located on Lot 1.
However, the two versions differ in three respects:
- Implicit relationship versus explicit relationship.
- Tight coupling versus loose coupling.
- Nested (hierarchical) versus flat data.
Implicit versus Explicit Relationships
Version #1
What is the relationship between the Lot and the Picker?
<Lot id="1">
  <Picker id="John">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Lot>
You might state that the relationship is:
"The Lot contains the Picker"
Another person might state that the relationship is:
"The Lot has a Picker"
The relationship between the Lot and the Picker is implicit. Implicit relationships are bad because one person may interpret the relationship one way, while another person may interpret it another way.
Version #2
What is the relationship between the Lot and the Picker?
<Lot id="1">
  <ripe-grapes>4</ripe-grapes>
</Lot>
<Picker id="John" locatedOn="1">
  <metabolism>2</metabolism>
  <grape-wealth>20</grape-wealth>
</Picker>
Clearly the relationship is:
"The Picker is locatedOn the Lot"
The relationship (locatedOn) is explicitly specified in the instance document!
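
Because the relationship is carried by a named attribute, a program can resolve it without guessing what the nesting means. Here is a minimal Python sketch of that lookup. It assumes the fragments are wrapped in a root element (the name Vineyard is my own placeholder, since the post shows only fragments), and the helper lot_of is hypothetical.

```python
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical wrapper element; the post shows only fragments.
FLAT = """
<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
  </Lot>
  <Lot id="2">
    <ripe-grapes>3</ripe-grapes>
  </Lot>
  <Picker id="John" locatedOn="1">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>
"""

def lot_of(root, picker_id):
    """Follow a Picker's locatedOn attribute to the Lot it names (hypothetical helper)."""
    picker = root.find(f"Picker[@id='{picker_id}']")
    if picker is None:
        return None
    lot_id = picker.get("locatedOn")
    # Returns None if no Lot carries that id.
    return root.find(f"Lot[@id='{lot_id}']")

root = ET.fromstring(FLAT)
lot = lot_of(root, "John")
print(lot.get("id"), lot.findtext("ripe-grapes"))
```

The name of the attribute states the relationship; the code only has to follow it.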
XML Design Principle #1
Implicit relationships are bad. They are subject to misinterpretation.
Explicit relationships are good. They are not subject to misinterpretation.
Tight Coupling vs Loose Coupling
Compare the two versions with respect to how loosely connected the Lot and Picker components are:
Version #1
<Lot id="1">
  <Picker id="John">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Lot>
The Picker is nested (buried) within the Lot ... Tight coupling!
Version #2
<Lot id="1">
  <ripe-grapes>4</ripe-grapes>
</Lot>
<Picker id="John" locatedOn="1">
  <metabolism>2</metabolism>
  <grape-wealth>20</grape-wealth>
</Picker>
The Lot and Picker components are physically completely separate. The only connection between them is the pointer from the Picker to the Lot (the locatedOn attribute) ... Loose coupling!
XML Design Principle #2
Tight coupling is bad. It makes processing difficult.
Example: moving the Picker from Lot 1 to Lot 2 in the tightly coupled version is difficult. The Picker must be found inside Lot 1, detached, and re-attached inside Lot 2, which costs more code, memory, and processing time.
Loose coupling is good. It makes processing easy.
Example: moving the Picker from Lot 1 to Lot 2 in the loosely coupled version is easy: just change the value of locatedOn.
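
To make the two examples concrete, here is a rough Python sketch of both moves using the standard xml.etree.ElementTree module. It is only an illustration under assumptions: the root element Vineyard is a placeholder I added (the post shows fragments only), and the function names move_picker_nested and move_picker_flat are hypothetical.

```python
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical wrapper element; the post shows only fragments.
NESTED = """
<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker id="John">
      <metabolism>2</metabolism>
      <grape-wealth>20</grape-wealth>
    </Picker>
  </Lot>
  <Lot id="2">
    <ripe-grapes>3</ripe-grapes>
  </Lot>
</Vineyard>
"""

FLAT = """
<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker id="John" locatedOn="1">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>
"""

def move_picker_nested(root, picker_id, target_lot_id):
    """Version #1: find the Picker's current Lot, detach it, re-attach it under the target Lot."""
    target = root.find(f"Lot[@id='{target_lot_id}']")
    if target is None:
        raise ValueError(f"no Lot with id {target_lot_id}")
    for lot in root.findall("Lot"):
        picker = lot.find(f"Picker[@id='{picker_id}']")
        if picker is not None:
            lot.remove(picker)      # detach the whole Picker subtree
            target.append(picker)   # re-attach it under the new Lot
            return
    raise ValueError(f"no Picker with id {picker_id}")

def move_picker_flat(root, picker_id, target_lot_id):
    """Version #2: change the value of locatedOn; no subtree is moved."""
    picker = root.find(f"Picker[@id='{picker_id}']")
    if picker is None:
        raise ValueError(f"no Picker with id {picker_id}")
    picker.set("locatedOn", target_lot_id)

nested_root = ET.fromstring(NESTED)
flat_root = ET.fromstring(FLAT)
move_picker_nested(nested_root, "John", "2")
move_picker_flat(flat_root, "John", "2")
```

In this sketch the nested move has to locate the current parent and relocate a subtree, while the flat move is a single attribute update. Note that the flat version, as written, does not check that the target Lot exists.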
Nested (hierarchical) vs Flat Data
Compare these two versions with respect to the physical placement of the Lot and Picker components:
Version #1
<Lot id="1">
  <Picker id="John">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Lot>
There is a "box within a box" ... It is nested (hierarchical) data.
Version #2
<Lot id="1">
  <ripe-grapes>4</ripe-grapes>
</Lot>
<Picker id="John" locatedOn="1">
  <metabolism>2</metabolism>
  <grape-wealth>20</grape-wealth>
</Picker>
There are two separate "boxes" ... It is flat data.
XML Design Principle #3
Minimize the amount of nesting you use.
Nested data is tightly coupled and uses implicit relationships, both of which are bad.
Flat data is good data!
Flat data is loosely coupled and promotes the use of explicit relationships, both of which are good.
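
For anyone who wants to try the flat design on existing nested data, here is one possible Python sketch that rewrites Version #1 into Version #2. It assumes a wrapping root element (Vineyard is a placeholder name, not from the fragments above) and reads every nested Picker as "locatedOn" its enclosing Lot, which is itself one interpretation of the implicit relationship. The function name flatten is hypothetical.

```python
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical wrapper element; the post shows only fragments.
NESTED_DOC = """
<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker id="John">
      <metabolism>2</metabolism>
      <grape-wealth>20</grape-wealth>
    </Picker>
  </Lot>
  <Lot id="2">
    <ripe-grapes>3</ripe-grapes>
  </Lot>
</Vineyard>
"""

def flatten(root):
    """Lift each nested Picker to the top level and record its Lot in locatedOn."""
    # findall returns a list, so removing and appending while looping is safe.
    for lot in root.findall("Lot"):
        for picker in lot.findall("Picker"):
            lot.remove(picker)
            picker.set("locatedOn", lot.get("id"))  # make the relationship explicit
            root.append(picker)
    return root

flattened = flatten(ET.fromstring(NESTED_DOC))
print(ET.tostring(flattened, encoding="unicode"))
```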
Comments? /Roger