
   RE: [xml-dev] Better design: "flatter is better" or "nesting is bette

  • To: "Costello, Roger L." <costello@mitre.org>,"XML Developers List" <xml-dev@lists.xml.org>
  • Subject: RE: [xml-dev] Better design: "flatter is better" or "nesting is better" ?
  • From: "Michael Rys" <mrys@microsoft.com>
  • Date: Thu, 29 Sep 2005 12:49:51 -0700
  • Thread-index: AcXFLGtqPZ8+oUxpSKecy4iCTCDvqAAARtWA
  • Thread-topic: [xml-dev] Better design: "flatter is better" or "nesting is better" ?

Your two approaches do not convey the exact same "information structure". In approach 1, the Lot "owns" the Picker, while in approach 2 the Lot does not. In ER modelling terms, in the first case you model a 1-n weak entity relationship (where if the stronger of the two disappears, so does the weaker). In approach 2 you model a (potential) m-n relationship.
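 
The ownership difference can be made concrete with a small sketch. It is only an illustration, written in Python with the standard xml.etree.ElementTree module. The <Vineyard> wrapper element and the helper name delete_lot are assumptions added here so that the fragments from the original post parse as one document; neither appears in the thread.

```python
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical root element; the posted fragments have no root.
NESTED = """<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker id="John">
      <metabolism>2</metabolism>
      <grape-wealth>20</grape-wealth>
    </Picker>
  </Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
</Vineyard>"""

FLAT = """<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker id="John" locatedOn="1">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>"""


def delete_lot(root, lot_id):
    """Remove the Lot with the given id from the document root, if present."""
    lot = root.find(f"Lot[@id='{lot_id}']")
    if lot is not None:
        root.remove(lot)


nested = ET.fromstring(NESTED)
flat = ET.fromstring(FLAT)
delete_lot(nested, "1")
delete_lot(flat, "1")

# Nested: the Picker was part of the Lot subtree, so it disappears with the Lot.
pickers_left_nested = [p.get("id") for p in nested.iter("Picker")]

# Flat: the Picker survives, but its locatedOn now names a Lot that no longer exists.
pickers_left_flat = [(p.get("id"), p.get("locatedOn")) for p in flat.iter("Picker")]
```

In the nested document no Picker remains after the deletion, while in the flat document John remains with a reference to the deleted Lot, which the application would have to clean up or tolerate.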
 
In general, the problem you are looking at is the general data modeling question: how closely should your logical design (the XML structure in your store) be coupled with your conceptual design (the XML structure as seen by the consumer of the data)? Many of the comments you make below are appropriate and are closely related to the questions of relational data vs. XML data and relational normalization vs. denormalization.
 
Best regards
Michael


From: Costello, Roger L. [mailto:costello@mitre.org]
Sent: Thursday, September 29, 2005 12:32 PM
To: XML Developers List
Subject: [xml-dev] Better design: "flatter is better" or "nesting is better" ?

Hi Folks,
 
A while ago we discussed the issue of whether it is better to design XML in a flat fashion, or a nested fashion.  What I learned from that discussion is that there are advantages and disadvantages to both design approaches. 
 
I would like to extend the earlier discussion to address this issue,
 
      "If both design approaches have advantages and disadvantages
       then which approach should I take to design my XML?"
 
First I will quickly review the discussion of "flatter is better" versus "nested is better".  Then I will discuss "application-directed XML designs" and "operational-directed XML designs".  At the end I come to a conclusion that I am eager to hear your thoughts on.
 
Approach #1: Nested Data
 
<Lot id="1">
      <ripe-grapes>4</ripe-grapes>
      <Picker id="John">
            <metabolism>2</metabolism>
            <grape-wealth>20</grape-wealth>
      </Picker>
</Lot>
<Lot id="2">
      <ripe-grapes>3</ripe-grapes>
</Lot>
 
Approach #2: Flat Data
 
<Lot id="1">
      <ripe-grapes>4</ripe-grapes>
</Lot>
<Lot id="2">
      <ripe-grapes>3</ripe-grapes>
</Lot>
<Picker id="John" locatedOn="1">
      <metabolism>2</metabolism>
      <grape-wealth>20</grape-wealth>
</Picker>
 
Both approaches show two Lots, with a Picker on Lot #1.  Let's assume that this is just a snippet of the data.  The real data has hundreds of Lots and hundreds of Pickers (scattered around on the Lots).
 
Note that both approaches have the same "information structure" but have a different "lexical structure".
 
Approach #1 uses a nested design.  Approach #2 uses a flat design.
 
Which is a better design?  To help answer this question I shall consider how well the two approaches support different application processing.
 
Scenario A - Move Pickers 
 
Suppose that processing of your XML predominately involves doing this to the data:
 
        Move the Pickers to other Lots.
 
Approach #1 (nested design) is not well-suited for this problem. Moving a Picker requires much work: remove the Picker data from the first Lot and insert it into the other Lot; this is very expensive.
 
On the other hand, approach #2 (flat design) is ideally suited for this problem.  Moving a Picker is simply a matter of adjusting the attribute "locatedOn".
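 
To make the comparison concrete, here is a minimal sketch of the "move" operation under both designs, using Python's standard xml.etree.ElementTree. The <Vineyard> wrapper element and the function names move_picker_nested and move_picker_flat are assumptions made for illustration, since the fragments above have no root element and name no API.

```python
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical root element added so the fragments parse.
NESTED = """<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker id="John">
      <metabolism>2</metabolism>
      <grape-wealth>20</grape-wealth>
    </Picker>
  </Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
</Vineyard>"""

FLAT = """<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker id="John" locatedOn="1">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>"""


def move_picker_nested(root, picker_id, target_lot_id):
    """Nested design: detach the Picker subtree and re-attach it under the target Lot."""
    target = root.find(f"Lot[@id='{target_lot_id}']")
    if target is None:
        raise ValueError(f"no Lot with id {target_lot_id}")
    for lot in root.findall("Lot"):
        picker = lot.find(f"Picker[@id='{picker_id}']")
        if picker is not None:
            lot.remove(picker)      # structural change in the source Lot
            target.append(picker)   # structural change in the target Lot
            return
    raise ValueError(f"no Picker with id {picker_id}")


def move_picker_flat(root, picker_id, target_lot_id):
    """Flat design: the move is a single attribute update on the Picker."""
    picker = root.find(f"Picker[@id='{picker_id}']")
    if picker is None:
        raise ValueError(f"no Picker with id {picker_id}")
    picker.set("locatedOn", target_lot_id)


nested = ET.fromstring(NESTED)
move_picker_nested(nested, "John", "2")

flat = ET.fromstring(FLAT)
move_picker_flat(flat, "John", "2")
```

The nested version has to touch two Lots and relocate a whole subtree, whereas the flat version changes one attribute value and leaves the document structure alone.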
 
Scenario B - Search for Pickers
 
Next, suppose that processing of your XML predominately involves doing this to the data:
  
        Search for all Pickers on Lot n.
 
Approach #1 (nested design) is ideally suited for this problem.  Simply locate Lot n.  Then all the Pickers that are on the Lot can be found by simply looking inside the Lot.
 
Approach #2 (flat design) is much less suitable for this problem.  An exhaustive search of each Picker must be made to determine if its "locatedOn" attribute points to Lot n.
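 
The search scenario can be sketched the same way, again with Python's standard xml.etree.ElementTree. The <Vineyard> wrapper and the function names are assumptions made for illustration. Note that without an index, locating Lot n is itself a scan over the Lots; the difference is that the nested design then reads the Pickers directly from that Lot, while the flat design must test every Picker in the document.

```python
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical root element added so the fragments parse.
NESTED = """<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker id="John">
      <metabolism>2</metabolism>
      <grape-wealth>20</grape-wealth>
    </Picker>
  </Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
</Vineyard>"""

FLAT = """<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker id="John" locatedOn="1">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>"""


def pickers_on_lot_nested(root, lot_id):
    """Nested design: find the Lot, then list the Picker children inside it."""
    lot = root.find(f"Lot[@id='{lot_id}']")
    if lot is None:
        return []
    return [picker.get("id") for picker in lot.findall("Picker")]


def pickers_on_lot_flat(root, lot_id):
    """Flat design: examine every Picker and keep those whose locatedOn matches."""
    return [
        picker.get("id")
        for picker in root.findall("Picker")
        if picker.get("locatedOn") == lot_id
    ]


nested_result = pickers_on_lot_nested(ET.fromstring(NESTED), "1")
flat_result = pickers_on_lot_flat(ET.fromstring(FLAT), "1")
empty_nested = pickers_on_lot_nested(ET.fromstring(NESTED), "2")
empty_flat = pickers_on_lot_flat(ET.fromstring(FLAT), "2")
```

Both functions return the same answer for the sample data; what differs is how much of the document each one has to visit to produce it.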
 
---
 
Advantages of Approach #1 (Nested Design)
 
1. Easy to locate all Pickers on a Lot.
2. Easy to read, since the data is in-lined.
 
Advantages of Approach #2 (Flat Design)
 
1. Easy to move the Pickers.
2. Easy to map into relational tables.
 
The above is not a complete list of advantages, but you get the idea that each approach has its advantages.
 
Summary
 
The above example reveals two things:
 
- how you design your XML can significantly impact its processing by applications.
- designing your XML in one fashion may make it well-suited for processing by an application, but poorly suited for processing by a different application.
 
Hypothesis #1 - Let your Applications Dictate how your XML should be Designed
 
I will design my XML in accordance with how it will be processed by my applications.  For example, if my applications predominately operate by moving the Pickers around then I will design my XML in a flat fashion.  On the other hand, if my applications spend most of their time searching for Pickers on a Lot then I will design my XML in a nested fashion.
 
Fallacy of Hypothesis #1
 
Your XML design may be ideal with today's applications, but will the XML design be ideal for future applications?  Future applications may be quite different, and the XML design may be very ill-suited. 
 
For example, suppose that today you are using a relational database, so you design your XML in a flat fashion to enable easy mapping from XML to database.  But suppose that in the future you no longer store your data in a relational database, and instead store it in a native XML database.  Your original design may not be well-suited for the native XML database.
 
In other words, designing XML based on current application processing requirements is putting your XML in peril of future application processing requirements.  Your XML is not "future-proof"!
 
Hypothesis #2 - Don't let your Applications Dictate how your XML should be Designed
 
To "future-proof" your XML, don't base your XML design on current application processing requirements.
 
Hypothesis #3 - XML is Simply a "Transport Syntax"
 
XML is just a syntax for use in transporting data across the wire (e.g., across the Internet).  What happens at either end (that is, how the XML is processed) is irrelevant.
 
Hypothesis #4 - XML is Highly "Morphable"
 
Using a technology such as XSLT, an XML document can be transformed into another form.  Thus, if the current form of an XML document is ill-suited to an application then the application should first transform it into a form for which it is better-suited.
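 
As a rough illustration of such a transformation, the sketch below reshapes the flat design into the nested one. It is written in Python with the standard xml.etree.ElementTree rather than XSLT, purely so that it is self-contained and runnable; an XSLT stylesheet could express the same mapping. The <Vineyard> wrapper, the function name flat_to_nested, and the handling of a Picker whose Lot cannot be found are all assumptions of this sketch.

```python
import copy
import xml.etree.ElementTree as ET

# <Vineyard> is a hypothetical root element added so the fragment parses.
FLAT = """<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker id="John" locatedOn="1">
    <metabolism>2</metabolism>
    <grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>"""


def flat_to_nested(flat_root):
    """Return a new tree with each Picker nested in the Lot named by its locatedOn."""
    nested_root = ET.Element(flat_root.tag)

    # Copy every Lot into the new tree and remember it by id.
    lots_by_id = {}
    for lot in flat_root.findall("Lot"):
        lot_copy = copy.deepcopy(lot)
        lots_by_id[lot_copy.get("id")] = lot_copy
        nested_root.append(lot_copy)

    # Place each Picker inside its Lot; nesting now carries the location,
    # so the locatedOn attribute is dropped.
    for picker in flat_root.findall("Picker"):
        picker_copy = copy.deepcopy(picker)
        parent = lots_by_id.get(picker_copy.get("locatedOn"))
        if parent is None:
            # Assumption: a Picker with an unknown Lot stays at the top level,
            # keeping its locatedOn attribute unchanged.
            nested_root.append(picker_copy)
        else:
            del picker_copy.attrib["locatedOn"]
            parent.append(picker_copy)
    return nested_root


nested = flat_to_nested(ET.fromstring(FLAT))
print(ET.tostring(nested, encoding="unicode"))
```

The input tree is left untouched, so an application can keep the received form and derive the form it prefers from it.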
 
Hypothesis #5 - Design your XML to an "Operational" Perspective
 
A photographer has a certain way of thinking about photography data, i.e., he has an operational perspective.
 
A military pilot has a certain way of thinking about combat missions, i.e., he has an operational perspective.
 
Design your XML in the same way that an operational user (e.g., photographer, military pilot) thinks about the real world.  Thus, the XML is a mirror of the real world.
 
Conclusions
 
1. Design your XML in the same way that an operational user thinks about the real world.
 
2. Ship your XML across the wire in the operational design form.
   
            XML Design for the wire --> Operational Design
 
3. Consider an application (end-point) that receives your "operationally-formatted" XML.  If the XML is in a form that is ill-suited to the application's processing needs then the application should transform it into a form for which it is better-suited.
 
           XML Design for an endpoint --> Application-specific Design (obtained by transforming the operational design)
 
Comments?  /Roger
 
 




 
