xml-dev - RE: [xml-dev] [Summary] Better design: "flatter is better" or "nesting i

To: "Costello, Roger L." <costello@mitre.org>, "XML Developers List" <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] [Summary] Better design: "flatter is better" or "nesting is better" ?
From: "Nathan Young \(natyoung\)" <natyoung@cisco.com>
Date: Mon, 10 Oct 2005 10:54:10 -0700
Thread-index: AcXNknBtYJkrUah8TKmSkDqJ1tcVTgALZ/Cw
Thread-topic: [xml-dev] [Summary] Better design: "flatter is better" or "nesting is better" ?
Hi.

I think this topic is incomplete without some comparison of each
format's various abilities to convey information.  The use cases you
give for the vineyard example all use a set of information that CAN be
represented by either format.  But what if lots overlap?  Or, if the
data is representing what the picker did this week, what if he worked
half time on one lot and half on the other?

This moves away from the question of "suitability" and into just plain
"ability": the nested version simply can't represent that data. (This was
a big sticking point for me when I was introduced to XML as anything
other than document markup; I had been very strongly convinced of the
value of normalizing data.)
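
To make the "ability" point concrete, here is a minimal Python sketch. The
WorkShare element and its picker, lot, and fraction attributes are
hypothetical (they are not part of the vineyard format under discussion);
they only illustrate how a reference-based layout could record a picker
split across two lots, something a single parent-child placement cannot
express without duplicating the Picker.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat document: WorkShare and its attributes are invented
# for illustration and are not part of the original vineyard example.
FLAT = """
<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker id="john"><name>John</name></Picker>
  <WorkShare picker="john" lot="1" fraction="0.5"/>
  <WorkShare picker="john" lot="2" fraction="0.5"/>
</Vineyard>
"""

def lots_worked(root, picker_id):
    """Return {lot id: fraction of the week} for one picker."""
    return {
        share.get("lot"): float(share.get("fraction"))
        for share in root.iter("WorkShare")
        if share.get("picker") == picker_id
    }

root = ET.fromstring(FLAT)
shares = lots_worked(root, "john")
print(shares)
```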

The lavish versus spartan conversation brought this up even more
profoundly: where the "BookTitle" element seems to add NO information,
the "Chapters" element might.  Part of the reason it can be easier
to take away markup is that doing so creates a lossy conversion.
Again it turns from a question of "easy versus hard" to "possible versus
impossible".

If the scope you have chosen is "how to choose between two formats
between which you can convert losslessly" then forgive this as a tangent
(though it might help to make this explicit).

---->N


> -----Original Message-----
> From: Costello, Roger L. [mailto:costello@mitre.org] 
> Sent: Monday, October 10, 2005 5:02 AM
> To: XML Developers List
> Subject: [xml-dev] [Summary] Better design: "flatter is 
> better" or "nesting is better" ?
> 
> Hi Folks,
>  
> Once again, thank you for your outstanding comments on this 
> topic.  Below is my attempt at a summary of our discussions.  /Roger
>  
> 
> Better Design: "Flatter is Better" or "Nesting is Better" ?
> 
> 
> Table of Contents
> 
> 
> 1.	Synonyms of the terms "Nested Design" and "Flat Design"
> 2.	The Nested Design
> 3.	The Flat Design
> 4.	Example of an Application that is better-suited to an
> XML document that uses the Nested Design
> 5.	Example of an Application that is better-suited to an
> XML document that uses the Flat Design
> 6.	The Relationship between Applications and XML documents
> 7.	Example of an XML document whose Destiny is to have its
> Data Mapped into a Relational Database
> 8.	Example of an XML document whose Destiny is to have its
> Data Mapped into a Programming Data Structure
> 9.	Should you be Lavish or Spartan with the use of Markup?
> 10.	Should you attempt to "Futureproof" your XML?
> 11.	Conclusions
> 12.	Acknowledgements
> 
> 
> Synonyms
> 
> 
> Nested design:
>     hierarchical design
> 
> Flat design:
>     relational design, id/idref based design
> 
> 
> The Nested Design
> 
> 
> The nested design is characterized by structuring data using 
> parent-child relationships.
> 
> Example: Consider a grape Vineyard comprised of Lots with 
> Pickers scattered about on the Lots. Here is how the Vineyard 
> data might be structured using the nested design:
> 
> <Vineyard>
>       <Lot id="1"> 
>            <ripe-grapes>4</ripe-grapes> 
>            <Picker> 
>                  <name>John</name>
>                  <metabolism>2</metabolism> 
>                  <grape-wealth>20</grape-wealth> 
>            </Picker> 
>      </Lot> 
>      <Lot id="2"> 
>            <ripe-grapes>3</ripe-grapes> 
>      </Lot>
>      ... 
> </Vineyard>
> 
> Two Lots are shown. Lot 1 has a Picker (John) on it. Note 
> that the Picker data is nested directly within the Lot.
> 
> 
> The Flat Design
> 
> 
> With the flat design there is minimal usage of parent-child 
> relationships. Instead, data fragment A is related to data 
> fragment B by fragment A identifying itself, and fragment B 
> referencing it. Here is how the Vineyard data might be 
> structured using the flat design:
> 
> <Vineyard>
>      <Lot id="1"> 
>            <ripe-grapes>4</ripe-grapes> 
>      </Lot> 
>      <Lot id="2"> 
>            <ripe-grapes>3</ripe-grapes> 
>      </Lot> 
>      ...
>      <Picker locatedOn="1"> 
>            <name>John</name>
>            <metabolism>2</metabolism> 
>            <grape-wealth>20</grape-wealth> 
>      </Picker>
>      ... 
> </Vineyard>
> 
> This version also has Picker John on Lot 1. However, it 
> accomplishes this through the use of an id/idref mechanism: 
> each Lot is uniquely identified, and Picker John references 
> (using the locatedOn attribute) the Lot that he is on.
> 
> 
> Processing the Vineyard by an Application
> 
> 
> Above we examined two ways of structuring the Vineyard data - 
> using a nested design and using a flat design. Suppose that 
> an application receives a Vineyard XML document. Depending on 
> what the application wants to do with the data, one design 
> may be better suited than the other.
> 
> 
> Application 1 - Find all the Pickers on a Lot
> 
> 
> Suppose that the application wants to locate all the Pickers 
> that are on a Lot. For example, suppose that the application 
> wants to locate all the Pickers on Lot 1. Which design is 
> better suited to this application?
> 
> Suitability of the Nested Design: Locating the Pickers on Lot 
> 1 is simply a matter of navigating to Lot 1 and then 
> selecting the Picker children. 
> 
> Suitability of the Flat Design: To locate the Pickers on Lot 
> 1 requires examining each Picker and checking to see if its 
> locatedOn attribute references Lot 1. 
> 
> Conclusion: The Nested Design is better suited to an 
> application that needs to locate Pickers on a Lot.
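
As a rough illustration of the two lookups just described, here is a small
Python sketch using the standard library's xml.etree.ElementTree. The
function names are my own, and the documents are trimmed copies of the
vineyard examples.

```python
import xml.etree.ElementTree as ET

NESTED = """
<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker><name>John</name></Picker>
  </Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
</Vineyard>
"""

FLAT = """
<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker locatedOn="1"><name>John</name></Picker>
</Vineyard>
"""

def pickers_nested(root, lot_id):
    # Navigate to the Lot, then take its Picker children.
    path = f"./Lot[@id='{lot_id}']/Picker"
    return [p.findtext("name") for p in root.findall(path)]

def pickers_flat(root, lot_id):
    # Examine the Pickers and keep those whose locatedOn references the Lot.
    path = f"./Picker[@locatedOn='{lot_id}']"
    return [p.findtext("name") for p in root.findall(path)]

nested_names = pickers_nested(ET.fromstring(NESTED), "1")
flat_names = pickers_flat(ET.fromstring(FLAT), "1")
print(nested_names, flat_names)
```

Both lookups are short to write. The difference is in how much of the
document has to be examined: without an index, the flat version tests
every Picker in the Vineyard.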
> 
> 
> Application 2 - Move a Picker to a Different Lot
> 
> 
> Suppose that the application wants to move the Pickers from 
> their current Lot onto a different Lot. For example, suppose 
> that the application wants to move Picker John from Lot 1 to 
> Lot 15. Which design is better suited to this application?
> 
> Suitability of the Nested Design: To move Picker John from 
> Lot 1 to Lot 15 involves a considerable amount of work, as 
> the Picker must be extracted from Lot 1 and inserted into Lot 15.
> 
> Suitability of the Flat Design: Moving Picker John is simply 
> a matter of changing his locatedOn attribute value from 1 to 15.
> 
> Conclusion: The Flat Design is better suited to an 
> application that needs to move Pickers from their current Lot 
> onto a different Lot.
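
A sketch of the two moves, again in Python with xml.etree.ElementTree.
The helper names move_nested and move_flat are hypothetical, and Pickers
are matched by name only because the example gives them no other
identifier.

```python
import xml.etree.ElementTree as ET

NESTED = """<Vineyard>
  <Lot id="1"><Picker><name>John</name></Picker></Lot>
  <Lot id="15"/>
</Vineyard>"""

FLAT = """<Vineyard>
  <Lot id="1"/>
  <Lot id="15"/>
  <Picker locatedOn="1"><name>John</name></Picker>
</Vineyard>"""

def move_nested(root, name, src_id, dst_id):
    # Extract the Picker element from the source Lot and re-insert it
    # under the destination Lot.
    src = root.find(f"./Lot[@id='{src_id}']")
    dst = root.find(f"./Lot[@id='{dst_id}']")
    for picker in src.findall("Picker"):
        if picker.findtext("name") == name:
            src.remove(picker)
            dst.append(picker)

def move_flat(root, name, dst_id):
    # Only the reference changes; the Picker element stays where it is.
    for picker in root.findall("Picker"):
        if picker.findtext("name") == name:
            picker.set("locatedOn", dst_id)

nested_root = ET.fromstring(NESTED)
move_nested(nested_root, "John", "1", "15")

flat_root = ET.fromstring(FLAT)
move_flat(flat_root, "John", "15")
```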
> 
> 
> Lessons Learned
> 
> 
> The above example reveals two things:
> 
> *	How you design your XML can significantly impact its 
> processing by applications. 
> *	Designing your XML in one fashion may make it 
> well-suited for processing by one application, but poorly 
> suited for processing by a different application. 
> 
> 
> Relationship between an Application and an XML Document
> 
> 
> There are three ways to categorize the relationship between 
> an application and an XML document:
> 
> 1.	The application operates directly on the XML document. 
> 2.	The XML document is transformed into some other format 
> (language objects, relational database, etc), and the 
> application operates on the data in that other format. 
> 3.	The XML document is the application. 
> 
> In the above Vineyard example we saw two applications. Both 
> applications operated directly on the XML document. The XML 
> document is akin to a database - it stores data that is 
> queried and processed by applications.
> 
> On the other hand, sometimes an application does not operate 
> directly on the XML document. Instead, the data in the XML 
> document is transformed into some other format, and the 
> application then operates on the data in that format. For 
> example, the data in the XML document is extracted and placed 
> into tables in a relational database. The application then 
> operates on the data in the database. Another common scenario 
> is that the data in the XML document is extracted and placed 
> into data structures in a program. In both of these 
> situations the XML document is merely serving as a transport 
> syntax (i.e., the XML is simply "bits on the wire").
> 
> Sometimes, the XML document is the application! For example, 
> an XSLT document is an XML document, and it is also an application.
> 
> 
> When the XML is being used just as a Transport Syntax, which 
> is Better - Nested or Flat?
> 
> 
> Above we discussed the case where an application operates 
> directly upon the XML document. We saw that sometimes it is 
> best to design the XML document using a nested design, other 
> times it is best to design the XML document using a flat 
> design. The application was the deciding factor.
> 
> Now let's consider the case where the XML document is merely 
> a transport syntax. Applications don't operate directly on 
> the XML document. Instead, they operate on the data after it 
> has been transformed into another format. Let's examine two 
> formats that XML documents are commonly transformed into.
> 
> 
> Format 1: The XML Data is stored into a Relational Database
> 
> 
> Let's suppose that a database contains these two tables for 
> storing the Vineyard data:
> 
> Lot Table
> Lot id   Ripe Grapes   Picker id
> ...      ...           ...
> 53       3             Pete
> ...      ...           ...
> 
> Picker Table
> Picker id   Metabolism   Grape Wealth
> ...         ...          ...
> Pete        1            10
> ...         ...          ...
> 
> Let's consider the XML documents to determine which design is 
> better-suited. Note that the XML is merely a (temporary) 
> transport syntax. Also note that the destiny of the XML data 
> is a relational database (containing the above two tables).
> 
> Here is the XML document that uses the nested design:
> 
> <Vineyard>
>       <Lot id="1"> 
>            <ripe-grapes>4</ripe-grapes> 
>            <Picker> 
>                  <name>John</name>
>                  <metabolism>2</metabolism> 
>                  <grape-wealth>20</grape-wealth> 
>            </Picker> 
>      </Lot> 
>      <Lot id="2"> 
>            <ripe-grapes>3</ripe-grapes> 
>      </Lot>
>      ... 
> </Vineyard>
> 
> Although it is achievable, it will clearly entail some effort 
> to map the XML document into the tables. And for an XML 
> document that has a great deal of nesting the mapping could 
> be quite difficult.
> 
> Here is the XML document that uses the flat design:
> 
> <Vineyard>
>      <Lot id="1"> 
>            <ripe-grapes>4</ripe-grapes> 
>      </Lot> 
>      <Lot id="2"> 
>            <ripe-grapes>3</ripe-grapes> 
>      </Lot> 
>      ...
>      <Picker locatedOn="1"> 
>            <name>John</name>
>            <metabolism>2</metabolism> 
>            <grape-wealth>20</grape-wealth> 
>      </Picker>
>      ... 
> </Vineyard>
> 
> This XML document can be directly mapped into the tables.
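
One way that mapping might look, sketched in Python with the standard
sqlite3 module. The table and column names (lot, picker, lot_id, and so
on) are my own spelling of the two tables above, and using the Picker's
name as the picker id is an assumption taken from the "Pete" row.

```python
import sqlite3
import xml.etree.ElementTree as ET

FLAT = """<Vineyard>
  <Lot id="1"><ripe-grapes>4</ripe-grapes></Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
  <Picker locatedOn="1">
    <name>John</name><metabolism>2</metabolism><grape-wealth>20</grape-wealth>
  </Picker>
</Vineyard>"""

def load(root, conn):
    conn.execute("CREATE TABLE lot (lot_id INTEGER PRIMARY KEY, "
                 "ripe_grapes INTEGER, picker_id TEXT)")
    conn.execute("CREATE TABLE picker (picker_id TEXT PRIMARY KEY, "
                 "metabolism INTEGER, grape_wealth INTEGER)")
    # Each Lot element becomes one row of the lot table.
    for lot in root.findall("Lot"):
        conn.execute("INSERT INTO lot (lot_id, ripe_grapes) VALUES (?, ?)",
                     (int(lot.get("id")), int(lot.findtext("ripe-grapes"))))
    # Each Picker element becomes one picker row; its locatedOn
    # reference fills in the picker_id column of the matching lot row.
    for picker in root.findall("Picker"):
        name = picker.findtext("name")
        conn.execute("INSERT INTO picker VALUES (?, ?, ?)",
                     (name, int(picker.findtext("metabolism")),
                      int(picker.findtext("grape-wealth"))))
        conn.execute("UPDATE lot SET picker_id = ? WHERE lot_id = ?",
                     (name, int(picker.get("locatedOn"))))

conn = sqlite3.connect(":memory:")
load(ET.fromstring(FLAT), conn)
lot_rows = conn.execute(
    "SELECT lot_id, ripe_grapes, picker_id FROM lot ORDER BY lot_id").fetchall()
picker_rows = conn.execute("SELECT * FROM picker").fetchall()
print(lot_rows, picker_rows)
```

Each top-level element corresponds to one row, and the locatedOn
reference plays the part of a foreign key.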
> 
> 
> Format 2: The XML Data is Stored into Program Data Structures
> 
> 
> Suppose that a program is using this data structure to store 
> the vineyard data:
> 
> struct Lot {
>     int lot_id;
>     int ripe_grapes;
>     struct {
>         std::string name;   // requires #include <string>
>         int metabolism;
>         int grape_wealth;
>     } picker;               // the Picker located on this Lot
> };
> 
> The XML document that uses the nested design maps directly to 
> this data structure. The XML document that uses the flat 
> design would require a bit of manipulation to map into this 
> data structure.
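
For comparison, a sketch of the nested document being read into
equivalent Python structures. The Lot and Picker dataclasses are
hypothetical stand-ins for the struct above; holding the Pickers in a
list, rather than as a single nested member, is my assumption so that a
Lot with no Picker can still be represented.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class Picker:
    name: str
    metabolism: int
    grape_wealth: int

@dataclass
class Lot:
    lot_id: int
    ripe_grapes: int
    pickers: list = field(default_factory=list)

NESTED = """<Vineyard>
  <Lot id="1">
    <ripe-grapes>4</ripe-grapes>
    <Picker>
      <name>John</name><metabolism>2</metabolism><grape-wealth>20</grape-wealth>
    </Picker>
  </Lot>
  <Lot id="2"><ripe-grapes>3</ripe-grapes></Lot>
</Vineyard>"""

def lots_from_nested(root):
    # The element tree and the data structure have the same shape, so
    # each Lot element is converted together with its Picker children.
    lots = []
    for lot_el in root.findall("Lot"):
        pickers = [
            Picker(p.findtext("name"),
                   int(p.findtext("metabolism")),
                   int(p.findtext("grape-wealth")))
            for p in lot_el.findall("Picker")
        ]
        lots.append(Lot(int(lot_el.get("id")),
                        int(lot_el.findtext("ripe-grapes")),
                        pickers))
    return lots

lots = lots_from_nested(ET.fromstring(NESTED))
print(lots)
```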
> 
> 
> Lessons Learned
> 
> 
> Where the XML document is used as a transport syntax you need 
> to consider the destination for the data. If the destination 
> for the data is a set of relational database tables then the 
> flat design is likely to be better-suited. If the destination 
> for the data is programmatic data structures then the 
> better-suited design could be either the nested design or the 
> flat design.
> 
> 
> Should you be Lavish or Spartan with the use of Markup?
> 
> 
> There are two philosophies with regard to the use of markup. 
> One philosophy is to use markup sparingly. If there is not a 
> definite purpose for a tag, then it shouldn't be used. The 
> other philosophy is to maximize the use of markup, as markup 
> can make processing of the XML document easier, and provides 
> more organization to the data.
> 
> 
> Example of Lavish Markup and Spartan Markup
> 
> 
> Here is an XML document that uses lavish markup:
> 
> <Book>
>     <BookTitle>
>         <Title>The XML Bible</Title>
>     </BookTitle>
>     <Chapters>
>         <Chapter>
>             <Title>An Eagle's Eye View of XML</Title>
>             <Section>
>                 <Title>What is XML?</Title>
>                 <Subsection>
>                     <Title>XML is a meta-markup language</Title>
>                     <Title>XML describes structure and semantics, not formatting</Title>
>                 </Subsection>
>             </Section>
>             <Section>
>                 <Title>Why are Developers Excited About XML?</Title>
>                 <Subsection>
>                     <Title>Design of field-specific markup languages</Title>
>                     <Title>Self-describing data</Title>
>                     <Title>Interchange of data among applications</Title>
>                     <Title>Structured and integrated data</Title>
>                 </Subsection>
>             </Section>
>             ...
>         </Chapter>
>         ...
>     </Chapters>
> </Book>
> 
> Here is the same data with some tags stripped out 
> (i.e., it uses spartan markup):
> 
> <Book>
>     <Title>The XML Bible</Title>
>     <Chapter>
>         <Title>An Eagle's Eye View of XML</Title>
>         <Section>
>             <Title>What is XML?</Title>
>             <Subsection>
>                 <Title>XML is a meta-markup language</Title>
>                 <Title>XML describes structure and semantics, not formatting</Title>
>             </Subsection>
>         </Section>
>         <Section>
>             <Title>Why are Developers Excited About XML?</Title>
>             <Subsection>
>                 <Title>Design of field-specific markup languages</Title>
>                 <Title>Self-describing data</Title>
>                 <Title>Interchange of data among applications</Title>
>                 <Title>Structured and integrated data</Title>
>             </Subsection>
>         </Section>
>         ...
>     </Chapter>
>     ...
> </Book>
> 
> Note that the lavish markup version uses <BookTitle> and 
> <Chapters>. These tags are considered unnecessary and thus 
> omitted in the spartan markup version.
> 
> Which is better - lavish markup or spartan markup? There is 
> no clear answer. It should be noted, however, that "it's 
> easier to remove markup than add markup". So, as the above 
> XML is passed from application to application, an application 
> can easily remove the <BookTitle> and <Chapters> tags if it wants to.
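
A minimal sketch of that removal in Python with xml.etree.ElementTree.
The unwrap helper is hypothetical, and the document is a shortened copy
of the lavish example.

```python
import xml.etree.ElementTree as ET

LAVISH = """<Book>
  <BookTitle><Title>The XML Bible</Title></BookTitle>
  <Chapters>
    <Chapter><Title>An Eagle's Eye View of XML</Title></Chapter>
  </Chapters>
</Book>"""

def unwrap(parent, tag):
    """Replace each direct child named `tag` with that child's own children."""
    # Walk the children in reverse so earlier indexes stay valid
    # while wrappers are removed and their contents spliced in.
    for index, child in reversed(list(enumerate(parent))):
        if child.tag == tag:
            parent.remove(child)
            for offset, grandchild in enumerate(child):
                parent.insert(index + offset, grandchild)

book = ET.fromstring(LAVISH)
unwrap(book, "BookTitle")
unwrap(book, "Chapters")
tags = [child.tag for child in book]
print(tags)
```

Going the other way is not symmetric: a program that receives only the
spartan form has to be told which wrapper names to reintroduce and which
children belong inside them.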
> 
> 
> Should you attempt to "Futureproof" your XML?
> 
> 
> Answer: No. The future is unknown and unknowable. It's best 
> to get the simplest possible XML working that meets today's 
> requirements. Make changes in the future as required.
> 
> 
> Conclusions
> 
> 
> What's the best way to design an XML document? Here are some 
> things to consider:
> 
> *	Will applications directly operate on the XML document? 
> Is the XML document intended to serve as a storage medium 
> that is to be queried by applications? 
> *	Or is the XML document merely a transport syntax? Once 
> it arrives at its destination the data is extracted and 
> placed into another format, and applications process the data 
> in that format. 
> *	In the case of applications operating directly on the 
> XML document, you must consider the types of processing that 
> the application will perform. Some processing is 
> better-suited with XML documents that use a nested design, 
> while other processing is better-suited with XML documents 
> that use a flat design. 
> *	In the case of XML documents that are merely transport 
> syntax, you need to consider the destiny of the data, that 
> is, you need to consider the format that the data will be 
> placed into. 
> 
> 
> Acknowledgements
> 
> 
> Many thanks to the following people who participated in the 
> discussions:
> 
> *	Len Bullard 
> *	Michael Champion 
> *	Joe Chiusano 
> *	Cheryl Connors 
> *	Roger Costello 
> *	Mary Holstege 
> *	Diane Howard 
> *	Peter Hunsberger 
> *	Rick Jelliffe 
> *	Michael Kay 
> *	Ken Laskey 
> *	Anne Thomas Manes 
> *	Ken North 
> *	Dave Pawson 
> *	Michael Rys 
> *	Scott Renner 
> *	Doug Schepers 
> *	Andrezej Jan Tarimina 
> *	Dan Vint 
> *	Nathan Young 
> 
>