xml-dev - Re: [xml-dev] Designing XML to Support Information Evolution

To: "Roger L. Costello" <costello@mitre.org>
Subject: Re: [xml-dev] Designing XML to Support Information Evolution
From: "Chiusano Joseph" <chiusano_joseph@bah.com>
Date: Mon, 17 May 2004 10:34:22 -0400
Cc: xml-dev@lists.xml.org
Organization: Booz Allen Hamilton
References: <000201c43c1a$212cfff0$10395381@MITRE.ORG>

Welcome back Roger,

Ah, a posting about vineyards...just in time for my vacation in Naples,
Italy next week. ;)

This makes a lot of sense to me - please see comments below (far down).

Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton
Strategy and Technology Consultants to the World

> "Roger L. Costello" wrote:
> 
> Hi Folks,
> 
> For the past 4 months I have been working on a demo of a Vineyard in
> which Pickers move around, harvest ripe grapes, eat, and even die.  In
> the process of building this demo I have learned some things regarding
> XML design, which I would like to share.
> 
> When I first started working on the demo I thought that the best way
> to design the XML for the Vineyard/Picker system was a "classical"
> highly structured, hierarchical design.  In fact, this turned out to
> be the worst approach.  It was rigid, it made processing the
> information (e.g., moving the Pickers around, harvesting ripe grapes,
> eating, death) horribly complex, and I wanted to be able to process
> the Vineyard lots in a parallel fashion, which this design totally
> prohibited.  Here was my first design:
> 
> <vineyard>
>     <tract num="1">
>         <lot num="1">
>             ...
>         </lot>
>         <lot num="2">
>             ...
>             <picker id="36">  <!-- Picker #36 on lot #2, tract #1 -->
>                 ...
>             </picker>
>         </lot>
>         ...
>         <lot num="50">...</lot>
>     </tract>
>     ...
>     <tract num="50">...</tract>
> </vineyard>
> 
> As you can see, this design is classical structured data:
>      - the vineyard is comprised of multiple tracts
>      - each tract is comprised of multiple lots
>      - a lot may contain a picker
> 
> Several thousand lines of XSLT code later I decided it was time to
> dump this design.
> 
> My next design "flattened" things out a bit.  I put the Pickers
> physically after the tracts, and each Picker referenced the tract/lot
> that they resided upon using a couple of "ref attributes".  This made
> "moving" the Pickers easy - simply adjust the references.  Here was my
> second design:
> 
> <vineyard>
>     <tract num="1">...</tract>
>     <tract num="2">...</tract>
>     ...
>     <tract num="50">...</tract>
>     <picker id="1">
>         <location tract-ref="13" lot-ref="48"/>
>         ...
>     </picker>
>     ...
>     <picker id="400">
>         <location tract-ref="21" lot-ref="4"/>
>         ...
>     </picker>
> </vineyard>
> 
> With this design my XSLT code dropped from several thousand lines to
> about a thousand lines.  However, this design was still too rigid, and
> made parallel processing of the lots impossible.
> 
> Here is the design that I finally arrived at.  It is extremely
> flexible, amenable to parallel processing, and the code to manipulate
> it is very simple (a couple hundred lines of simple XSLT code).
> 
> <vineyard>
>     <lot tract-num="23" lot-num="5">...</lot>
>     <picker id="36">
>         <location tract-ref="12" lot-ref="29"/>
>          ...
>     </picker>
>     <lot tract-num="3" lot-num="24">...</lot>
>     ...
>     <lot tract-num="1" lot-num="49">...</lot>
> </vineyard>
> 
> The lots have 2 attributes to identify their location.
> Each picker has a location element that has 2 attributes to identify
> the lot it resides on.
> 
> Notice that it is an extremely flat structure:
>     - a vineyard is comprised of lots and pickers (no more <tract> elements)

Of course, this works here because a tract carries no information beyond
its number. If tracts had additional properties, you would need a
separate set of <tract> elements, with subelements representing those
properties.
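
As a rough sketch of that case: assuming a hypothetical tract property
such as a soil type (my invention, not something in Roger's demo), the
flat design might carry one <tract> element per tract next to the lots,
with each lot still pointing at its tract through tract-num:

```xml
<vineyard>
    <!-- Hypothetical: tract properties live in their own flat element -->
    <tract num="23">
        <soil>clay</soil>  <!-- assumed property, for illustration only -->
    </tract>
    <lot tract-num="23" lot-num="5">...</lot>
    <picker id="36">
        <location tract-ref="12" lot-ref="29"/>
        ...
    </picker>
</vineyard>
```

The lots and pickers stay flat and order-independent; a processor would
look up tract 23 by number only when it needs the soil value.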

> Notice that it is an extremely flexible structure:
>     - the order of the lots and pickers is irrelevant
> 
> With this design I can now process each lot on the vineyard in
> parallel.  The other designs forced sequential processing.
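
To illustrate why the flat design lends itself to this, here is a
minimal XSLT 1.0 sketch. It is not Roger's stylesheet; the key name
"pickers-on-lot" and the "pickers" output attribute are assumptions made
up for the example. Each <lot> is handled using only its own attributes
plus a keyed lookup, so no lot depends on its position or its siblings:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Index pickers by the lot they stand on, e.g. "12/29" -->
    <xsl:key name="pickers-on-lot" match="picker"
             use="concat(location/@tract-ref, '/', location/@lot-ref)"/>

    <!-- Process only the lots; document order plays no role -->
    <xsl:template match="/vineyard">
        <vineyard>
            <xsl:apply-templates select="lot"/>
        </vineyard>
    </xsl:template>

    <!-- Each lot is summarized on its own (hypothetical output format) -->
    <xsl:template match="lot">
        <lot tract-num="{@tract-num}" lot-num="{@lot-num}"
             pickers="{count(key('pickers-on-lot',
                                 concat(@tract-num, '/', @lot-num)))}"/>
    </xsl:template>

</xsl:stylesheet>
```

Because the lot template reads nothing but the lot's own attributes and
the key, the lots could in principle be handed to separate workers in
any order.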
> 
> Here are some lessons I learned.  I believe these lessons apply to all
> XML information structures where you have a requirement to evolve the
> information structure by moving the information (e.g., move the Picker
> around to different lots), changing the information values (e.g., a
> Picker harvests ripe grapes, thereby decreasing the value of
> <ripe-grapes> on a lot), and where parallel processing of the
> information is desired/needed.  I don't know if these lessons apply
> everywhere.
> 
> 1. How you structure your information in XML has a tremendous impact
> on the processing of the information.
> 
> 2. Hierarchy makes processing information hard!  There exists a
> relationship between hierarchy of information and the complexity of
> code to process the information.  The relationship is roughly: the
> greater the hierarchy, the greater the complexity of code to process
> the information.  (Some hierarchy is good, of course.  But the amount
> of hierarchy that is good is probably much less than one might
> imagine, certainly less than I thought, as described above.)
> 
> 3. Flat data is good data!  Flatten out the hierarchy of your data.
> It makes the information flexible and easier to process.
> 
> 4. Order hurts!  Requiring a strict order of the information makes for
> a brittle design.  It is only when I allowed the lots and pickers to
> occur in any order that the flexibility and simplicity kicked in.

Not to start up a permathread, but of course sometimes order is
necessary.
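
One hypothetical case, not part of Roger's demo: if a picker recorded
the moves it has made, the sequence of the entries would itself be the
information. The <history> and <move> names below are assumptions for
illustration only:

```xml
<picker id="36">
    <location tract-ref="12" lot-ref="29"/>
    <!-- Hypothetical: document order here means "first, then next" -->
    <history>
        <move tract-ref="12" lot-ref="27"/>
        <move tract-ref="12" lot-ref="28"/>
        <move tract-ref="12" lot-ref="29"/>
    </history>
</picker>
```

Shuffling the <move> elements would change the meaning, unless each one
carried an explicit step number.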

> Comments?  /Roger

-- 
Kind Regards,
Joseph Chiusano
Associate
Booz | Allen | Hamilton
