xml-dev - Re: [xml-dev] Designing XML to Support Information Evolution


To: "Roger L. Costello" <costello@mitre.org>
Subject: Re: [xml-dev] Designing XML to Support Information Evolution
From: Rick Marshall <rjm@zenucom.com>
Date: Tue, 18 May 2004 08:03:45 +1000
Cc: xml-dev@lists.xml.org
In-reply-to: <000201c43c1a$212cfff0$10395381@MITRE.ORG>
Organization: Zenucom Pty Ltd
References: <000201c43c1a$212cfff0$10395381@MITRE.ORG>
User-agent: Mozilla Thunderbird 0.6 (X11/20040502)

hi roger,

you've addressed many of the issues that have been concerning me 
regarding the use of xml in data-oriented applications.

i think you've rediscovered one of the principles that make relational 
systems work.

in a hierarchical system some of the semantics are hidden in the 
hierarchy. this makes it hard to work with. what you've effectively done 
is produce relational structures - flat ones, with primary keys. 
therefore it should work well.
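
to make the relational reading concrete, here is a minimal python sketch. it is not roger's code (he used xslt). it assumes the final flat design quoted below, treats (tract-num, lot-num) as a lot's primary key and a picker's tract-ref/lot-ref as a foreign key, and uses only the standard xml.etree.ElementTree module. the sample values, the number inside <ripe-grapes> and the helper names lot_table and picker_lot are made up for illustration.

```python
import xml.etree.ElementTree as ET

# sample data in the shape of the final flat design; the values and the
# content of <ripe-grapes> are invented for this illustration
FLAT = """
<vineyard>
    <lot tract-num="23" lot-num="5"><ripe-grapes>7</ripe-grapes></lot>
    <picker id="36">
        <location tract-ref="12" lot-ref="29"/>
    </picker>
    <lot tract-num="12" lot-num="29"><ripe-grapes>3</ripe-grapes></lot>
</vineyard>
"""

def lot_table(vineyard):
    """index lots by their composite primary key (tract-num, lot-num)."""
    return {(lot.get("tract-num"), lot.get("lot-num")): lot
            for lot in vineyard.findall("lot")}

def picker_lot(vineyard, picker_id):
    """join a picker to its lot through the foreign key held in <location>."""
    lots = lot_table(vineyard)
    for picker in vineyard.findall("picker"):
        if picker.get("id") == picker_id:
            loc = picker.find("location")
            return lots.get((loc.get("tract-ref"), loc.get("lot-ref")))
    return None  # no such picker

vineyard = ET.fromstring(FLAT)
lot = picker_lot(vineyard, "36")
print(lot.get("tract-num"), lot.get("lot-num"), lot.findtext("ripe-grapes"))
```

the lookup never walks a tract/lot hierarchy: the key is carried on each lot, so document order plays no part in the join.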

i'm trying to work out something similar that maps onto our associative 
memory models, which need the same flatness of data to work.

but it is horses for courses. the hierarchy seems to work ok for 
documents - although i reckon a similar coding to yours would make 
document manipulation easier; and it seems irrelevant when passing 
transactions between data stores because there is very little processing 
to be done - aside from breaking the transaction into bits.

great example and thanks.

rick

Roger L. Costello wrote:

> Hi Folks,
>  
> For the past 4 months I have been working on a demo of a Vineyard in 
> which Pickers move around, harvest ripe grapes, eat, and even die.  In 
> the process of building this demo I have learned some things regarding 
> XML design, which I would like to share.
>  
> When I first started working on the demo I thought that the best way 
> to design the XML for the Vineyard/Picker system was a "classical"  
> highly structured, hierarchical design.  In fact, this turned out to 
> be the worst approach.  It was rigid, it made processing the 
> information (e.g., moving the Pickers around, harvesting ripe grapes, 
> eating, death) horribly complex, and I wanted to be able to process 
> the Vineyard lots in a parallel fashion, which this design totally 
> prohibited.  Here was my first design:
>  
> <vineyard>
>     <tract num="1">
>         <lot num="1">
>             ...
>         </lot>
>         <lot num="2">
>             ...
>             <picker id="36">  <!-- Picker #36 on lot #2, tract #1 -->
>                 ...
>             </picker>
>         </lot>
>         ...
>         <lot num="50">...</lot>
>     </tract>
>     ...
>     <tract num="50">...</tract>
> </vineyard>
>  
> As you can see, this design is classical structured data:
>      - the vineyard is comprised of multiple tracts
>      - each tract is comprised of multiple lots
>      - a lot may contain a picker
>  
> Several thousand lines of XSLT code later I decided it was time to 
> dump this design.
>  
> My next design "flattened" things out a bit.  I put the Pickers 
> physically after the tracts, and each Picker referenced the tract/lot 
> that they resided upon using a couple of "ref attributes".  This made 
> "moving" the Pickers easy - simply adjust the references.  Here was my 
> second design:
>  
> <vineyard>
>     <tract num="1">...</tract>
>     <tract num="2">...</tract>
>     ...
>     <tract num="50">...</tract>
>     <picker id="1">
>         <location tract-ref="13" lot-ref="48"/>
>         ...
>     </picker>
>     ...
>     <picker id="400">
>         <location tract-ref="21" lot-ref="4"/>
>         ...
>     </picker>
> </vineyard>
>  
> With this design my XSLT code dropped from several thousand lines to 
> about a thousand lines.  However, this design was still too rigid, and 
> made parallel processing of the lots impossible.
>  
> Here is the design that I finally arrived at.  It is extremely 
> flexible, amenable to parallel processing, and the code to manipulate 
> it is very simple (a couple hundred lines of simple XSLT code).
>  
> <vineyard>
>     <lot tract-num="23" lot-num="5">...</lot>
>     <picker id="36">
>         <location tract-ref="12" lot-ref="29"/>
>          ...
>     </picker>
>     <lot tract-num="3" lot-num="24">...</lot>
>     ...
>     <lot tract-num="1" lot-num="49">...</lot>
> </vineyard>
>  
> The lots have 2 attributes to identify their location.
> Each picker has a location element that has 2 attributes to identify 
> the lot it resides on.
>  
> Notice that it is an extremely flat structure:
>     - a vineyard is comprised of lots and pickers  (no more <tract> 
> elements)
>  
> Notice that it is an extremely flexible structure:
>     - the order of the lots and pickers is irrelevant
>  
> With this design I can now process each lot on the vineyard in 
> parallel.  The other designs forced a sequential processing.
>  
> Here are some lessons I learned.  I believe these lessons apply to all 
> XML information structures where you have a requirement to evolve the 
> information structure by moving the information (e.g., move the Picker 
> around to different lots), changing the information values (e.g., a 
> Picker harvests ripe grapes, thereby decreasing the value of 
> <ripe-grapes> on a lot), and where parallel processing of the 
> information is desired/needed.  I don't know if these lessons apply 
> everywhere.
>  
> 1. How you structure your information in XML has a tremendous impact 
> on the processing of the information.
>  
> 2. Hierarchy makes processing information hard!  There exists a 
> relationship between hierarchy of information and the complexity of 
> code to process the information.  The relationship is roughly: the 
> greater the hierarchy, the greater the complexity of code to process 
> the information.  (Some hierarchy is good, of course.  But the amount 
> of hierarchy that is good is probably much less than one might 
> imagine, certainly less than I thought, as described above.)
>  
> 3. Flat data is good data!  Flatten out the hierarchy of your data.  
> It makes the information flexible and easier to process.
>  
> 4. Order hurts!  Requiring a strict order of the information makes for 
> a brittle design.  It is only when I allowed the lots and pickers to 
> occur in any order that the flexibility and simplicity kicked in.
>  
> Comments?  /Roger
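
roger's two processing claims, that moving a picker is just a change of references and that lots can be handled independently of their order, can be sketched in a few lines of python. this is an illustration rather than his xslt: the sample data, the integer content of <ripe-grapes> and the names move_picker and ripe_count are assumptions, and the thread pool only stands in for whatever parallel mechanism he had in mind.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# invented sample in the shape of the final flat design
FLAT = """
<vineyard>
    <lot tract-num="23" lot-num="5"><ripe-grapes>7</ripe-grapes></lot>
    <picker id="36">
        <location tract-ref="12" lot-ref="29"/>
    </picker>
    <lot tract-num="12" lot-num="29"><ripe-grapes>3</ripe-grapes></lot>
</vineyard>
"""

def move_picker(vineyard, picker_id, tract, lot):
    """move a picker by rewriting its two ref attributes; nothing is re-parented."""
    for picker in vineyard.findall("picker"):
        if picker.get("id") == picker_id:
            loc = picker.find("location")
            loc.set("tract-ref", str(tract))
            loc.set("lot-ref", str(lot))
            return True
    return False  # no such picker

def ripe_count(lot):
    """read one lot in isolation: returns ((tract, lot), ripe grapes)."""
    key = (lot.get("tract-num"), lot.get("lot-num"))
    return key, int(lot.findtext("ripe-grapes", default="0"))

vineyard = ET.fromstring(FLAT)
move_picker(vineyard, "36", 23, 5)

# each lot carries its own key, so lots can go to workers in any order
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = dict(pool.map(ripe_count, vineyard.findall("lot")))

print(vineyard.find("picker/location").attrib)
print(counts)
```

in the nested first design the same move would mean detaching the <picker> element from one <lot> and re-attaching it under another, which is where the extra code comes from.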

