Welcome back Roger,
Ah, a posting about vineyards...just in time for my vacation in Naples,
Italy next week. ;)
This makes a lot of sense to me - please see comments below (far down).
Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton
Strategy and Technology Consultants to the World
> "Roger L. Costello" wrote:
>
> Hi Folks,
>
> For the past 4 months I have been working on a demo of a Vineyard in
> which Pickers move around, harvest ripe grapes, eat, and even die. In
> the process of building this demo I have learned some things regarding
> XML design, which I would like to share.
>
> When I first started working on the demo I thought that the best way
> to design the XML for the Vineyard/Picker system was a "classical"
> highly structured, hierarchical design. In fact, this turned out to
> be the worst approach. It was rigid, it made processing the
> information (e.g., moving the Pickers around, harvesting ripe grapes,
> eating, death) horribly complex, and I wanted to be able to process
> the Vineyard lots in a parallel fashion, which this design totally
> prohibited. Here was my first design:
>
> <vineyard>
> <tract num="1">
> <lot num="1">
> ...
> </lot>
> <lot num="2">
> ...
> <picker id="36"> <!-- Picker #36 on lot #2, tract #1 -->
> ...
> </picker>
> </lot>
> ...
> <lot num="50">...</lot>
> </tract>
> ...
> <tract num="50">...</tract>
> </vineyard>
>
> As you can see, this design is classical structured data:
> - the vineyard is composed of multiple tracts
> - each tract is composed of multiple lots
> - a lot may contain a picker
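To make the cost of this nesting concrete, here is a minimal sketch in Python (not the XSLT used in the demo) of what "moving" a Picker means when the Picker is a child of its lot. The miniature document and the helper names find_lot and move_picker are assumptions made up for illustration; the real lot and picker contents are elided in the original.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the first (nested) design.
VINEYARD = """
<vineyard>
  <tract num="1">
    <lot num="1"/>
    <lot num="2"><picker id="36"/></lot>
  </tract>
  <tract num="2">
    <lot num="1"/>
  </tract>
</vineyard>
"""

def find_lot(root, tract, lot):
    """Return the <lot> element at the given tract/lot numbers, or None."""
    return root.find(f"tract[@num='{tract}']/lot[@num='{lot}']")

def move_picker(root, picker_id, tract, lot):
    """Detach the picker from whichever lot holds it, then attach it to the target lot."""
    target = find_lot(root, tract, lot)
    if target is None:
        raise KeyError(f"no lot {lot} on tract {tract}")
    # The picker's position is implicit in the tree, so we must search for it.
    for source in root.iter("lot"):
        picker = source.find(f"picker[@id='{picker_id}']")
        if picker is not None:
            source.remove(picker)
            target.append(picker)
            return target
    raise KeyError(f"no picker with id {picker_id}")

root = ET.fromstring(VINEYARD)
move_picker(root, 36, 2, 1)
```

A move here touches two lots, possibly in two different tracts, so the lots cannot be treated as independent units of work.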
>
> Several thousand lines of XSLT code later I decided it was time to
> dump this design.
>
> My next design "flattened" things out a bit. I put the Pickers
> physically after the tracts, and each Picker referenced the tract/lot
> that they resided upon using a couple of "ref attributes". This made
> "moving" the Pickers easy - simply adjust the references. Here was my
> second design:
>
> <vineyard>
> <tract num="1">...</tract>
> <tract num="2">...</tract>
> ...
> <tract num="50">...</tract>
> <picker id="1">
> <location tract-ref="13" lot-ref="48"/>
> ...
> </picker>
> ...
> <picker id="400">
> <location tract-ref="21" lot-ref="4"/>
> ...
> </picker>
> </vineyard>
>
> With this design my XSLT code dropped from several thousand lines to
> about a thousand lines. However, this design was still too rigid, and
> made parallel processing of the lots impossible.
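For comparison, a rough Python sketch of the same move under the second design: only the two ref attributes change and no element is relocated. The sample document and the move_picker helper are hypothetical; the attribute names tract-ref and lot-ref are taken from the design above.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the second design: pickers follow the tracts
# and point at their lot through tract-ref / lot-ref attributes.
VINEYARD = """
<vineyard>
  <tract num="13"><lot num="48"/><lot num="49"/></tract>
  <picker id="1">
    <location tract-ref="13" lot-ref="48"/>
  </picker>
</vineyard>
"""

def move_picker(root, picker_id, tract, lot):
    """Relocate a picker by rewriting its two reference attributes."""
    location = root.find(f"picker[@id='{picker_id}']/location")
    if location is None:
        raise KeyError(f"no picker with id {picker_id}")
    location.set("tract-ref", str(tract))
    location.set("lot-ref", str(lot))
    return location

root = ET.fromstring(VINEYARD)
moved = move_picker(root, 1, 13, 49)
```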
>
> Here is the design that I finally arrived at. It is extremely
> flexible, amenable to parallel processing, and the code to manipulate
> it is very simple (a couple hundred lines of simple XSLT code).
>
> <vineyard>
> <lot tract-num="23" lot-num="5">...</lot>
> <picker id="36">
> <location tract-ref="12" lot-ref="29"/>
> ...
> </picker>
> <lot tract-num="3" lot-num="24">...</lot>
> ...
> <lot tract-num="1" lot-num="49">...</lot>
> </vineyard>
>
> The lots have 2 attributes to identify their location.
> Each picker has a location element that has 2 attributes to identify
> the lot it resides on.
>
> Notice that it is an extremely flat structure:
> - a vineyard is composed of lots and pickers (no more <tract>
> elements)
Of course, this works in this case because the information for tracts is
nothing more than a number. If there were additional properties for
tracts, then you would need a separate set of <tract> elements with
subelements representing their properties.
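One way this could look, sketched in Python: the flat structure is kept, <tract> elements sit alongside the lots, and each lot is joined to its tract by number. The <soil> property and the soil_for helper are invented for illustration; nothing in the original says what properties a tract might carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat document in which tracts carry a property of their own.
VINEYARD = """
<vineyard>
  <tract num="23"><soil>clay</soil></tract>
  <lot tract-num="23" lot-num="5"/>
  <tract num="3"><soil>loam</soil></tract>
  <lot tract-num="3" lot-num="24"/>
</vineyard>
"""

root = ET.fromstring(VINEYARD)

# Index the tracts once so every lot lookup is a dictionary access.
tracts = {tract.get("num"): tract for tract in root.findall("tract")}

def soil_for(lot):
    """Look up the (hypothetical) soil property of the tract a lot belongs to."""
    return tracts[lot.get("tract-num")].findtext("soil")

soils = {
    (lot.get("tract-num"), lot.get("lot-num")): soil_for(lot)
    for lot in root.findall("lot")
}
```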
> Notice that it is an extremely flexible structure:
> - the order of the lots and pickers is irrelevant
>
> With this design I can now process each lot on the vineyard in
> parallel. The other designs forced a sequential processing.
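A small Python sketch of why the flat design lends itself to this: each lot is a self-contained element, so a worker can be handed one lot plus the ids of the pickers standing on it. The numeric <ripe-grapes> content and the rule "each picker takes one grape" are assumptions for illustration only. Note that Python threads show the independence of the work rather than a real speed-up.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical miniature of the final design; lots and pickers in any order.
VINEYARD = """
<vineyard>
  <lot tract-num="23" lot-num="5"><ripe-grapes>4</ripe-grapes></lot>
  <picker id="36"><location tract-ref="23" lot-ref="5"/></picker>
  <lot tract-num="3" lot-num="24"><ripe-grapes>7</ripe-grapes></lot>
  <picker id="7"><location tract-ref="1" lot-ref="49"/></picker>
  <lot tract-num="1" lot-num="49"><ripe-grapes>0</ripe-grapes></lot>
</vineyard>
"""

def index_pickers(root):
    """Map (tract, lot) -> list of picker ids, wherever the pickers appear."""
    index = {}
    for picker in root.findall("picker"):
        loc = picker.find("location")
        key = (loc.get("tract-ref"), loc.get("lot-ref"))
        index.setdefault(key, []).append(picker.get("id"))
    return index

def harvest(lot, pickers_here):
    """Process one lot in isolation: each picker present takes one ripe grape."""
    grapes = lot.find("ripe-grapes")
    remaining = max(int(grapes.text) - len(pickers_here), 0)
    grapes.text = str(remaining)
    return (lot.get("tract-num"), lot.get("lot-num")), remaining

root = ET.fromstring(VINEYARD)
pickers = index_pickers(root)

def process(lot):
    """Worker: touches only its own <lot> element and the read-only index."""
    key = (lot.get("tract-num"), lot.get("lot-num"))
    return harvest(lot, pickers.get(key, []))

with ThreadPoolExecutor() as pool:
    results = dict(pool.map(process, root.findall("lot")))
```

Because no worker writes outside its own lot, the lots can be handed out in any order, which matches the point about order being irrelevant.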
>
> Here are some lessons I learned. I believe these lessons apply to all
> XML information structures where you have a requirement to evolve the
> information structure by moving the information (e.g., move the Picker
> around to different lots), changing the information values (e.g., a
> Picker harvests ripe grapes, thereby decreasing the value of
> <ripe-grapes> on a lot), and where parallel processing of the
> information is desired/needed. I don't know if these lessons apply
> everywhere.
>
> 1. How you structure your information in XML has a tremendous impact
> on the processing of the information.
>
> 2. Hierarchy makes processing information hard! There exists a
> relationship between hierarchy of information and the complexity of
> code to process the information. The relationship is roughly: the
> greater the hierarchy, the greater the complexity of code to process
> the information. (Some hierarchy is good, of course. But the amount
> of hierarchy that is good is probably much less than one might
> imagine, certainly less than I thought, as described above.)
>
> 3. Flat data is good data! Flatten out the hierarchy of your data.
> It makes the information flexible and easier to process.
>
> 4. Order hurts! Requiring a strict order of the information makes for
> a brittle design. It is only when I allowed the lots and pickers to
> occur in any order that the flexibility and simplicity kicked in.
Not to start up a permathread, but of course sometimes order is
necessary.
> Comments? /Roger
--
Kind Regards,
Joseph Chiusano
Associate
Booz | Allen | Hamilton