Re: [xml-dev] RE: XML data interchange format: Flatter is better


From: Jens Østergaard Petersen <oesterg@gmail.com>
To: xml-dev@lists.xml.org, "Costello, Roger L." <costello@mitre.org>
Date: Mon, 3 Nov 2014 16:35:31 +0100

Hi,

With XML, you can be both flat and fat, if you flatten your document in document order and keep track of the depth of each element. If you look at <https://stackoverflow.com/questions/20729593/transforming-tree-to-sequence-of-elements> and <https://stackoverflow.com/questions/21527660/transforming-sequence-of-elements-to-tree>, you will get the idea.
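
To illustrate the idea, here is a minimal Python sketch (not the XSLT/XQuery from the linked answers) that flattens a tree in document order while recording each element's depth, and then rebuilds the tree from that sequence. The names flatten, unflatten, the <flat> wrapper, and the depth attribute are assumptions made for this example; it also assumes the source has no attribute already called depth and ignores mixed content.

```python
import xml.etree.ElementTree as ET

def flatten(root):
    """List every element in document order under <flat>, each with a depth attribute."""
    flat = ET.Element("flat")  # hypothetical wrapper element

    def walk(node, depth):
        item = ET.SubElement(flat, node.tag, dict(node.attrib))
        item.set("depth", str(depth))  # assumes no existing "depth" attribute
        text = (node.text or "").strip()
        if text:
            item.text = text
        for child in node:
            walk(child, depth + 1)

    walk(root, 1)
    return flat

def unflatten(flat):
    """Rebuild the tree: each item becomes a child of the nearest earlier item one level up."""
    stack = []  # stack[i] holds the currently open element at depth i + 1
    root = None
    for item in flat:
        depth = int(item.get("depth"))
        attrib = {k: v for k, v in item.attrib.items() if k != "depth"}
        node = ET.Element(item.tag, attrib)
        node.text = item.text
        del stack[depth - 1:]  # close elements at this depth or deeper
        if stack:
            stack[-1].append(node)
        else:
            root = node
        stack.append(node)
    return root
```

Because the depth is recorded, no information about nesting is lost: the flat sequence and the tree can be converted into each other.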

There are things you can do with your document in its flat form that are hard to do with it in its fat form. I have used this in a script I have played around with recently, in which I gather the unique paths of a document in a flat document and then use these paths to construct a minimal fat instance of all unique paths, <https://github.com/jensopetersen/tei-compactor>.
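
As a rough illustration of that approach (this is not the code in the tei-compactor repository, only a sketch of the general idea), the Python below collects each distinct element path once and then builds a tree in which every path occurs exactly once. The function names unique_paths and minimal_instance are invented for this example, and attributes and text are deliberately ignored.

```python
import xml.etree.ElementTree as ET

def unique_paths(root):
    """Return each distinct tag path once, in order of first appearance."""
    seen = {}  # dict keeps insertion order and gives fast membership tests

    def walk(node, prefix):
        path = prefix + (node.tag,)
        seen.setdefault(path, None)
        for child in node:
            walk(child, path)

    walk(root, ())
    return list(seen)

def minimal_instance(paths):
    """Build a tree that contains every unique path exactly once."""
    nodes = {}
    for path in paths:
        node = ET.Element(path[-1])
        nodes[path] = node
        if len(path) > 1:
            # A parent path is always recorded before its children.
            nodes[path[:-1]].append(node)
    return nodes[paths[0]]
```

Collecting the paths is easy on the flat side (a set of tuples), while the result is easiest to read once it is nested again.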

Jens

On 29 Oct 2014 at 18:08:33, Costello, Roger L. (costello@mitre.org) wrote:

Thanks for the great feedback!

I received feedback from a couple of colleagues. Below is my response to them. /Roger

Scott wrote:

Dr. Scott Law of Data #3:

Hierarchical data structures are great, if it's your hierarchy.

Otherwise they suck.

I like it!

how do I know that <metabolism> is related to <picker> instead of <Vineyard>?

Great question! I am not sure. Perhaps that knowledge should be made available out-of-band (not hardcoded into the XML)?

David wrote:

the real issue is, what’s the data model separate from XML?

My thinking is that with most data-modeling problems many different models could be created. That is, there is no one, true model.

For example, for my grape vineyard, one expert modeler argues that the correct model is this:

                A Lot may have zero or more Pickers on it.

This XML is a perfect representation of that model:

<Vineyard>
    <Lot id="1">
        <ripe-grapes>4</ripe-grapes>
        <Picker id="John">
            <metabolism>2</metabolism>
            <grape-wealth>20</grape-wealth>
        </Picker>
    </Lot>
    <Lot id="2">
        <ripe-grapes>3</ripe-grapes>
    </Lot>
    ...
</Vineyard>

Another expert modeler argues that the correct model is this:

                There are Lots. There are Pickers. A Picker may be located on one and only one Lot.

This XML is a perfect representation of that model:

<Vineyard>
    <Lot id="1">
        <ripe-grapes>4</ripe-grapes>
    </Lot>
    <Lot id="2">
        <ripe-grapes>3</ripe-grapes>
    </Lot>
    <Picker id="John" locatedOn="1">
        <metabolism>2</metabolism>
        <grape-wealth>20</grape-wealth>
    </Picker>
    ...
</Vineyard>

So which expert modeler has the correct view of the world? And thus, which XML representation is correct?

It seems to me that both models/representations are well-suited to some consumers and horrible to others.

So I am proposing this: Be model-agnostic in the XML representation. That is, give consumers a flat XML and let them parse it to represent whatever model is best-suited to them.
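
As a sketch of what that could look like on the consumer side, the Python below assumes the flat XML resembles the second representation above (Lot and Picker as siblings, linked by locatedOn) and rebuilds the first modeler's nested view from it. The function name nest_pickers is made up for this example, and the code assumes every Picker carries a locatedOn attribute that matches an existing Lot id.

```python
import copy
import xml.etree.ElementTree as ET

def nest_pickers(vineyard):
    """Return a new tree in which each Picker is nested inside the Lot it is located on."""
    result = ET.Element(vineyard.tag, dict(vineyard.attrib))
    lots = {}
    for lot in vineyard.findall("Lot"):
        clone = copy.deepcopy(lot)  # leave the flat input untouched
        lots[clone.get("id")] = clone
        result.append(clone)
    for picker in vineyard.findall("Picker"):
        clone = copy.deepcopy(picker)
        # Raises KeyError if locatedOn is missing or names an unknown Lot.
        lot_id = clone.attrib.pop("locatedOn")
        lots[lot_id].append(clone)
    return result
```

A consumer that prefers the sibling model simply skips this step and uses the flat document as delivered.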

Thank you!

/Roger
