Re: [xml-dev] XML Schema as a data modeling tool

Michael, you sound as if hierarchy is mainly the result of applying this or that view to a network, the mere results of transformations, without substance of its own. I disagree.

What your arguments rightly underline is that a system cannot (usually) be modeled as a *single* tree. Evidently, there are associations whose representation as containment would be arbitrary (because of a peer relationship) or absurd (because of an essentially referential nature, like the customer "within" a booking). But recognizing hierarchy as a key principle of information modelling does not imply an ambition to construct a single tree - it implies a forest of related trees (which in simple cases may consist of a single tree). You point to the fact that the concept "booking" could as well be the child of the concept "customer" as the child of the concept "accommodation". This does not show that hierarchy is artificial, only that these participants cannot be coerced into a single durable hierarchy. It does not prove that the participants cannot themselves be successfully modelled as hierarchically organized entities. Divide and conquer. Bookings, customers and accommodations are certainly candidates for hierarchic sub-models. In my practical experience - which currently happens to revolve around bookings, customers and accommodations - this works very well. (As mentioned before, one hundred relational tables were gracefully integrated into a single tree which can be read through from the root to the tips.)

I think that hierarchy is essential to information modelling - not a single, global hierarchy, but local hierarchy, scoped to single entity types, like a booking. So the point is the scoping of hierarchy, the discovery of boundaries, the relationship between hierarchy and granularity. I am afraid that we profoundly disagree, as my position is this:

(a) hierarchy is essential, rather than a representational convenience (a "customer" and his "address" floating side by side lack something compared with a customer having an address)
(b) successful incorporation of hierarchy into the model hinges on the right granularity (your examples are examples of choosing the wrong granularity)
(c) while any hierarchic view can be inverted by transformation, it often makes sense to select one alternative of hierarchic organisation as "canonical" (natural and common) and incorporate it into the model (e.g. let a customer have an address, rather than letting an address "have" a customer)
(d) "tools" are less important than concepts; the role of tree structure in information modelling should not depend on the merits of XSD
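
To make points (a) to (c) concrete, here is a minimal sketch in Python of what a "forest of related trees" might look like. Everything in it is an assumption made for illustration: the identifiers, the field names (`customer_ref`, `accommodation_ref`) and the sample data are hypothetical and not taken from any real system. Each entity type is a small tree of its own, associations that are referential in nature are kept as references, and the canonical customer-has-address tree can still be inverted by a transformation.

```python
# Hypothetical forest: each entity type is its own small, local tree.
customers = {
    "c1": {"name": "Alice", "address": {"city": "Berlin", "street": "Hauptstr. 1"}},
}
accommodations = {
    "a1": {"name": "Seaside Hotel", "rooms": [{"number": "101"}, {"number": "102"}]},
}
bookings = {
    "b1": {
        "customer_ref": "c1",       # reference, not containment
        "accommodation_ref": "a1",  # reference, not containment
        "stay": {"arrival": "2013-09-30", "nights": 2},
    },
}


def resolve(booking_id):
    """Follow the references of one booking and return a combined view."""
    booking = bookings[booking_id]
    return {
        "stay": booking["stay"],
        "customer": customers[booking["customer_ref"]],
        "accommodation": accommodations[booking["accommodation_ref"]],
    }


def invert_address(customer_id):
    """Turn the canonical customer-has-address tree into address-has-customer."""
    customer = customers[customer_id]
    return {**customer["address"], "customer": {"name": customer["name"]}}
```

Calling `resolve("b1")` assembles a view across the three trees without making any of them the owner of the others, while `invert_address("c1")` shows that the inverted form is derivable even though the model stores only the canonical one.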

Hans-Juergen

From: Michael Kay <mike@saxonica.com>
To: Hans-Juergen Rennau <hrennau@yahoo.de>
CC: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Sent: 14:13 Monday, 30 September 2013
Subject: Re: [xml-dev] XML Schema as a data modeling tool

On 30 Sep 2013, at 12:48, Hans-Juergen Rennau wrote:

> Thank you, Michael. You wrote "It [XSD] is a hierarchic model, whereas the real world is a network." I would say, it is as much a network as it is hierarchic. Think of economic structures (e.g. a shop inventory), of administrative structures (a registration procedure), of biological structures (a cell). At any rate the structures I have been dealing with were usually hierarchic: unwieldy and confusing if not dealt with as such, and often straightforward to handle otherwise. I could show you an ER diagram representing over 100 relational tables storing shopping cart data, and also a single tree representation which can be read like a newspaper. A concise tree representation can be read like a text, conveying a sense of the whole. An ER diagram with many boxes and very many lines is very hard to read. Doubtless you are right in warning about the problems of modelling relationships which do not correspond to containment. But I wonder - would you really suggest giving up the benefits of hierarchical modelling, and what is the alternative? You know the German saying, "Not to see the wood because of all those trees", which I suggest inverting: not to see the trees because they are part of a wood.

I agree, data modelling can easily produce a diagram so complex it fails in its purpose of communicating useful information. The answer to that is to leave out irrelevant detail; at a certain level of understanding, you are only interested in the fact that the system knows about employees, pensioners, and contractors, and not about the detailed information held for each employee &c.

And it's exactly at this high level that you tend to be dealing with a network. Customers book rooms in hotels. If you're interested in the customer's itinerary, then you see a hierarchy of customers and bookings; if you're interested in room availability in the hotel, then you see a hierarchy of hotels and bookings. Two hierarchic views of the same network model.
>
> And then I am unsure how far your argument applies to XSD specifically, or to XML in general, that is, to models based on node trees and constraints on those trees.

I think it applies to all attempts to do enterprise data modelling with tools that were designed for document modelling. It also gets down to the level of detailed database design; it's not at all obvious what the best design is for the customer/booking/hotel model in an XML database. What XML document should a booking go in? It might be best to organise them not by customer, not by hotel, but rather by day/month/year: a third hierarchic view.
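
As a rough illustration of the views discussed in this thread (by customer, by hotel, by date), the following Python sketch derives each of them from the same flat set of bookings. The data, the field names and the helper `hierarchic_view` are all hypothetical, chosen only to show that the choice of parent is a parameter of the view and not a property of the bookings themselves.

```python
from collections import defaultdict

# Hypothetical network data: each booking links a customer to a hotel on a date.
BOOKINGS = [
    {"id": "b1", "customer": "alice", "hotel": "ritz", "date": "2013-09-30"},
    {"id": "b2", "customer": "alice", "hotel": "plaza", "date": "2013-10-02"},
    {"id": "b3", "customer": "bob", "hotel": "ritz", "date": "2013-09-30"},
]


def hierarchic_view(bookings, parent_key):
    """Group booking ids under the chosen parent: one small tree per parent value."""
    view = defaultdict(list)
    for booking in bookings:
        view[booking[parent_key]].append(booking["id"])
    return dict(view)


by_customer = hierarchic_view(BOOKINGS, "customer")  # itinerary view
by_hotel = hierarchic_view(BOOKINGS, "hotel")        # availability view
by_date = hierarchic_view(BOOKINGS, "date")          # calendar view
```

None of the three results is privileged by the data; deciding which one (if any) to store as the physical document structure is exactly the open design question.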

Michael Kay
Saxonica