On Thursday 12 December 2002 7:38 pm, Roger L. Costello wrote:
> (2) Data models of the raw XML string. There are many ways to model the
> XML. Some sample data models are:
> An XML Schema Data Model of the aircraft:
> - the XML Schema data model
> . declares that the aircraft element is comprised of an altitude
> element.
> . The altitude element is comprised of an integer that is
> restricted to the range 0-20000.
> An RDF Schema Data Model of the aircraft:
> - the RDF Schema data model
> . aliases aircraft and plane,
> . states that aircraft is a subclass of "FlyingMachine", and
> . constrains altitude to an integer that is restricted to the range
I may just be arguing terminology, but I would say that the data model here is:
AN AIRCRAFT HAS AN ALTITUDE
...and the RDF/XSD are just ways of writing that rather than different
models, while the XML is a bit of data that happens to fit that model.
> Another Example: Consider a library. Each book in the library represents
> "data". The card catalogue "models" all the data (books) in the library.
> The card catalogue provides data about each book in the library. Thus, the
> card catalogue "data model" provides metadata.
I don't think that's a good example... the card catalogue is just an index.
The data model is something along the lines of "Books may have titles, Dewey
decimal numbers, and ISBNs" - that being a loose data model since some books
don't have these attributes and there are many other attributes a book may
have. The card catalogue may index this information, and in that sense the
design of the catalogue depends on the data model of the book, but that's
nothing to do with the data model per se. But again I may just be arguing
fine points of terminology here, I'm still unsure as to whether we are
discussing the same thing or not...
More formally specified, I would say that my data model for books would be
that they can be modelled as a series of key:value pairs. Some standard keys
that may be present, with the required constraints upon their values, are:
Title - Unicode string
Cover - bitmapped image
Back - bitmapped image
Spine - bitmapped image
Content - sequence of bitmapped images (one per page)
Authors - sequence of Unicode strings
...publisher, copyright details, etc...
...physical construction method...
...Dates of publication, date this particular book was printed...
...catalogue of physical blemishes to this copy...
ISBN - sequence of digits (with length constraint)
DeweyNumber - sequence of natural numbers separated by '.'
That needs some normalisation, there may be more than one copy of the one
Anyway. That's what I would call a data model. I think a data model is an
abstract concept which can be realised in lots of different ways. DOM is a
data model, as is an Infoset or a PSVI; they are data models of trees, and as
such a bit more abstract.
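
As a rough illustration, the book data model listed above could be written down as named keys with a constraint on each value. This is only a sketch under assumptions: the function names and BOOK_CONSTRAINTS are hypothetical, the image-valued keys are left out, and the ISBN lengths of 10 or 13 digits are my guess at the "length constraint".

```python
# Hypothetical sketch of the book data model: named keys, each with a
# constraint on its value. Image-valued keys (Cover, Back, Spine, Content)
# are omitted for brevity.

def is_unicode_string(value):
    return isinstance(value, str)

def is_author_list(value):
    # "sequence of Unicode strings"
    return isinstance(value, list) and all(isinstance(a, str) for a in value)

def is_isbn(value):
    # "sequence of digits (with length constraint)"; 10 or 13 is an assumption
    return (isinstance(value, str) and value.isascii()
            and value.isdigit() and len(value) in (10, 13))

def is_dewey(value):
    # "sequence of natural numbers separated by '.'"
    parts = value.split(".") if isinstance(value, str) else []
    return bool(parts) and all(p.isascii() and p.isdigit() for p in parts)

BOOK_CONSTRAINTS = {
    "Title": is_unicode_string,
    "Authors": is_author_list,
    "ISBN": is_isbn,
    "DeweyNumber": is_dewey,
}

def violations(book):
    """Return the known keys whose values break their constraint.

    Every key is optional and unknown keys are allowed, because the
    model is deliberately loose.
    """
    return [key for key, check in BOOK_CONSTRAINTS.items()
            if key in book and not check(book[key])]
```

Calling violations() on a dict with a well-formed Title, Authors, ISBN and DeweyNumber returns an empty list; extra keys are ignored rather than rejected.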
> b. Data models may be usefully utilized by applications.
> Example. A Purchase Order schema (data model) specifies the valid format
> for a PO. An application may use the Purchase Order schema to validate a
> PO XML string.
Yep; and display it neatly, know what values are equivalent, know how to
normalise values, etc.
> Example of an XML string that is explicitly bound to a data model:
> <?xml version="1.0" encoding="UTF-8"?>
> <aircraft xmlns="http://www.FAA.org"
> Here the XML string has been explicitly bound to an XML Schema data model -
Typing is just part of data modelling... in my example data model for books
above, I defined types for various attributes, but an important thing is that
the attributes are human readable names; the ISBN of a book is a numeric
string, sure, but the computer does not know that it's incorrect if I put the
wrong ISBN in - it will match the types, but it does violate my data model if
we use the ISBN field to count the number of times the book has been taken
out of the library. I would not define this as a data model:
"More formally specified, I would say that my data model for books would be
that they can be modelled as a series of key:value pairs, where the keys are
integers. Some standard keys that may be present, with the required
constraints upon their values, are:
0 - Unicode string
1 - bitmapped image
2 - bitmapped image
3 - bitmapped image"
...I'd call that a set of type rules. Now, to the computer, the XSD is just
typing rules - it's the comments in the XSD (or, more likely, in a document
that is not referenced by the XSD but is read by developers before they start
using the XSD) and the meaningful element/attribute names that make the data
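
To make the ISBN point concrete, here is a small sketch of a type rule accepting a value that the data model would reject. The function name, the 10-digit length and both sample values are made up for illustration.

```python
# Hypothetical illustration: a type rule cannot tell an ISBN from a loan counter.

def isbn_type_ok(value):
    # Type rule only: a string of digits (a length of 10 is an assumption)
    return isinstance(value, str) and value.isdigit() and len(value) == 10

isbn_like = "0123456789"            # made-up value, not a real ISBN
loan_count = str(42).zfill(10)      # "0000000042": times the book was borrowed

# Both values pass the type rule. Only the human-readable key name "ISBN",
# and the documentation behind it, says that the second one is misuse.
both_pass = isbn_type_ok(isbn_like) and isbn_type_ok(loan_count)
```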
> e. Over time the data model may change.
> Example. Consider the library example above. Over time the card catalogue
> (i.e., the data model) may change:
> - it may be changed from paper form to electronic form, or
> - the data provided for each book may change. For example, we may
> change the filing system from Dewey Decimal to Library of Congress
> LESSON LEARNED: data models are "disposable".
See, now there I disagree. For a start the catalogue isn't a data model, it
just depends on it to some degree; my book data model above could be extended
in a later revision to add LoC codes, but the books would still have Dewey
codes even if they didn't get put in the catalogue.
> f. Note that when a data model is "applied" to the parse tree from (1) then
> the parse tree is "decorated" with things such as:
> - the aircraft node is decorated with an alias "plane"
> - the aircraft node is decorated with a "subclass of FlyingMachine"
> - the altitude node is decorated with datatype="integer" restriction (0,
I disagree. I think that if a system is given a mapping from some XML
vocabulary to a data model and a mapping from a Java class to that data
model, then it can map from instances of that XML to instances of that class.
Mappings for my book data model might be that we have a <book> element
containing <attribute name="...key...">...value...</attribute> elements, and
my Java class being java.util.Map. Note that there are other vocabularies in XML
and classes in Java that fulfill the same data model in different ways -
that's an important difference between my notion and your notion, I think.
In my notion the data model outlives the XML since that book information can
be mapped into other forms without losing the data model. Indeed, in the
aircraft model, the data model is implemented by an actual aircraft - you can
measure its distance from the ground - as well as by the XML document so the
data model definitely lives beyond the XML.
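
A minimal sketch of the mapping idea, in Python rather than Java, with a plain dict standing in for java.util.Map. The <book>/<attribute name="..."> vocabulary is the one described above; the second vocabulary, the function names and the sample values are hypothetical, and only single-valued keys are handled.

```python
import xml.etree.ElementTree as ET

def from_attribute_vocab(xml_text):
    """Map <book><attribute name="K">V</attribute>...</book> to a dict."""
    root = ET.fromstring(xml_text)
    return {a.get("name"): (a.text or "") for a in root.findall("attribute")}

def from_element_vocab(xml_text):
    """Map a second, hypothetical vocabulary: <book><K>V</K>...</book>."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}

doc_a = ('<book>'
         '<attribute name="Title">Example</attribute>'
         '<attribute name="DeweyNumber">005.133</attribute>'
         '</book>')
doc_b = '<book><Title>Example</Title><DeweyNumber>005.133</DeweyNumber></book>'

# Two different XML vocabularies, one instance of the data model.
same_book = from_attribute_vocab(doc_a) == from_element_vocab(doc_b)
```

Both documents map to the same key:value pairs, which is the sense in which the data model is independent of either XML vocabulary.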
> (3) Applications process the XML. Applications should be free to use just
> the undecorated parse tree (that is, the parse tree resulting from parsing
> the XML string that has no association with a data model). Several examples
> were presented to demonstrate the value of an application utilizing the
> undecorated parse tree. Alternatively, applications may choose to use the
> decorated parse tree (that is, the parse tree that results from "applying"
> a data model).
XSLT works on the XML rather than the data model; it's a tool that is
specific to one syntax - XML - but independent of application data models; it
just works with a low-level DOM/Infoset-like tree data model.
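
As a sketch of what "works on the tree, not the application data model" means, here is a generic tool (in Python, not XSLT) that runs unchanged over any XML document. The function name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

def outline(xml_text):
    """Return (depth, tag) pairs for every element in document order,
    knowing nothing about what the elements mean."""
    def walk(elem, depth):
        yield depth, elem.tag
        for child in elem:
            yield from walk(child, depth + 1)
    return list(walk(ET.fromstring(xml_text), 0))
```

It treats an aircraft document and a book document identically, because all it sees is elements and their nesting.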
An aircraft's user interface works on the data model, but that data model may
be represented by a real aircraft or any of a number of data structures
inside a flight simulator.
The real world, and human thought, is about the data models; software tools
are about data formats.
LESSON LEARNED: Data formats should be disposable, while data models live on,
so use the ASN.1 model :-)
Oh, pilot of the storm who leaves no trace, Like thoughts inside a dream
Heed the path that led me to that place, Yellow desert screen