Re: [xml-dev] Caught napping!

Ronald Bourret wrote:

> Dan Weinreb wrote:
> >    From: Ronald Bourret <rpbourret@rpbourret.com>
> >
> >    1) The XML specification defines a serialization syntax. It does not
> >    define what that syntax models. Thus, in a strict sense, my comment is
> >    correct. However, depending on your view of the world, this may or may
> >    not apply.
> >
> > I think that's true of the "XML spec" proper, but wouldn't you say
> > that the "XML Info Set spec" does talk about what the syntax models,
> > at least to some extent?
> It does. So do SAX, DOM, XPath, XQuery... That's why I had point 2:
> > 2) For many (most?) people, you are correct -- the native XML
> > database models the data in the documents. In this case, the model
> > of the document is a proxy for the objects used to model the data,
> > just like tables and columns are a proxy for the objects in the
> > relational model.
> -- Ron

Ah, but the vast difference is that preserving the model of those tables and
columns requires a very different 'serialization syntax' than XML. CSV, for
example, provides nothing like XML serialization, for the same fundamental
reasons that CSV is not XML:  quite simply, CSV serialization is opaque to any
understanding not predicated on the particular table and column model which it
serializes. It is, in fact, a far closer analogy to say that XML is a
serialization syntax in the same sense that the Shakespearean sonnet form is,
than it is to imply that XML text is predicated on an underlying data model in
the same way that the serialization of a relation must be.
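To make the CSV comparison concrete, here is a minimal Python sketch. The
sample data, the column meanings, and the element names (plays, play, year,
title) are all invented for illustration and appear nowhere in this thread.
It shows only that the CSV text yields bare strings whose meaning must come
from a table and column model held elsewhere, while the XML text carries its
markup with it. It does not claim that the markup fixes a single model.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two facts, serialized two ways.
csv_text = "1603,Hamlet\n1606,Macbeth\n"
xml_text = (
    "<plays>"
    "<play><year>1603</year><title>Hamlet</title></play>"
    "<play><year>1606</year><title>Macbeth</title></play>"
    "</plays>"
)

# CSV: each row is a bare list of strings. Nothing in the text says which
# column is a year and which is a title; that knowledge lives in the
# table definition, outside the serialization.
csv_rows = list(csv.reader(io.StringIO(csv_text)))

# XML: the markup travels with the text, so a recipient can inspect the
# names used in the instance without being handed a schema first.
first_play = ET.fromstring(xml_text)[0]
xml_names = [child.tag for child in first_play]

print(csv_rows[0])   # the row as unlabeled strings
print(xml_names)     # the element names found in the instance itself
```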

As you point out, SAX, DOM, XPath, XQuery and others are particular conceptual
models imposed at the time of processing upon (and therefore upon the very
particular understanding, for that processing, of) XML instances. So too is the
XML Infoset, despite the way in which, as above, it is often treated as if it
were the a priori model--the Platonic form--of an XML instance. Yet in practice
an infoset is elaborated from an XML instance by the processing of that textual
instance under particular assumptions--precisely as the objects of a DOM or an
XQuery are built from an instance by processing specific to their respective

The schism in XML practice, and underlying theory, is not so much between the
data and document camps as between the text and model viewpoints. That
difference resolves to a fundamental understanding of intent. Proponents of the
model believe that XML instances are to convey particular semantics, with
markup as the tool to circumscribe the most particular definition of expected
meaning. The alternative is to believe that the XML instance may, and should,
be processed by each recipient to yield the most useful semantic outcome for
that recipient on that occasion, regardless of the expectations or intent of
the creator of that instance.
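As a rough illustration of one instance being processed differently by
different recipients, here is a short Python sketch using only the standard
library (xml.dom.minidom, xml.sax, re). The instance, its element and
attribute names, and the three "recipients" are hypothetical. Each recipient
imposes its own model at processing time and takes from the same text what is
useful to it.

```python
import re
import xml.sax
from xml.dom import minidom

# Hypothetical instance; its creator's intent is not known to any recipient.
INSTANCE = "<order><item qty='2'>sonnet</item><item qty='1'>play</item></order>"

# Recipient A builds a DOM tree and asks how many item elements it holds.
dom = minidom.parseString(INSTANCE)
item_count = len(dom.getElementsByTagName("item"))


# Recipient B never builds a tree. It listens to SAX events and sums the
# qty attribute as each start tag goes by.
class QtySum(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.total += int(attrs.get("qty", "0"))


handler = QtySum()
xml.sax.parseString(INSTANCE.encode("utf-8"), handler)

# Recipient C treats the instance purely as text and crudely strips the
# markup to get at the words (a simplification, not a real XML parser).
words = re.sub(r"<[^>]+>", " ", INSTANCE).split()

print(item_count, handler.total, words)
```

None of the three results is "the" meaning of the instance; each is the
outcome of one recipient's processing on one occasion.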

John Cowan argues that if he sends me English and I do not understand English,
then no explanation that he could give me of that instance will enable me to
grasp his meaning, as the conveyance of that meaning is predicated on my
understanding of English or, more precisely, on my sharing with him a body of
assumptions about semantics, which can be summarily described by saying that
the instance is in English. Yet because he cannot know--nor dictate--the use
which I might make of his XML instance, he may not presume that an
understanding of English, or of his intent, is a condition of the use I make of
that instance. This is neither a flippant nor a trivial point. The inherent
advantage of XML as marked-up text is apparent at both ends--at both the
creation and the use of that instance. Unlike CSV as the serialization of a
relation, the XML textual instance neither requires a particular data model at
its creation, nor can it be forced in the markup of the instance itself to
convey one and only one possible model. Similarly, for its recipient, the XML
instance remains fundamentally text and requires no single physical
instantiation of data for processing. This abstraction of the XML text from the
particular physical instantiation of data at either end of a transmission is
generally recognized as the reason an XML 'serialization syntax' is significant
progress in interprocess communication. What is not yet as widely
recognized is that the same abstraction divorces the XML instance from the
particular intent of its creator, at least for a recipient which does not make
the additional effort to re-introduce and apply that intent in a process which
could otherwise be more simply executed without it.


Walter Perry