RE: Enlightenment via avoiding the T-word

> From: Don Park [mailto:donpark@docuverse.com]


> What we need is a XML processing framework that supports molecules and
> atoms design pattern.  What we don't need is more complications and fancy
> if-but-when-must at what I consider to be lexical slum level.  *argh* I am
> starting to sound like Len.

I'd go even further. We need an XML processing framework that accepts the
plurality of application domains without prejudice or favoritism. We need an
XML processing framework that does not take a specific metadata vocabulary
for annotating information items and a particular set of transformations and
bless them and insist that everyone accept the notion that these are not
annotations and transformations, but rather the process of realizing an XML
instance's True Form. The only true form of an instance is that of the
instance itself, and that's nothing but a bunch of text and pointy brackets.
Everything else is layered atop that to suit a particular application domain
or processing model.

I'll put forth a different formulation of some thinking I tried to express
yesterday. I feel like I've had an epiphany (though that feeling may
evaporate once someone rhetorically rips me to shreds).

It is very typical for an XML application to want to associate certain
metadata with XML information items to suit certain processing needs. One
can easily envision different metadata vocabularies to suit different
domains. None of these are inherent in the instance itself. None of the
processing done with the instance and associated metadata is a realization
of the instance's true form. The only true form of the document is that
which is in the instance itself, and that's just a bunch of text and pointy
brackets.

One particular class of application is that which we call a validator. The
metadata a validator wishes to associate with an information item is a
grammar or set of rules that express a set of constraints. Validators verify
that a document satisfies the collective set of constraints associated with
its information items. If the document satisfies the constraints, the
validator passes it on to another application for further processing;
otherwise, the document is rejected. There is no such thing as a well-formed
but inherently invalid XML document. It is only invalid within the context
of a particular domain, and the constraints suited to that domain can be
expressed in a schema.
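
To make that concrete, here is a minimal sketch in Python (standard library
only). The rule tables ORDER_RULES and ARCHIVE_RULES and the helper names are
hypothetical, invented for illustration; they stand in for whatever schema
language a domain actually uses. The point is only that the same well-formed
instance passes or fails depending on which constraints an application chooses
to associate with it.

```python
import xml.etree.ElementTree as ET

def has_child(tag):
    """Build a constraint: the element must have a direct child named `tag`."""
    return lambda elem: elem.find(tag) is not None

# Hypothetical constraint sets for two domains. Neither lives in the instance.
ORDER_RULES = {"order": [has_child("shippingAddress")]}
ARCHIVE_RULES = {"order": []}

def validate(xml_text, rules):
    """Return True if every element satisfies the constraints tied to its name."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    for elem in root.iter():
        for check in rules.get(elem.tag, []):
            if not check(elem):
                return False
    return True

DOC = "<order><billingAddress>1 Main St</billingAddress></order>"

# The same instance is rejected in one domain and accepted in another.
in_order_domain = validate(DOC, ORDER_RULES)
in_archive_domain = validate(DOC, ARCHIVE_RULES)
```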

Another class of application is transformers. These produce a different
information set better suited for processing within a particular application
domain. At one extreme, we have those that associate XSLT templates with
elements, and use these to transform a document into something potentially
quite different. At another extreme we have those that do very simple
transformations, such as adding default attributes. Then there are many
shades of gray between these extremes, such as Simon's namespace-mapping SAX
filters. There is no such thing as a wrong transformation, except one that
produces an unintended result.
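
The simple end of that spectrum can be sketched in a few lines of Python. The
DEFAULTS table and the element and attribute names are assumptions made up for
this example, not taken from any schema. Seen this way, defaulting is an
ordinary transformation that produces a new information set for the next
application, no different in kind from a stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults an application might want; not part of the instance.
DEFAULTS = {"address": {"country": "US"}}

def add_defaults(xml_text, defaults):
    """Return a parsed tree in which missing attributes are filled in."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            if name not in elem.attrib:  # never override what the author wrote
                elem.set(name, value)
    return root

DOC = '<order><address/><address country="CA"/></order>'
transformed = add_defaults(DOC, DEFAULTS)
countries = [a.get("country") for a in transformed.iter("address")]
```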

Other applications may wish to annotate information items with additional
labels that provide hints to applications for further processing. For
instance, one may want to attach a label to elements "shippingAddress" and
"billingAddress" that indicates both of these represent addresses and should
be processed as such by an application. We should not be enshrining one
metadata vocabulary in a PSVI and insisting that that one vocabulary and no
other is intrinsic to the true form of document instances.
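
A rough Python sketch of that kind of annotation follows. The LABELS
vocabulary and the function names are hypothetical. Note that the labels are
kept in a side table owned by the application, so the instance is left
untouched and another application is free to apply a different vocabulary to
the same document.

```python
import xml.etree.ElementTree as ET

# Hypothetical label vocabulary for one application domain.
LABELS = {
    "shippingAddress": frozenset({"address"}),
    "billingAddress": frozenset({"address"}),
}

def annotate(root, labels):
    """Map each element to its labels, without modifying the tree."""
    return {elem: labels.get(elem.tag, frozenset()) for elem in root.iter()}

def names_with_label(root, labels, label):
    """List, in document order, the element names carrying `label`."""
    notes = annotate(root, labels)
    return [elem.tag for elem in root.iter() if label in notes[elem]]

DOC = "<order><shippingAddress/><billingAddress/><total/></order>"
root = ET.fromstring(DOC)
addresses = names_with_label(root, LABELS, "address")
```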

With the status quo, however, validators are accorded special status. They
are not like other applications. The processing they do is regarded as
something very fundamental and inviolate. In addition, validators are
allowed to add certain annotations from a specific blessed metadata
vocabulary (XML Schema), and they are allowed to perform certain specific
blessed transformations. These blessed transformations and annotations are
considered part of the true form of the instance, enshrined in the PSVI, and
annotations or transformations that are not blessed are derided as
desecrations of the infoset. The lines drawn strike me as rather arbitrary.

If we reject the PSVI, and if we accept there are no wrong transformations
(except those that produce an unintended result), and if we accept that
there are potentially many metadata vocabularies suited to different
application domains and many valid processing models, and that none of these
are somehow intrinsic to a document instance, then it seems to me that much
of the fodder for argument simply evaporates. One can even imagine a more
flexible schema mechanism that can be invoked in different modes like an
XSLT stylesheet. One could invoke it in "elementFormQualified" mode, or
"elementFormUnqualified" mode; the application gets the information in the
form most suited to its needs, and we dispense with religious debate over
which is its true form.
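
As a loose illustration of invoking one mechanism in different modes, consider
the Python sketch below. Only the two mode names come from the paragraph
above; the view() function and its behavior are assumptions for the sake of
the example, and it is much cruder than what XML Schema means by element form.
In "elementFormQualified" mode every element name is reported with its
namespace; in "elementFormUnqualified" mode only the root keeps its namespace
and descendants are reported by local name.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag, if present."""
    return tag.rsplit("}", 1)[-1]

def view(xml_text, mode):
    """Report element names in the form the calling application asked for."""
    elems = list(ET.fromstring(xml_text).iter())  # root comes first
    if mode == "elementFormQualified":
        return [e.tag for e in elems]
    if mode == "elementFormUnqualified":
        return [elems[0].tag] + [local_name(e.tag) for e in elems[1:]]
    raise ValueError("unknown mode: " + mode)

DOC = '<o:order xmlns:o="urn:example:orders"><o:item/></o:order>'
qualified = view(DOC, "elementFormQualified")
unqualified = view(DOC, "elementFormUnqualified")
```

Neither result is the document's true form; each is just the view a
particular application asked for.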

In hindsight, now, I have to agree with Mike Champion's post about
scholasticism[1]. This debate smacks of scholasticism because it centers so
much on the debate about the true form of an XML instance. I think the XML
world needs its renaissance, and we must start by dethroning the PSVI.

[1] http://lists.xml.org/archives/xml-dev/200108/msg01020.html