Re: [xml-dev] Re: determining ID-ness in XML

"Simon St.Laurent" wrote:

> On Wed, 2001-11-21 at 08:33, Sean McGrath wrote:
> > At 18:27 21/11/2001 +1100, Rick Jelliffe wrote:
> > >
> > >So then we get
> > >   XML
> > >     +/- validation
> > >     +/- PSVI  contributions
> > >     +/- namespace
> > >     +/- Blubbery
> > >     +/- ids
> > >     +/- xml:base
> > >     +/- xml:include
> > >     +/- xlink embed
> > >
> > >= 256 flavours of XML systems even without addressing entity defaulting.
> > >
> > >Thin uncoordinated layers are the recipe for non-interoperability.
> >
> > Amen!
> And more Amen!
> (Finally, a choir I can join!)

And again Amen. But now that we have seen the light of revelation, what are we
going to do about it? [Shameless plug: My talk at XML2001 in three weeks
addresses this subject--http://www.xml.com/pub/e_sess/65.]

As I see it, the possibilities are:

    1) Build tightly integrated, fully-featured, monolithic XML processors.
Strictly speaking, this is the converse of 'thin uncoordinated layers'. These
processors would, of course, have to be updated for every change of every
spec--even specs which people might assume were incorporated by reference, as
we learned in the Blueberry debate. This approach would ignore what I believe
are generally conceded to be the most important lessons of the past twenty
years of software design. And as a practical matter, these monolithic
processors wouldn't address anyone's specific business practices or particular
processing needs:  their output, at best, would be some canonical infoset, to
which every real-world business-domain application would then have to be
written. Granted, this approach has worked:  SAX does, after all, specify an
essentially equivalent approach, but at (at least!) two orders of magnitude
less complexity.
Furthermore, from the earliest days of SAX there was the expectation of
pipeline processing, routing the output of a simple parser through a sequence
of domain-specific filters and processors, rather than building each new
complexity back into the original parser/processor.
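
As a rough illustration of that pipeline expectation, here is a minimal sketch
using Python's standard xml.sax module. The two filters and the element name
"internal" are hypothetical, invented only for this example; the point is that
each stage is a thin layer wrapped around the one before it, and the parser
itself is never modified.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElements(XMLFilterBase):
    """Hypothetical filter: removes the named elements and their content."""

    def __init__(self, parent, names):
        super().__init__(parent)
        self._names = set(names)
        self._depth = 0  # greater than 0 while inside a dropped element

    def startElement(self, name, attrs):
        if self._depth or name in self._names:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


class UppercaseText(XMLFilterBase):
    """Hypothetical domain-specific stage: upper-cases character data."""

    def characters(self, content):
        super().characters(content.upper())


def run_pipeline(xml_text):
    """Route simple parser output through two filters, then serialize it."""
    out = io.StringIO()
    parser = xml.sax.make_parser()
    stage1 = DropElements(parser, {"internal"})  # wraps the parser
    stage2 = UppercaseText(stage1)               # wraps the first filter
    stage2.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    stage2.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


if __name__ == "__main__":
    print(run_pipeline("<order><id>a1</id><internal>secret</internal></order>"))
```

Adding, removing, or reordering a stage changes only the few lines that wire
the chain together, which is the property the pipeline model is meant to
preserve.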

    2) Establish a W3C (or other standards/specification body) processor
integration activity. If successful, this yields a result equivalent to (1),
but realized in two steps--the specification, followed by the (presumably,
after all this work, interoperable) implementations--rather than the de facto
standardization on one vendor's, or another's, particular monolithic processor.
There is certainly a solid argument that it is better to have multiple
interoperable implementations of these monolithic processors, as compared to
vendor-proprietary offerings. Yet the processors will suffer in any case from
the same fundamental weaknesses of over-complexity and from the inherent
instability of being subject to replacement to accommodate every new
development in every one of their constituent specifications.

    3) Grant the chief power of process regulation in each business domain to
the specifiers of a vertical market transactional data vocabulary promulgated
for each identified field of activity. There are business domains in which this
is already happening, including at least a few where the domain itself has been
created, or at least integrally defined for the first time, by that vertical
market vocabulary. The problem here is a subtler form of the same failure as in
(1) to address the specifics of any one participant's business. What is
standardized in the vertical market vocabulary is the common denominator of a
business domain, but I have rarely, if ever, seen an enterprise which would
describe itself as nothing more than the common denominator of its industry,
or which could survive without more specific competencies to differentiate
itself from its competitors. The vertical market vocabulary standard is the pure
recipe for a cartel, but given that each participant's expertise and particular
concentration will vary from every other's, the common vocabulary quickly
becomes the basis of *interchange* rather than of *integration*. That takes us
back where we came in. Instead of integrated processes, we have an inherent and
ongoing tension between the integrated, orthodox whole and the imperative for
each enterprise to substitute its own thin layer for some portion of that
whole, or to supplement it. In practice, this comes to mean that at their
gateways individual
enterprises will transform in and out of the expected data form in order to
accommodate strictly local processing demands. At that point, we no longer have
integration based on standard processing, but merely interchange based on a
transformation which presents an expected data structure as if the expected
processing were still behind it. Over time, that transformation will grow more
difficult--in some cases, impossible--because of how far the local processing
has diverged from orthodox expectations. Eventually enterprises will be tempted
to open private exchanges with others whose similar processing needs have led
them to diverge from the orthodoxy in similar ways. And at that point, we no
longer have even interchange:  multiple local variations of process result in
varying output data structures with no single accepted way to integrate them,
let alone to agree on a processing standard which would yield that integrated

    4) Accept from the outset that processing--and therefore the processing
model, its order of operations, and the particular form of data input which it
expects and the 'natural' data structure which is its output--are inherently
matters of local expertise and local need at each processing node. Let those
who promulgate general specifications define the thinnest possible layers
precisely to have the greatest possibility of having each of them accepted for
the greatest range of divergent uses by the widest variety of processors. This
means that the form of integration and the specific interdependencies of
various specs will inevitably vary by implementation, precisely so that they
can be arranged to produce the results required by each different processor.
The ultimate implication is that no upstream creator of data may impose its
intent on a downstream processor. Regardless of which of the 256 (or 2048, or
. . .) variants of processing an XML document may have been created for,
the point of each processor's individual expertise is that the same variety of
possibilities is open to every subsequent processor of that document, in order
to achieve its particular goals.
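
For what it is worth, the counts quoted above follow from treating each
optional layer as an independent on/off switch. A small sketch of that
arithmetic, using the eight layers from Rick's list (the assumption that 2048
corresponds to three further, unnamed layers is mine):

```python
from itertools import product

# The eight optional layers from the quoted list, each either on or off.
LAYERS = [
    "validation", "PSVI contributions", "namespace", "Blubbery",
    "ids", "xml:base", "xml:include", "xlink embed",
]


def flavours(layers):
    """Enumerate every on/off combination of the given layers."""
    return [dict(zip(layers, choice))
            for choice in product((False, True), repeat=len(layers))]


def flavour_count(n_layers):
    """Number of distinct combinations for n independent optional layers."""
    return 2 ** n_layers


if __name__ == "__main__":
    print(len(flavours(LAYERS)))           # eight layers
    print(flavour_count(len(LAYERS) + 3))  # three more hypothetical layers
```

Every additional optional layer doubles the number of variants a downstream
processor may encounter.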


Walter Perry