   Re: [xml-dev] Architectural Forms, A Summary

Thanks, Leigh, for doing this.

"Leigh Dodds" <ldodds@ingenta.com> writes:

> * AFs offer a very limited transformation ability,
> see for detailed summary. Therefore they have a
> limited functional overlap with XSLT.

True enough.  My problem with this is that it's the
first thing on your list.  For me, its importance is so
low that it's really just a footnote.  (More about this
below.)

> * AFs promote co-operation between organisations who
> wish to share data by allowing each organisation to
> continue to manage its own vocabulary. Agreement
> centres on an architectural (or 'meta') schema to
> which the individual vocabularies can be mapped. Each
> party retains sovereignty over their own syntax,
> while having an architectural format to validate other
> documents against. AFs are useful where there's an
> agreement on the essential core, but not on
> non-essential data, or naming. (Also useful
> internally, e.g.  variations across a company)

Well, this implies that the "meta" schema must be
designed as a "meta" schema.  That's not true.  Any
DTD, including any DTD that wasn't designed to be a
meta schema, can be used as a "meta" schema.
"Meta"-ness is a perspective, not an inherent property.

Of course, you can do some extra interesting things if
you design a schema for use as a "meta" schema to begin
with.  Things like "attribute-type architectural
forms", AKA "common attributes".  This is one of the
standard "AFDR" enhancements that were made to the DTD
notation.

Speaking of architectural processors, you should
mention them, and I don't think you did.  The ability
to use plug-in processors for specific inherited
architectures is *the* key economic reason to use AFs,
in my opinion.  (More below.)

> * Because each company provides the mapping from its
> vocabulary to the architectural form, the work is
> distributed amongst the co-operating parties. This
> limits the number of transformations that need to be
> managed by a single party.

Interesting perspective.  I've never thought of it as a
way to distribute *work*; I've always thought of it as
a way to distribute *sovereignty* over vocabularies.

> * Applications are designed to use the architectural
> form, and not the specific vocabularies. There is no
> need to manage local XSLT transforms as each instance
> document (+/- schema) defines its own mapping.

OK, maybe this is where you're talking about modular,
plug-in, architecture-specific, semantic processors.  I
would be happier if you were more explicit about the
advantages of distributing the cost of such modules
among many applications, rather than implying that the
development of each application must shoulder the
entire burden of developing a semantic processing
system for each inherited architecture.

> * Having a common (architectural) format upon which
> to base processing is more flexible than trying to
> support multiple input formats (particularly when not
> all formats can be transformed into one another)

> * Attribute defaulting makes AFs very simple to use
> with DTDs

> * AFs are useful where Format A cannot be properly
> transformed into Format B, and also where only a
> subset of either Format A or B is required for a
> particular process.

I don't understand the relevance of the first clause
("useful where Format A cannot be properly transformed
into Format B").  The second clause is certainly
correct.

> * An individual schema may reference multiple
> architectures. This allows data to be re-used in
> multiple environments. The alternative is to produce
> data in multiple formats dependent on its expected
> use.

For me, the interesting thing about referencing
multiple architectures is that the semantic processing
logic associated with each referenced architecture can
be purchased and plugged into the application as a
re-usable software module.  Each referenced
architecture is money saved, and software
maturity/reliability gained.
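
A minimal sketch of that plug-in idea, assuming a document
whose elements carry one form attribute per inherited
architecture.  The architecture names ("biblio", "linking"),
the form names, and the two engine functions are all
hypothetical stand-ins for separately acquired modules:

```python
import xml.etree.ElementTree as ET

# Hypothetical client document: "entry" is mapped into two architectures.
DOC = """
<report biblio="bibliography">
  <entry biblio="book" linking="anchor">
    <name biblio="title">AF primer</name>
  </entry>
</report>
"""

def biblio_engine(form, elem):
    # Stand-in for a reusable bibliographic processing module.
    return f"biblio:{form}"

def linking_engine(form, elem):
    # Stand-in for a reusable linking module.
    return f"linking:{form}"

# One engine per inherited architecture, keyed by its form attribute name.
ENGINES = {"biblio": biblio_engine, "linking": linking_engine}

def dispatch(root, engines):
    """Hand each element to every engine whose architecture it declares."""
    results = []
    for elem in root.iter():
        for form_att, engine in engines.items():
            form = elem.get(form_att)
            if form is not None:
                results.append(engine(form, elem))
    return results

if __name__ == "__main__":
    for line in dispatch(ET.fromstring(DOC), ENGINES):
        print(line)
```

Each engine sees only the form names of its own architecture,
never the local GIs, so the same module can be plugged into
any application whose documents inherit that architecture.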

> * While AFs can help facilitate co-operation, if
> there is already a single, or primary vocabulary then
> there is little additional benefit to be gained from
> applying them. They're needlessly 'meta'.

To have a single common DTD is perforce to have a
meta-DTD.  All you have to do is reference it as such.
It is not "needlessly meta".  It's already "meta"
whenever you decide to regard it as "meta".  You make
that decision whenever you start needing your first
local variation, and you still want to retain
interoperability with everyone else who is still using
the common DTD, or any AF-driven local variation
thereof.  Regardless of whether it's used as a
"meta"-DTD or as a plain-vanilla DTD, the common DTD
defines what is being interchanged.

> * A corollary to the above seems to be that if none
> of the parties attempting to co-operate already has
> an XML standard, then defining a single vocabulary
> seems to be a valid starting point.

Damn right!

> * AFs are also applicable for achieving reuse across
> horizontal vocabularies (and in this regard appear to
> directly overlap with the goals for XML
> Namespaces). For example linking semantics are fairly
> clear-cut, yet no-one seems keen to have to apply the
> same names to linking elements.

> * AFs can be used to map between schemas, but only if
> the schemas are designed for this, or are very
> similar.

Well, uh, I think it's much clearer and more accurate
to say:

* Either the two DTDs inherit a common architecture, or

* one of the DTDs inherits the other.

In other words, either they are both designed as
specializations of a common DTD, or one is designed as
a specialization of the other.

> * AFs are primarily a way to indicate that particular
> elements in different vocabularies share semantics,
> where the semantics being shared are very general
> (linking, inclusion, etc).

There is no generality requirement.  The shared
semantics can be just as specialized as anybody wants
them to be.  It's true that standards that use AFs,
such as the ISO/IEC 10744:1997 "HyTime" standard, tend
to be very generalized, but that's because of the
requirements for which their architectures are
designed.

> * Neither AFs nor XSLT are true general XML
> transformation languages.  XSLT offers many more
> transformation features than AFs; however,
> transformation isn't the real aim of AFs. 'Mapping'
> might be a better way to put it.

Right.  But this whole "transformation" thing is a red
herring.  It was never the point of AFs.  The
transformation thing is just a side-effect of the
methodology used to implement parsers that can support
the needs of re-usable, architecture-specific semantic
processing modules, such as "HyTime engines".
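
To illustrate why the "transformation" falls out of the
parsing step, here is a rough sketch of a SAX handler that
presents only the architectural view of a document to
whatever engine sits behind it.  The form attribute name
"biblio" and the sample vocabulary are assumptions made for
the example, and the real AF rules (renaming of attributes,
data handling, and so on) are left out:

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical client document with local (French) element names.
CLIENT_DOC = """
<livres biblio="bibliography">
  <livre biblio="book">
    <titre biblio="title">Le Petit Prince</titre>
    <note>local remark, not architectural</note>
    <auteur biblio="author">Antoine de Saint-Exupery</auteur>
  </livre>
</livres>
"""

class ArchitecturalHandler(ContentHandler):
    """Record start/end events under architectural names only."""

    def __init__(self, form_att):
        super().__init__()
        self.form_att = form_att
        self.events = []
        self._open_forms = []  # form (or None) of each open element

    def startElement(self, name, attrs):
        form = attrs.get(self.form_att)
        self._open_forms.append(form)
        if form is not None:
            self.events.append(("start", form))

    def endElement(self, name):
        form = self._open_forms.pop()
        if form is not None:
            self.events.append(("end", form))

def architectural_events(xml_text, form_att):
    handler = ArchitecturalHandler(form_att)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events

if __name__ == "__main__":
    for event in architectural_events(CLIENT_DOC, "biblio"):
        print(event)
```

The engine never sees "livre" or "note"; it sees "book",
"title" and "author", which is all it was written to
understand.  The renaming is incidental to that purpose.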

> * An advantage of AFs is that they can be implemented
> very simply, and work in a streaming mode (e.g. as a
> SAX Filter). XSLT cannot; however XSLT can also be
> used to implement architectural mapping, cf APEX

> * AFs can be used to implement I18N of
> vocabularies. Mapping element/attribute names to/from
> their original language.

That's interesting.  I didn't realize that.

> * AFs as originally specified are closely tied to
> DTDs and Processing Instruction based syntax. However,
> they can be used in isolation or in conjunction with
> another schema language. cf: AFNG

As far as XML is concerned, what you say is correct.
However, the *original* original syntax was based on
NOTATION attributes, a feature of SGML that was
unaccountably omitted from XML.  That's why the
alternative PI-based syntax had to be invented: XML
could not support the NOTATION attribute-based syntax.

> * Both Namespaces and AFs are used to associate
> semantics. Namespaces say "this is an element from
> the X namespace (e.g. XHTML) and should be processed
> as such". AFs say "this element is directly
> equivalent to element Y in architecture B, and should
> be processed as such". With the caveat that
> "processed as such" doesn't necessarily require
> global agreement, but does require local consistency.

I don't understand the words, "doesn't necessarily
require global agreement".  Semantically, global
agreement is required.  Syntactically, global agreement
about names (GIs and attribute names, and the question
of whether certain things are expressed as element
contents vs. attribute values) is not required.

> * Using RDDL, or similar, Namespaces can be made to
> point directly to a description of these semantics
> (thin ice here). No such mechanism for AFs, or rather
> original mechanism used PubId, but there's no
> standard documentation.

Not true.  The AF paradigm requires an "Architecture
Definition Document (ADD)", and both of the syntaxes
for declaring base architectures provide places for
pointers to ADDs.  There are also separate places for
pointing to the DTD.
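
As a hedged illustration of the PI-based declaration, the
sketch below pulls the pseudo-attributes out of an
architecture declaration.  The PI target and the
pseudo-attribute names ("name", "public-id", "dtd-system-id",
"form-att") are assumptions that should be checked against
the AFDR text, and the identifier values are invented.  The
PI is read with regular expressions rather than an XML
parser, purely to keep the example small:

```python
import re

# Hypothetical declaration; names and values are illustrative only.
DECLARATION = (
    '<?IS10744:arch name="biblio" '
    'public-id="+//IDN example.org//NOTATION Biblio Architecture//EN" '
    'dtd-system-id="biblio.dtd" '
    'form-att="biblio"?>'
)

PI_RE = re.compile(r"<\?IS10744:arch\s+(.*?)\?>", re.DOTALL)
PSEUDO_ATT_RE = re.compile(r'([\w.-]+)\s*=\s*"([^"]*)"')

def parse_arch_declarations(text):
    """Return one dict of pseudo-attributes per architecture declaration."""
    return [dict(PSEUDO_ATT_RE.findall(body)) for body in PI_RE.findall(text)]

if __name__ == "__main__":
    for decl in parse_arch_declarations(DECLARATION + "<doc/>"):
        # Pointer to the architecture's defining documentation (assumed name).
        print("architecture:", decl["name"])
        print("definition:  ", decl.get("public-id"))
        # Separate pointer to the meta-DTD itself (assumed name).
        print("meta-DTD:    ", decl.get("dtd-system-id"))
```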

> * A key premise of AFs is that the GI is only one
> property of the element that could be used to direct
> processing. An (architectural) attribute is an
> equally valid dispatch mechanism. This view allows an
> element to have multiple types (i.e. be mapped to
> elements in multiple architectures).  This is in some
> way counter to XML/Namespaces where the GI is the
> type of the element. The mid-ground seems to be that
> the GI defines the primary relationship, and that one
> concedes that (other) attributes can be used to
> dispatch processing. (cf: role attribute pattern).

Cf: XLink.  As far as XLink is concerned, the meta-GIs
are provided by certain standard attributes.  I think
you're trying to make a distinction without a
difference, here.  If the actual GI doesn't *also*
dispatch (or at least affect) at least *some*
processing in some way, what's the point of uttering it
at all?

> The real message here is that when we exchange data
> we agree on how to process it. Using element names is
> one way. Keying of attributes is another, and also
> allows me to have separate agreements with another
> party, but use the same data.

I think the real message of AFs is that with them we
can verify whether a document that conforms to a local
variation of a "standard" (agreed-upon, community-wide)
DTD also, when understood in terms of the mapping that
the instance itself describes, conforms to the
syntactic constraints imposed on instances of the
"standard" DTD.  This is what makes it possible,
economically speaking, to enjoy the advantages of
re-usable, modular, architecture-specific semantic
processing engines.  The basic point is that, when
information is not properly understood on arrival, the
finger of blame can be pointed at the non-conforming
party.  Was the document at fault, or the system that
tried to understand it?
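
A small sketch of that verification step, under loud
assumptions: the form attribute is called "biblio", and the
"standard" DTD is reduced to a toy table of which forms may
appear inside which.  A real architectural processor would
validate against the full meta-DTD content models:

```python
import xml.etree.ElementTree as ET

FORM_ATT = "biblio"  # hypothetical architectural form attribute

# Toy stand-in for the agreed-upon DTD: allowed child forms per form.
ALLOWED_CHILDREN = {
    "bibliography": {"book"},
    "book": {"title", "author"},
    "title": set(),
    "author": set(),
}

def arch_children(elem):
    """Nearest descendants that are mapped to an architectural form."""
    found = []
    for child in elem:
        if child.get(FORM_ATT) is not None:
            found.append(child)
        else:
            found.extend(arch_children(child))  # look through local wrappers
    return found

def conformance_errors(root):
    """Check the document, as mapped by itself, against the toy table."""
    root_form = root.get(FORM_ATT)
    if root_form not in ALLOWED_CHILDREN:
        return [f"document element maps to unknown form {root_form!r}"]
    errors = []
    stack = [root]
    while stack:
        elem = stack.pop()
        form = elem.get(FORM_ATT)
        for child in arch_children(elem):
            child_form = child.get(FORM_ATT)
            if child_form not in ALLOWED_CHILDREN.get(form, set()):
                errors.append(f"{child_form!r} not allowed inside {form!r}")
            stack.append(child)
    return errors

GOOD = ('<livres biblio="bibliography"><livre biblio="book">'
        '<titre biblio="title">x</titre></livre></livres>')
BAD = ('<livres biblio="bibliography">'
       '<titre biblio="title">x</titre></livres>')

if __name__ == "__main__":
    print(conformance_errors(ET.fromstring(GOOD)))
    print(conformance_errors(ET.fromstring(BAD)))
```

An empty list means the document keeps its side of the
bargain; if the receiving engine still fails on it, the
fault lies with the engine.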

The AF paradigm may, in some cases, also allow the same
data to be processed by different architecture-specific
engines.  This is where "multiple inheritance" for
individual elements is important.  The effect is to
allow, in some cases, a single dataset to describe
itself in such a way as to allow it to be understood in
multiple different vendor-specific processing contexts.

Personally, I would like to see the AF idea developed
in such a way as to make it an even more powerful
data-self-description paradigm, so that there would be
no limits on the ability of data to describe its own
transformations for use in various processing contexts.
The result would be that information owners could serve
many markets with a single product, and everyone would
be able to tell who was responsible for any failure to
interchange information.  AFs, as we know them, aren't
quite up to that particular challenge.  Yet.

-- Steve

Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA





 


Copyright 2001 XML.org. This site is hosted by OASIS