
Enlightenment via avoiding the T-word

From: Tim Bray <tbray@textuality.com>
To: xml-dev@lists.xml.org
Date: Sat, 25 Aug 2001 18:30:46 -0700

In reviewing this thread, it seems to me that there is a certain
word beginning with T the occurrence of which is strongly 
correlated with exuberant bursts of extrapolation and exegesis.

So let me present an alternate reality which I claim to be
internally consistent, consistent with all interesting real-world
software implementations, and free of voodoo or magic:

XML provides a method for textually encoding data objects;
the encoding allows the components of the data objects (whether
provided as elements or attributes) to be given labels.  Let's
call these labels "labels".  The label syntax is given by the 
"Name" production from the XML recommendation.

XML Namespaces allow the labels to be extended by the addition
of a URI reference.  Let's call these extended labels "ulabels".
Because of the defaulting mechanism, ulabels cannot be 
distinguished syntactically from labels without processing their 
context.
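
A minimal sketch of the defaulting point, using Python's standard
xml.etree.ElementTree. The namespace URI and the element names are made up
for illustration. The text `<item>widget</item>` is identical in both
documents; only the surrounding context decides whether the parser reports
a label or a ulabel.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the <item> element is written the same way in both.
PLAIN = "<order><item>widget</item></order>"
DEFAULTED = (
    '<order xmlns="http://example.com/ns/orders">'
    "<item>widget</item>"
    "</order>"
)


def names(xml_text):
    """Return the name ElementTree reports for each element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


print(names(PLAIN))      # bare labels
print(names(DEFAULTED))  # ulabels, written by ElementTree as {uri}label
```

ElementTree writes a ulabel as `{uri}label`, so the difference only shows
up after parsing, not in the markup of the element itself.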

A large variety of software programs (and humans) process XML.
In selecting which components to process, and what processing to
do, they are observed to use a wide variety of input information.
Most obvious is the label (or ulabel if provided), but other
relevant information can include
 - the context of the component in the XML document
 - whether it has an attribute with a particular label
 - the value of an attribute with a particular label
 - some external operation based on the content of the 
   component; e.g. treat it as a part number, look up the
   inventory in a database, and process it only if the
   count on hand is below a critical value
 - entirely external information such as the time of day or
   the identity of the software's user
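
The kinds of input listed above can be sketched in one small selection
function. Everything here is an assumption made for illustration: the
`catalog`, `featured` and `part` labels, the `status` attribute, the
`STOCK` dictionary standing in for an inventory database, and the
`user_may_order` flag standing in for the identity of the user.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; none of these labels come from a real vocabulary.
DOC = """\
<catalog>
  <featured><part status="active">A-100</part></featured>
  <part status="active">B-200</part>
  <part status="retired">C-300</part>
  <part>D-400</part>
  <part status="active">E-500</part>
</catalog>
"""

# Hypothetical stand-in for an inventory database: part number -> count on hand.
STOCK = {"A-100": 3, "B-200": 2, "C-300": 0, "D-400": 1, "E-500": 50}
CRITICAL = 5


def parts_to_reorder(xml_text, stock, critical, user_may_order=True):
    """Select part numbers using the label plus several other inputs."""
    if not user_may_order:                    # entirely external information
        return []
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for el in root.iter("part"):              # the label
        parent = parent_of.get(el)
        if parent is not None and parent.tag == "featured":
            continue                          # context in the document
        if "status" not in el.attrib:
            continue                          # presence of an attribute
        if el.get("status") != "active":
            continue                          # value of an attribute
        number = (el.text or "").strip()
        if stock.get(number, 0) >= critical:
            continue                          # external lookup on the content
        selected.append(number)
    return selected


print(parts_to_reorder(DOC, STOCK, CRITICAL))
```

All five components carry the same label, yet the label alone decides
nothing: each of the other checks removes a different `part`.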

One class of software application is called "validation", 
which consists in determining whether one or more components 
in an XML document (or the entire document) meet the 
constraints described in a declarative specification usually 
called a "schema".

The original XML recommendation included the specification of 
a constraint language.  This has supported the mistaken belief 
that validation is uniquely special and important among all 
the classes of applications which process XML.

Historically, the only validation available was based on 
the DTD (an acronym we can't expand here).  This ties 
constraints to elements *only* on the basis of their label,
and to attributes based on the combination of their label 
and that of the element to which they are attached.  This 
limitation, and an unfortunate choice of terminology in the 
XML recommendation, has supported the mistaken belief that 
labels are mystically tied one-to-one to validation 
constraints and other semantics.  DTD validation has no 
support for the use of ulabels.

Modern validation facilities such as XSDL, Schematron, and 
RELAX allow the tying of constraints to components in a much 
more flexible way, including element context.  They also 
provide good support for constraining combinations of 
components with labels, ulabels, or a mixture of the two.
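
To make "including element context" concrete, here is a rough sketch in
plain Python. It is not XSDL, Schematron, or RELAX syntax, and the labels
and rules are invented for illustration. The constraint is looked up by
the pair (parent label, label), so the same label can carry different
constraints in different contexts, which a lookup keyed on the label
alone cannot express.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules keyed on (parent label, label), not on the label alone.
RULES = {
    ("book", "title"): lambda text: len(text) > 0,
    ("person", "title"): lambda text: text in {"Dr", "Ms", "Mr", "Prof"},
}

# Hypothetical document: "title" appears under two different parents.
DOC = """\
<library>
  <book><title>XML in Practice</title></book>
  <person><title>Dr</title></person>
  <person><title>XML in Practice</title></person>
</library>
"""


def violations(xml_text):
    """Return one message per child element that breaks its contextual rule."""
    root = ET.fromstring(xml_text)
    problems = []
    for parent in root.iter():
        for child in parent:
            rule = RULES.get((parent.tag, child.tag))
            text = (child.text or "").strip()
            if rule is not None and not rule(text):
                problems.append(f"{parent.tag}/{child.tag}: {text!r}")
    return problems


print(violations(DOC))
```

The same text is accepted as a book title and rejected as a person's
title; only the context differs.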

There is an open debate as to whether or not, in newly
defined vocabularies, ulabels should be specified for 
all the elements.  Those who feel that they should be are
irritated by XSDL's apparent bias in the other direction.

At the end of the day, labels are just labels.  They are
one of the stepping stones from content to semantics, but
only one.  

If anyone wants to take issue with this, please try to do
so without using the T-word. -Tim

Follow-Ups:
- Re: Enlightenment via avoiding the T-word
  - From: "Simon St.Laurent" <simonstl@simonstl.com>
- Re: Enlightenment via avoiding the T-word
  - From: Paul Prescod <paulp@ActiveState.com>
