Re: [xml-dev]Changing Namespaces Between Specification Versions


From: rjelliffe@allette.com.au
To: xml-dev@lists.xml.org
Date: Fri, 24 Apr 2009 13:51:46 +1000 (EST)

Fraser wrote:

> In many circumstances both XSD itself
> and validation against XSD is just too brittle.

For what it is worth, here is my two cents concerning schemas for
long-lived or open or multi-party or public data:

* Namespace URIs should never identify versions of schemas, but rather a
general semantic area and controlling authority. They should be enough to
determine which class of application or plugin is appropriate. THEREFORE
software applications need to use more than the namespace to identify the
schema and software. This is analogous to CODECs for media applications:
the user sees JPEG or MPEG etc, but that just invokes a system that sniffs
the data and selects (or downloads) the appropriate CODEC.

* All schemas and documents should have separate version numbers, which
feature minor and major numbers for the schema: if the old schema can be
derived by restriction from the new schema, it is a minor version,
otherwise it is a major version. THEREFORE all documents that were valid
against a schema with a lesser minor number will be valid against the
schema with a new minor number (given the same major number.)

* Documents should use the lowest major and minor number that corresponds
to the features actually used in the document. THEREFORE no document will
be unnecessarily rejected when the receiver is an older system.

* Where the application space has competing standards, or the documents
are compound, or where the standards are not stabilized, or where there
are plain and fancy alternatives, or where there is churn and evolution,
the IS29500 Part 3 Markup Compatibility and Extensibility (MCE) mechanism
should be adopted. This allows alternative sections, marked up by
namespace URI, with a must-understand mechanism; for any public material,
at least one choice in a plain form should be provided (e.g. as well as
supporting SVG, support JPEG alternatives.) THEREFORE the receiver can
choose the optimal client.

* Where there are multiple versions of a standard, there should be a
Schematron schema made which can report unambiguously which version was
used. In other words, the evolution of the schema should be schematified.
THEREFORE a client can base its decision to process on relevant changes
in the schema, i.e. on elements actually found, not irrelevant ones: the
schema may have changed in ways that are irrelevant to the application.

* Use Schematron patterns to abstract out and model commonality between
major numbers and the variations for minor numbers.

* Use Schematron phases to abstract out and model the evolution of the
schema through major and minor numbers. Terminal applications in a
processing graph should select the phases or patterns to validate incoming
data against based only on their needs, and ignore irrelevant constraints.
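
The version-number rules above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the element names,
the INTRODUCED_IN table and the example namespace are all invented here,
and a real deployment would derive them from the schema history (for
example from a Schematron schema) rather than hard-code them.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int


# Hypothetical feature table: which version first introduced each element.
INTRODUCED_IN = {
    "order": Version(1, 0),
    "id": Version(1, 0),
    "discount": Version(1, 1),
}


def lowest_version(xml_text: str) -> Version:
    """Return the lowest version that covers every element actually used."""
    root = ET.fromstring(xml_text)
    # Strip any "{namespace}" prefix: the namespace does not carry the version.
    used = [INTRODUCED_IN[el.tag.split("}")[-1]] for el in root.iter()]
    return max(used)


def can_process(supported: Version, document: Version) -> bool:
    """Same major number, and a minor number no newer than the receiver's."""
    return document.major == supported.major and document.minor <= supported.minor


# Hypothetical documents in a made-up namespace.
PLAIN = '<order xmlns="http://example.org/ns/order"><id>7</id></order>'
FANCY = ('<order xmlns="http://example.org/ns/order">'
         '<id>7</id><discount percent="5"/></order>')
```

Here a receiver that only knows version 1.0 can still accept PLAIN, because
that document is labelled by the features it actually uses, while FANCY
needs a receiver at 1.1 or later within the same major number.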

Part of the problem of versioning is that the standard schema languages
were made with little practical thought about versioning.

SGML DTDs had a workable marked section system that provided a measure of
support for modeling variants in the same schema document, but not as
first-class objects. XSD has its notions of type derivation, but they are
fragile and weak (derivation by extension!) and interact poorly with other
parts of the language (UPA, etc). DSDL has a story (DSRL for token
changes, NVDL for modularity, XProc for smarts like versioning) but not a
reality yet (XProc is only just out of the oven.)

Only Schematron has first-class objects (patterns, phases) with enough
power to model schema evolution. And even in Schematron the schema
probably needs to be written with the intent that the schema can be
further evolved.
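
As a rough illustration of what that looks like, here is a tiny Schematron
schema in which each phase names the patterns that apply to one version.
The phase ids, pattern ids and element names are made up for this sketch.
The Python around it does not run Schematron validation; it only reads
the schema with the standard library and lists which patterns each phase
activates.

```python
import xml.etree.ElementTree as ET

SCH = "http://purl.oclc.org/dsdl/schematron"

# Hypothetical schema: one phase per version, one pattern per increment.
SCHEMA = """\
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:phase id="v1.0">
    <sch:active pattern="core"/>
  </sch:phase>
  <sch:phase id="v1.1">
    <sch:active pattern="core"/>
    <sch:active pattern="added-in-1.1"/>
  </sch:phase>
  <sch:pattern id="core">
    <sch:rule context="order">
      <sch:assert test="id">An order must have an id.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:pattern id="added-in-1.1">
    <sch:rule context="order/discount">
      <sch:assert test="@percent">A discount must state a percent.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
"""


def active_patterns(schema_text: str) -> dict:
    """Map each phase id to the list of pattern ids it activates."""
    root = ET.fromstring(schema_text)
    return {
        phase.get("id"): [
            active.get("pattern")
            for active in phase.findall(f"{{{SCH}}}active")
        ]
        for phase in root.findall(f"{{{SCH}}}phase")
    }
```

Calling active_patterns(SCHEMA) shows that the 1.1 phase reuses the "core"
pattern and adds one more, so the commonality and the increment are both
explicit objects in the schema.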

The problem is not schema languages: it is the unprofessionalism of schema
professionals, if I may be uncomfortably frank. We make schemas without
serious thought to maintenance; and because of this we choose schema
languages which do not have any serious support for evolution (and, even
if we do use Schematron, we don't organize our schemas so that systems
using them will be written to cope with change.)

Let me be more challenging: if you (the schema developer) adopt a schema
language which has no real support for evolution, then of course you will
eventually have trouble: you have dug the hole yourself. Now perhaps I
might be accused of blaming the victim, but if you choose to adopt an
inadequate schema language, which has had this inadequacy publicized for
years, you cannot consider yourself a victim of the inadequacies of XSD
etc. They are what they are, and you don't have to adopt them!

Cheers
Rick Jelliffe

