Rich Salz wrote:
>Dave Orchard just posted a very detailed article on compatible
>evolution to a (w3c xml/xsd) schema. I haven't read it all yet, but it
>looks very useful:
>http://www.pacificspirit.com/Authoring/Compatibility/ProvidingCompatibleSchemaEvolution.html
>
>
>
I admire Rich's patience: after all these years of DTD experience and
XML Schema development, the best that WXS offers is no
improvement over sticking empty parameter entities in content models
of DTDs, for some common cases?
Actually, XML Schemas is a slight step backwards from DTDs in *some* kinds
of versionability, because it does not allow INCLUDE/IGNORE marked sections.
Schematron's <phase> mechanism and Bob DuCharme's "stages"
(and EvdV's xvif) are all attempts to provide higher-level versions
of INCLUDE/IGNORE. Dave is (as usual) astute in pointing out
the differences between versioning and extensibility, but there
are other kinds of versioning too: staged, progressive validation
of documents-in-progress in particular.
The usefulness of open content models is old hat (Roger Costello's
hat, for one). I don't want to appear critical without having any
alternative to offer. Drum roll. Here is one way to do it in Schematron,
treating it as a problem of separating constraints into those of
genus (open, extensible) and those of species (closed).
First, here is the basic schema:
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>Basic schema</title>
  <p>This schema validates the basic documents as
  in Dave Orchard's useful "Providing Compatible Schema Evolution"
  http://www.pacificspirit.com/Authoring/Compatibility/ProvidingCompatibleSchemaEvolution.html
  </p>
  <!-- Closed: genus and species constraints both apply -->
  <phase name="Basic">
    <active pattern="BasicGenus" />
    <active pattern="BasicSpecies" />
  </phase>
  <!-- Open: only the genus constraints apply -->
  <phase name="BasicOpen">
    <active pattern="BasicGenus"/>
  </phase>
  <!-- Genus: constraints every version of a name must satisfy -->
  <pattern id="BasicGenus">
    <rule context="name">
      <assert test="count(first)=1">A name should have a single first
      name</assert>
      <assert test="count(last)=1">A name should have a single last
      name</assert>
      <assert test="first/following-sibling::last">The first name
      should come before the last name</assert>
    </rule>
  </pattern>
  <!-- Species: closes the content model for this particular version -->
  <pattern id="BasicSpecies">
    <rule context="name">
      <assert test="count(*)=2">A name can only have a first and last
      element</assert>
    </rule>
  </pattern>
</schema>
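To make the two phases concrete, here is a small instance document of my own invention (the <names> wrapper, the <middle> element and the values are hypothetical, not taken from Dave's article). The comments say how the patterns above are meant to treat each <name>:

```xml
<!-- Hypothetical wrapper element so that both samples fit in one document -->
<names>
  <!-- Passes in phase "Basic" and in phase "BasicOpen":
       one first, one last, in that order, and nothing else -->
  <name>
    <first>Jane</first>
    <last>Doe</last>
  </name>
  <!-- Passes in "BasicOpen" only: BasicGenus is satisfied, but
       count(*) is 3, so BasicSpecies reports an error in "Basic" -->
  <name>
    <first>Jane</first>
    <middle>Q</middle>
    <last>Doe</last>
  </name>
</names>
```

Because the rule context is simply "name", each <name> element is checked separately wherever it occurs.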
In the phase "Basic" this schema is closed; in the phase "BasicOpen"
this schema is open: you can extend it by validating with another schema
in parallel, or by adding new patterns:
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>Extended schema</title>
  <p>This schema validates the extended documents as
  in Dave Orchard's useful "Providing Compatible Schema Evolution"
  http://www.pacificspirit.com/Authoring/Compatibility/ProvidingCompatibleSchemaEvolution.html
  </p>
  <!-- Closed phases: genus plus species constraints -->
  <phase name="Basic">
    <active pattern="BasicGenus" />
    <active pattern="BasicSpecies" />
  </phase>
  <phase name="Extended">
    <active pattern="BasicGenus" />
    <active pattern="ExtendedGenus" />
    <active pattern="ExtendedSpecies" />
  </phase>
  <!-- Open phases: genus constraints only -->
  <phase name="BasicOpen">
    <active pattern="BasicGenus"/>
  </phase>
  <phase name="ExtendedOpen">
    <active pattern="BasicGenus"/>
    <active pattern="ExtendedGenus" />
  </phase>
  <pattern id="BasicGenus">
    <rule context="name">
      <assert test="count(first)=1">A name should have a single first
      name</assert>
      <assert test="count(last)=1">A name should have a single last
      name</assert>
      <assert test="first/following-sibling::last">The first name
      should come before the last name</assert>
    </rule>
  </pattern>
  <pattern id="BasicSpecies">
    <rule context="name">
      <assert test="count(*)=2">A name can only have a first and last
      element</assert>
    </rule>
  </pattern>
  <!-- The middle name is optional, but if present it must sit
       between the first and last names -->
  <pattern id="ExtendedGenus">
    <rule context="name">
      <assert test="count(middle) &lt;= 1">A name may have a single
      middle name</assert>
      <assert test="not(middle) or first/following-sibling::middle"
      >The first name should come before the middle name</assert>
      <assert test="not(middle) or middle/following-sibling::last"
      >The middle name should come before the last name</assert>
    </rule>
  </pattern>
  <pattern id="ExtendedSpecies">
    <rule context="name">
      <assert test="count(*)=count(first) + count(middle) + count(last)"
      >A name can only have a first, middle and last element</assert>
    </rule>
  </pattern>
</schema>
So in one schema we have (by invoking the schema in a different phase):
* a closed basic schema
* an extensible (open content model) basic schema
* a closed new version of the schema with extra content in the middle
* an extensible (open content model) extended schema with extra content in
the middle
Versioning + extensibility.
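As a sketch of how the four phases are meant to sort documents, here are three made-up instances (the <names> wrapper, the <suffix> element and all values are hypothetical). The comments give the outcome the patterns are intended to produce in each phase:

```xml
<!-- Hypothetical wrapper element; each name is checked on its own -->
<names>
  <!-- Basic: fails (three children)   BasicOpen: passes
       Extended: passes                ExtendedOpen: passes -->
  <name>
    <first>Jane</first>
    <middle>Q</middle>
    <last>Doe</last>
  </name>
  <!-- Basic: fails                    BasicOpen: passes
       Extended: fails (suffix is not first, middle or last)
       ExtendedOpen: passes (the open phase tolerates the extra element) -->
  <name>
    <first>Jane</first>
    <middle>Q</middle>
    <last>Doe</last>
    <suffix>Jr</suffix>
  </name>
  <!-- Basic: fails                    BasicOpen: passes
       Extended: fails                 ExtendedOpen: fails
       (two middle names break count(middle) <= 1 in ExtendedGenus,
        which is active in both extended phases) -->
  <name>
    <first>Jane</first>
    <middle>Q</middle>
    <middle>R</middle>
    <last>Doe</last>
  </name>
</names>
```

The last case is the interesting one: an open phase still enforces the genus constraints, so "open" does not mean "anything goes".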
The trouble with Schematron (to some people) is that a schema that
(they expect) should be simple can in fact be composed of many
constraints: the cost is (to them) verbosity. But the above example
shows the other side of the coin: grammars can conflate several different
constraints (position and occurrence) in a way that prevents extension;
grammars sometimes overconstrain you. The grammar constraint that a <name>
contains a <first> as its first child, a <last> as its last child, and
nothing else is such an overconstraint: all that is needed is that a
<name> contains a <first> and a <last> in that order. Schematron can help
extensibility by letting you tease out the different constraints; for
example, in the above schemas we separate out the content constraints we
want on all documents (genus?) from the particular constraints we want on
particular closed schemas (species?).
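For contrast, here is roughly what the overconstraining grammar looks like when written as a W3C XML Schema content model. This is only a sketch of my own, not taken from Dave's article, and typing the children as xs:string is an assumption:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name">
    <xs:complexType>
      <!-- One xs:sequence fixes order, occurrence and closure at once:
           exactly one first, then exactly one last, and nothing else -->
      <xs:sequence>
        <xs:element name="first" type="xs:string"/>
        <xs:element name="last" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A <middle> between the two children, or anything after <last>, is rejected here, even though the real requirement (a first and a last, in that order) says nothing against it. Admitting it means editing the content model itself, whereas the Schematron patterns state each constraint separately.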
One kind of versioning this does not cope with is where you want your
new schema to have a new namespace, but to validate documents in the
old namespace. I don't think namespaces can thrive without it. One way
would be for all languages that recognize namespaces (XSLT, schemas, etc.)
to allow a simple * wildcard in namespace names, which could be used for
version numbers. Rather than needing complex schema mechanisms to cope
with versioning, part of the cost should be borne by processing systems:
that would allow you to have some kind of minor/major numbering of your
schemas in the namespace.
Cheers
Rick Jelliffe