OASIS Mailing List Archives

   Schema evolution as Genus and Species (Re: [xml-dev] Schema evolution)

Rich Salz wrote:

>Dave Orchard just posted a very detailed article on compatible
>evolution to a (w3c xml/xsd) schema.  I haven't read it all yet, but it
>looks very useful:
>http://www.pacificspirit.com/Authoring/Compatibility/ProvidingCompatibleSchemaEvolution.html

I admire Rich's patience: after all these years of DTD experience and
XML Schema development, the best that WXS offers is no
improvement over sticking empty parameter entities in content models
of DTDs, for some common cases?

Actually, XML Schemas is a slight step backwards from DTDs in *some*
kinds of versionability, because it does not allow INCLUDE/IGNORE
marked sections. Schematron's <phase> mechanism, Bob DuCharme's "stages"
and EvdV's xvif are all attempts to provide higher-level versions
of INCLUDE/IGNORE.  Dave is (as usual) acute to point out
the differences between versioning and extensibility, but there
are other kinds of versioning too: staged, progressive validation
of documents-in-progress in particular.

The usefulness of open content models is old hat (Roger Costello's
hat, for one). I don't want to appear critical without having any
alternative to offer. Drum roll. Here is one way to do it in Schematron,
treating it as a problem of separating constraints into those of
genus (open, extensible) and those of species (closed).

First, here is the basic schema:

<schema xmlns="http://www.ascc.net/xml/schematron">
    <title>Basic schema</title>
    <p>This schema validates the basic documents as
   in Dave Orchard's useful "Providing Compatible Schema Evolution"
   http://www.pacificspirit.com/Authoring/Compatibility/ProvidingCompatibleSchemaEvolution.html
    </p>

<phase id="Basic">
    <active pattern="BasicGenus" />
    <active pattern="BasicSpecies" />
</phase>

<phase id="BasicOpen">
    <active pattern="BasicGenus"/>
</phase>

<pattern id="BasicGenus">
   <rule context="name">
       <assert test="count(first)=1">A name should have a single first name</assert>
       <assert test="count(last)=1">A name should have a single last name</assert>
       <assert test="first/following-sibling::last">The first name should come before the
            last name</assert>
   </rule>
</pattern>

<pattern id="BasicSpecies">
    <rule context="name">
       <assert test="count(*)=2">A name can only have a first and last element</assert>
    </rule>
</pattern>
</schema>

In the phase "Basic" this schema is closed; in the phase "BasicOpen"
this schema is open: you can extend it by validating with another schema
in parallel, or by adding new patterns:

<schema xmlns="http://www.ascc.net/xml/schematron">
    <title>Extended schema</title>
    <p>This schema validates the extended documents as
   in Dave Orchard's useful "Providing Compatible Schema Evolution"
   http://www.pacificspirit.com/Authoring/Compatibility/ProvidingCompatibleSchemaEvolution.html
    </p>

<phase id="Basic">
    <active pattern="BasicGenus" />
    <active pattern="BasicSpecies" />
</phase>

<phase id="Extended">
    <active pattern="BasicGenus" />
    <active pattern="ExtendedGenus" />
    <active pattern="ExtendedSpecies" />
</phase>

<phase id="BasicOpen">
    <active pattern="BasicGenus"/>
</phase>

<phase id="ExtendedOpen">
    <active pattern="BasicGenus"/>
    <active pattern="ExtendedGenus" />
</phase>

<pattern id="BasicGenus">
   <rule context="name">
       <assert test="count(first)=1">A name should have a single first name</assert>
       <assert test="count(last)=1">A name should have a single last name</assert>
       <assert test="first/following-sibling::last">The first name should come before the
            last name</assert>
   </rule>
</pattern>

<pattern id="BasicSpecies">
    <rule context="name">
       <assert test="count(*)=2">A name can only have a first and last element</assert>
    </rule>
</pattern>

<pattern id="ExtendedGenus">
    <rule context="name">
       <assert test="count(middle) &lt;= 1">A name may have a single middle name</assert>
       <assert test="not(middle) or first/following-sibling::middle"
            >The first name should come before the middle name</assert>
       <assert test="not(middle) or middle/following-sibling::last"
            >The middle name should come before the last name</assert>
    </rule>
</pattern>

<pattern id="ExtendedSpecies">
    <rule context="name">
       <assert test="count(*)=count(first) + count(middle) + count(last)"
            >A name can only have a first, middle and last element</assert>
    </rule>
</pattern>

</schema>

So in one schema we have (by invoking the schema in a different phase):
  * a closed basic schema
  * an extensible (open content model) basic schema
  * a closed new version of the schema with extra content in the middle
  * an extensible (open content model) extended schema with extra
    content in the middle

Versioning + extensibility.
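
To make the four phases concrete, here is a rough Python emulation of the
patterns above. It is only a sketch, not a Schematron implementation: the
function names, the PHASES table and valid() are hypothetical, and each
assertion is hand-translated from the XPath tests in the extended schema.

```python
import xml.etree.ElementTree as ET

def child_tags(name):
    """Element names of the children of a <name> element, in document order."""
    return [child.tag for child in name]

def follows(tags, a, b):
    # Emulates the XPath test a/following-sibling::b on the child list.
    return a in tags and b in tags[tags.index(a) + 1:]

def basic_genus(name):
    tags = child_tags(name)
    return (tags.count("first") == 1 and tags.count("last") == 1
            and follows(tags, "first", "last"))

def basic_species(name):
    # count(*)=2: nothing but the two required children.
    return len(child_tags(name)) == 2

def extended_genus(name):
    tags = child_tags(name)
    if "middle" not in tags:
        return True
    return (tags.count("middle") == 1
            and follows(tags, "first", "middle")
            and follows(tags, "middle", "last"))

def extended_species(name):
    # count(*) = count(first) + count(middle) + count(last)
    return all(tag in ("first", "middle", "last") for tag in child_tags(name))

# Each phase activates a set of patterns, as in the <phase> elements.
PHASES = {
    "Basic": [basic_genus, basic_species],
    "BasicOpen": [basic_genus],
    "Extended": [basic_genus, extended_genus, extended_species],
    "ExtendedOpen": [basic_genus, extended_genus],
}

def valid(xml_text, phase):
    """True if every <name> in the document passes every active pattern."""
    root = ET.fromstring(xml_text)
    return all(check(name)
               for name in root.iter("name")
               for check in PHASES[phase])

if __name__ == "__main__":
    doc = "<name><first>A</first><middle>B</middle><last>C</last></name>"
    for phase in PHASES:
        print(phase, valid(doc, phase))
```

Under this emulation a document with a <middle> fails only the closed
"Basic" phase, and a document with some unknown extra element passes
only the two open phases.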

The trouble with Schematron (to some people) is that a schema that
(they expect) should be simple can in fact be composed of many
constraints: the cost is (to them) verbosity. But the above example
shows the other side of the coin: grammars can conflate several
different constraints (position and occurrence), which prevents
extension: grammars sometimes overconstrain you. The grammar constraint
that a <name> contains first a <first> and then a <last>, and nothing
else, is such an overconstraint: all that is needed is that a <name>
contains a <first> and a <last> in that order. Schematron can help
extensibility by letting you tease out the different constraints; for
example, in the above schemas we separate out the content constraints
we want on all documents (genus?) from the particular constraints we
want on particular closed schemas (species?).
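
The overconstraint can be seen in a small Python sketch. It assumes the
content model is treated as a regular expression over the child element
names (a common way to think of grammars, not any particular validator's
API); grammar_ok, occurrence_ok and order_ok are hypothetical names.

```python
import re

# Closed grammar: exactly a <first> followed by a <last>, nothing else.
# Occurrence, position and closure are fused into one expression.
CLOSED_GRAMMAR = re.compile(r"first last")

def grammar_ok(tags):
    return CLOSED_GRAMMAR.fullmatch(" ".join(tags)) is not None

# The same requirements teased apart into independent constraints.
def occurrence_ok(tags):
    return tags.count("first") == 1 and tags.count("last") == 1

def order_ok(tags):
    return "first" in tags and "last" in tags[tags.index("first") + 1:]

if __name__ == "__main__":
    extended = ["first", "middle", "last"]
    print(grammar_ok(extended), occurrence_ok(extended), order_ok(extended))
```

The extended child sequence breaks the grammar, yet it satisfies both of
the constraints that actually matter: the grammar rejects it only because
it also insists that nothing else may appear.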

One kind of versioning this does not cope with is where you want your
new schema to have a new namespace, but still to validate documents in
the old namespace.  I don't think namespaces can thrive without it. One
way would be for all languages that recognize namespaces (XSLT, schemas,
etc.) to allow a simple * wildcard in namespace names, which could be
used for version numbers.  Rather than needing complex schema mechanisms
to cope with versioning, part of the cost should be borne by processing
systems: that would allow you to have some kind of minor/major numbering
of your schemas in the namespace.
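
As a sketch of what such a wildcard might look like on the processing
side, the following Python matches a document's namespace against a
pattern with a * in the minor version position. Everything here is an
assumption for illustration: the example.org namespace, the "1.*"
numbering convention and the accepts() helper are hypothetical, and no
existing XSLT or schema processor is claimed to work this way.

```python
from fnmatch import fnmatchcase
import xml.etree.ElementTree as ET

# Hypothetical namespace pattern: major version 1, any minor version.
ACCEPTED = "http://example.org/ns/name/1.*"

def namespace_of(element):
    """Namespace URI of an element; ElementTree stores tags as {uri}local."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def accepts(xml_text, pattern=ACCEPTED):
    """True if the root element's namespace matches the wildcard pattern."""
    root = ET.fromstring(xml_text)
    return fnmatchcase(namespace_of(root), pattern)

if __name__ == "__main__":
    for version in ("1.0", "1.3", "2.0"):
        doc = '<name xmlns="http://example.org/ns/name/%s"/>' % version
        print(version, accepts(doc))
```

The application binds to the major version once, and minor revisions of
the schema can change the namespace without breaking it.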

Cheers
Rick Jelliffe

Copyright 2001 XML.org. This site is hosted by OASIS