The following assumes that one wants as consistent an approach as possible across the two cases of XML schema change cited by Curt Arnold:
Case 1. The new schema changes the interpretation of some element.
For example, a construct that was valid and meaningful for the previous
schema does not validate against the new schema.
Case 2. The new schema extends the namespace by adding new elements,
etc., but does not invalidate previously valid documents.
Versioning Approaches:
1. Changing the (internal) schema version attribute.
Pros:
- Easy. Part of schema specification.
- Useful when all documents from the old version of the schema are
valid with the new schema (case 2 above).
- Applications are robust with this approach. An application
could reject the new version of the schema (if appropriate).
- Could do a pre-parse and choose the schema based on the version number.
Tony Coates suggests one way to do this is to have an RDDL document at the
namespace URI, and in that document list the URIs for each schema version.
Cons:
- Ignored by validator.
- This alone is impractical since two files with the same name can’t
be in the same location. Therefore, one would also need to change
the schema filename and the schemaLocation attribute in the instance document.
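To make the "pre-parse" idea concrete, here is a minimal sketch in Python of reading the version attribute from a schema document. The schema text, its target namespace and the "1.3" value are made up for illustration; only the version attribute on xsd:schema comes from the schema specification.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema text; the target namespace and version value are
# illustrative assumptions, not taken from any real schema.
SCHEMA_TEXT = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/foo"
            version="1.3">
  <xsd:element name="foo" type="xsd:string"/>
</xsd:schema>"""


def schema_version(schema_text):
    """Return the version attribute of the xsd:schema element, or None if absent."""
    root = ET.fromstring(schema_text)
    if root.tag != "{%s}schema" % XSD_NS:
        raise ValueError("not an XML Schema document")
    return root.get("version")


print(schema_version(SCHEMA_TEXT))  # prints 1.3
```

Note that this check lives entirely in the application. A validator does not look at the attribute, which is the first con listed above.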
1.5. Put a schemaVersion attribute on the element that introduces the namespace.
Curt Arnold provided the following example. If a document is valid per version 1.0 and later of http://www.example.org/foo, it could indicate this by:
<foo:foo foo:version="1.0" xmlns:foo="http://www.example.org/foo"/>
If a document relied on elements that were defined in later versions, it could indicate it by:
<foo:foo foo:version="1.5" xmlns:foo="http://www.example.org/foo"/>
(acceptable values of foo:version in the v1.5 schema would be "1.5", "1.4", "1.3", etc.)
Existing processors that saw this document could validate against the foo v1 schema and reject the document, validate against a lax v1-compatible schema (one with a decent number of <xsd:any> wildcards), or skip validation.
This approach does require the schema resolution mechanism to have access to the attributes on the namespace-introducing element.
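As a rough sketch of what such a resolution mechanism might do, the Python below reads foo:version from the root element and picks a schema file. The SCHEMAS table, the file names and the "oldest schema that covers the declared version" rule are my assumptions for illustration; Curt's example does not prescribe any of them.

```python
import xml.etree.ElementTree as ET

FOO_NS = "http://www.example.org/foo"

# Hypothetical table of schema versions this processor has on hand.
SCHEMAS = {(1, 0): "foo-1.0.xsd", (1, 5): "foo-1.5.xsd"}


def declared_version(document_text):
    """Return foo:version from the root element as a tuple of ints, or None."""
    root = ET.fromstring(document_text)
    raw = root.get("{%s}version" % FOO_NS)
    if raw is None:
        return None
    return tuple(int(part) for part in raw.split("."))


def choose_schema(document_text):
    """Pick the oldest known schema at or above the declared version.

    Returns None when the document declares no version or a version newer
    than any known schema; the caller then decides whether to reject,
    validate laxly, or skip validation.
    """
    version = declared_version(document_text)
    if version is None:
        return None
    candidates = [known for known in SCHEMAS if known >= version]
    if not candidates:
        return None
    return SCHEMAS[min(candidates)]


doc_v10 = '<foo:foo foo:version="1.0" xmlns:foo="http://www.example.org/foo"/>'
doc_v15 = '<foo:foo foo:version="1.5" xmlns:foo="http://www.example.org/foo"/>'
print(choose_schema(doc_v10))  # prints foo-1.0.xsd
print(choose_schema(doc_v15))  # prints foo-1.5.xsd
```

The selection rule relies on case 2 holding within the namespace, that is, on every later schema accepting documents that were valid under earlier ones.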
Pros:
- Useful when all documents from the old version of the schema are
valid with the new schema (case 2 above).
- Applications are robust with this approach. An application
could reject the new version of the schema (if appropriate).
- Allows concurrent use of different versions of a schema.
Cons:
2. Changing the schema's targetNamespace.
Pros:
- Applications are robust with this approach. An application
would not recognize the new namespace.
- Good that instance documents and schemas that include the relevant
schema must change to reference the new version because one would want
to assure that there are no compatibility problems.
- Allows concurrent use of different versions of a schema.
Cons:
- With this approach, instance documents will not validate until they
are changed to designate the new targetNamespace. (Some say this
is a pro.) However, one does not want to force all instance documents
to change, even if the change to the schema is really minor and would not
impact an instance.
- Any schemas that ‘include’ this schema would have to change because
the target namespace of the included components must be the same as the
target namespace of the including schema. (Again some would say this
is an advantage.)
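To illustrate the robustness point for option 2, here is a small Python sketch of an application that dispatches on the root element's namespace and refuses anything it does not know. The two namespace URIs and the handler names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, one per incompatible schema version,
# mapped to the name of the code path that understands that version.
HANDLERS = {
    "http://www.example.org/foo/v1": "handle_v1",
    "http://www.example.org/foo/v2": "handle_v2",
}


def root_namespace(document_text):
    """Return the namespace URI of the root element, or None if unqualified."""
    root = ET.fromstring(document_text)
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


def dispatch(document_text):
    """Return the handler name for the document, rejecting unknown namespaces."""
    namespace = root_namespace(document_text)
    if namespace not in HANDLERS:
        raise ValueError("unrecognized namespace: %r" % namespace)
    return HANDLERS[namespace]


print(dispatch('<f:foo xmlns:f="http://www.example.org/foo/v2"/>'))  # prints handle_v2
```

A document in a namespace the application has never seen fails at the dispatch step instead of being misinterpreted, which is the behavior the first pro describes.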
3/4. Changing the name/location of the schema.
Pros:
Cons:
- As with option 2, one disadvantage of this approach is that it forces
all instance documents to change, even if the change to the schema would
not impact that instance.
- Any schemas that import the modified schema would have to change
since the import statement provides the name and location of the imported
schema.
- Applications are not robust under this approach since the application
receives no hint that the meaning of various element/attribute names has
changed.
- The schemaLocation attribute in the instance document is optional
and is not authoritative even if it is present. It is a hint
to help the processor to locate the schema. Therefore, relying on
this attribute is not a good practice (with the current reading of the
specification).
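For reference, the sketch below (Python, with made-up URIs and file names) shows what the xsi:schemaLocation hint actually carries: a whitespace-separated list of namespace/location pairs. The application here reads the hint but resolves the schema from its own table, keyed by namespace. The TRUSTED_SCHEMAS table is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical local table: the application decides which schema file
# goes with which namespace, regardless of what the instance suggests.
TRUSTED_SCHEMAS = {"http://www.example.org/foo": "schemas/foo-1.5.xsd"}

# Hypothetical instance document carrying a schemaLocation hint.
DOCUMENT = """<foo:foo xmlns:foo="http://www.example.org/foo"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.example.org/foo http://www.example.org/foo-v2.xsd"/>"""


def schema_location_hints(document_text):
    """Return the root element's xsi:schemaLocation pairs as {namespace: location}."""
    root = ET.fromstring(document_text)
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))


def resolve_schema(namespace):
    """Resolve a schema from the trusted table only; the hint is not consulted."""
    return TRUSTED_SCHEMAS.get(namespace)


hints = schema_location_hints(DOCUMENT)
print(hints["http://www.example.org/foo"])           # prints http://www.example.org/foo-v2.xsd
print(resolve_schema("http://www.example.org/foo"))  # prints schemas/foo-1.5.xsd
```

Because the processor is free to ignore the hint like this, a version encoded only in the schema's name or location never reaches the application reliably.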
XML SCHEMA VERSIONING BEST PRACTICES:
1. Make previous versions of an XML schema available
This allows applications to keep using previous versions. It also allows users to migrate to a new version of the schema once compatibility is assured.
2. When an XML schema is only extended (e.g., new elements, attributes, extensions to an enumerated list, etc.), one should strive not to break the receiving application where practical.
For example, if one is adding new elements or attributes, one could consider making them optional where this makes sense.
In this case, a change to the version number (option 1) or to the schemaVersion attribute (option 1.5), with a recorded change history, should suffice. One may want to change the schema file name as well.
3. Where the new schema changes the interpretation of some element (e.g., a construct that was valid and meaningful for the previous schema does not validate against the new schema), one should change the target namespace.
In the schema:
a. Change the target namespace
b. Designate the new schema version (either via option 1 or 1.5)
c. Change schema filename?
Comments?
Any opinions on option 1 vs 1.5?
Thanks,
Mary
Mary Pulvermacher wrote:
Hello everyone,
Roger Costello has asked me to initiate this Best Practice topic. The results of this discussion will be posted, along with the other Best Practices, on the Best Practice Homepage (http://www.xfront.com/BestPracticesHomepage.html).
Topic: What is the Best Practice for versioning XML schemas?
Is it better to version a schema by:
1. Changing the (internal) schema version attribute,...
2. Changing the schema's targetNamespace,
3. Changing the name of the schema, or
4. Changing the location of the schema?