For a while I have been continuing a thread which started out thinking about
versioning of XML schema types, in particular enums. The debate broadened
and a variety of helpful and interesting views were voiced about versioning
in general and as a related subject extensibility. Personally I have been
relating these comments to XML schema structures but I could have easily
been talking about the service interface supported by those schema. This has
highlighted some different opinions about the value of various approaches to
this problem which I hope have resonated with those following the thread.
I have become quite interested in the UBL work that Ken Holman has
introduced and the position UBL is taking on separating the validation of
structural conformance from value-based validation.
I guess the thing that I am still mostly undecided about is whether to
allow for schema extensibility (using xs:any together with the 'sentry'
approach proposed by David Orchard and others) or whether this is a recipe
for an uncontrollable vocabulary.
I think the battle-ground is in part characterised by a schema (or service)
that, once published, is considered immutable, hence any changes REQUIRE a
NEW VERSION with a NEW NAMESPACE, versus a schema which allows non-breaking
changes to be introduced by both the schema owner and non schema authors and
supports both forwards and backwards compatibility.
The first situation is a 'clean' and explicit model where the semantics are
guaranteed not to be usurped by a non schema owner but where even relatively
minor change requirements can have a large impact on implementations
(especially when there are a large number of external users of this
vocabulary). Changes often take a relatively long while to surface through
into the standard and this may impact business priorities. Versioning is
enabled through support for one or more of the available schema where, from
time to time, old versions may be deprecated.
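As a rough illustration of the 'new version, new namespace' model, a provider could run concurrent versions side by side by dispatching on the namespace of the root element. This is only a sketch: the namespace URIs, element names and handler functions are made up for the example and do not come from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: one per published, immutable schema version.
NS_V1 = "urn:example:order:v1"
NS_V2 = "urn:example:order:v2"


def handle_v1(root):
    # Version 1 only knows about <id>.
    return {"version": 1, "id": root.findtext(f"{{{NS_V1}}}id")}


def handle_v2(root):
    # Version 2 adds <priority>; it lives in its own namespace.
    return {
        "version": 2,
        "id": root.findtext(f"{{{NS_V2}}}id"),
        "priority": root.findtext(f"{{{NS_V2}}}priority"),
    }


# Deprecating a version means removing its entry from this table.
HANDLERS = {NS_V1: handle_v1, NS_V2: handle_v2}


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def process(xml_text):
    """Route an instance document to the handler for its schema version."""
    root = ET.fromstring(xml_text)
    ns = namespace_of(root)
    handler = HANDLERS.get(ns)
    if handler is None:
        raise ValueError(f"unsupported schema version: {ns!r}")
    return handler(root)
```

A consumer still sending version 1 documents keeps working for as long as the provider leaves that handler in place, which is the explicit guarantee this model offers.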
The schema extensibility approach promotes the idea that organisations may
want to represent private relationships using data carried at specified
points within the standard schema in such a way that that data is only
relevant between those parties (using a foreign namespace) and all others
can safely ignore it (and that the schema author should not necessarily
attempt to constrain this type of usage). It recognises that the pace of
change to a standard schema often lags behind the operational requirements
of user organisations, but those organisations don't want to throw out the
whole standard and 'go private'. It can imply that some trading partner (TP)
extensions may be incorporated back into the main body of the standard at a
later point, in which case any pair of parties using that extension can agree
a move back to the standard definition at a time of their choosing. It also
allows the schema owner to add non-breaking 'compatible' changes to a schema.
The downsides seem to be that a TP could introduce changes which subvert the
intended semantics, and that, over time, what might have started out as a
temporary expedient, turns into an entrenched working implementation that is
unlikely to be allocated budget to be re-synchronised with the standard.
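To make the consumer side of this concrete, here is a minimal sketch of a receiver that processes what it understands and safely ignores data in a foreign namespace. It shows only the runtime behaviour, not the xs:any declaration in the schema itself, and the namespaces, element names and sample document are all invented for the example.

```python
import xml.etree.ElementTree as ET

BASE_NS = "urn:example:invoice"        # hypothetical standard namespace
PARTNER_NS = "urn:example:partner-x"   # hypothetical private namespace

DOC = """\
<inv:invoice xmlns:inv="urn:example:invoice" xmlns:px="urn:example:partner-x">
  <inv:number>INV-1</inv:number>
  <inv:total>100.00</inv:total>
  <px:loyaltyCode>GOLD</px:loyaltyCode>
</inv:invoice>"""


def split_tag(tag):
    """Return (namespace, local name) for an ElementTree tag."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


def read_invoice(xml_text, understood=(BASE_NS,)):
    """Collect child elements from understood namespaces; skip the rest."""
    root = ET.fromstring(xml_text)
    known, ignored = {}, []
    for child in root:
        ns, local = split_tag(child.tag)
        if ns in understood:
            known[local] = child.text
        else:
            # Foreign-namespace extension: ignore it rather than fail.
            ignored.append(child.tag)
    return known, ignored
```

A party that has agreed the extension out of band would call `read_invoice(DOC, understood=(BASE_NS, PARTNER_NS))` and see the extra field; everyone else gets the standard fields and carries on unaffected.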
So, in part the question is, should a schema allow for unknown extensions
for unknown purposes (but in specified locations) and still be considered as
'compliant', or should schema authors attempt to constrain (eliminate) that
behaviour? I can't help feeling the attraction of the second model, but my
'gut' tells me that something so inflexible will soon become a business
constraint and that will signal its demise.
With my SOA hat on I would recognise the importance of interoperability and
the significant role that standardised vocabularies have to play. I also
don't especially want to promote the myriad of point-to-point relationships
that 'going private' implies and instead want to leverage the 'reach' of a
market standard.
Personally I still have no definitive conclusion that I feel comfortable in
turning into a recommended approach within my own organisation and within
the industry standards body that I work with from time to time, so I thought
I'd give it one more go.
Some of the issues and comments highlighted by the earlier thread are
provided below. Some are direct quotes from contributors, others are excerpts
from various ramblings :-)
Cheers
Fraser
========================
- extensibility is a critical aspect of any data [or service] model. Without
extensibility all changes (however minor) effectively 'break' all provider
and consumer implementations.
- there are no 'minor' changes, any change implies a semantic difference.
- backwards compatible yes (the previous version of a schema must be a valid
instance of the new version), but not necessarily the other way around
- xs:any together with the 'sentry' approach proposed by David Orchard (and
others) provides a mechanism that allows XML schema to be extended by both
the schema namespace owner and a non schema author independently, in a
manner which supports forwards and backwards compatibility for instance
documents. That is, some categories of change can be accommodated which do NOT
cause either the consumer or provider implementation to REQUIRE change. Of
course extensions added by non schema owners represent a private
relationship between the communicating parties and therefore require an out
of band exchange of the type definitions and semantics. Also such extensions
can only be applied to specific locations in the base schema AND using a
foreign namespace. This is sometimes referred to as the 'must ignore'
pattern.
- A 'big bang' approach to versioning is not usually achievable in any
practical sense. That is, it is generally not possible to enforce a
'breaking' change on all users of a schema/service simultaneously (or even
within a constrained time window).
- Support for a version of a schema/service can in some cases be self
regulating. That is, if provider A only supports version 1.0 of a service
whilst the majority of consumers expect to be able to integrate with version
1.1 (or 2.0), then chances are that provider A will be unable to win any
business and will therefore be forced to upgrade. If a consumer supports
version 1.0 but all potential [preferred] providers have upgraded to a later
version, the consumer may not be able to place any business on behalf of its
customers, and will therefore be forced to upgrade (assuming that version
1.0 and later versions are NOT backwards compatible).
- a schema or service interface is immutable. Once published it should never
be changed (perhaps this is better stated as the operations which make up
the service interface should never be changed).
- support for concurrent versions of a schema/service is a more effective
method of dealing with change than through schema extensibility. It makes
versions explicitly typed without the ambiguity of untyped sections (xs:any)
which require some out of band mechanism to be entered into by each
participant. Implementing an explicit new version has the crucial advantage
that it is guaranteed NOT to break a consumer implementation using the
current version unless the provider removes that version.
- Any change to a schema represents a semantic difference and therefore
cannot be considered as 'minor' and therefore requires a new version.
- We have come to the conclusion that semantically the definition of an
enumerated field is its enumerations. Therefore changing the enumerations
changes the definition. Adding enumerations locally seems like a poor
practice.
- Adding a new value to an enumeration is not a compatible change if that
value could be returned to a consumer who currently doesn't know about it
(using the previous schema definition). If it's just on the receiving side,
it MAY be compatible since the previous version remains a valid sub-set.
- schemas defined and managed by a standards body often move too slowly to
accommodate the business priorities of participants. Allowing local extensions
can enable an organisation to gain advantage from the broader 'reach' of the
base standard to the majority of its partners whilst supporting specific
third party relationships which require additional [private] data not
[currently] available within the base standard. Sometimes this additional
data can represent a 'candidate' standard which may be incorporated at a
future time.
- When standards become an inhibitor to business operations they will be
usurped by local arrangements.
- Value based validation can be implemented as a separate layer, on top of
structural conformance.
- Synchronisation of schema variants is necessary at the point when the
number of variants indicates that the original semantics may have become
obfuscated or a new semantic ecosystem [related] is emerging.
- If a large number (more than 1 :-) of business transactional schema
include a common complex type, and that complex type needs to be changed,
this can create a synchronisation problem. So is there a different approach
to dealing with versioning of shared types?
- We are undertaking a new position where the schema are going to be used
solely for structural validation, and code list value validation (as agreed
upon by trading partners) is a separate step.
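
The last point, keeping structural validation and code list validation as separate steps, might be sketched roughly as follows. The structural check here is a simple stand-in for real schema validation, and the element names, partner names and code lists are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural rule: these child elements must be present.
REQUIRED = ("id", "currency")

# Hypothetical code lists, agreed out of band with each trading partner.
CODE_LISTS = {
    "partner-a": {"currency": {"GBP", "EUR"}},
    "partner-b": {"currency": {"GBP", "EUR", "USD"}},
}


def check_structure(root):
    """Layer 1: stand-in for schema validation; values are not inspected."""
    return [f"missing element: {name}"
            for name in REQUIRED if root.find(name) is None]


def check_values(root, partner):
    """Layer 2: check coded values against the partner's agreed lists."""
    errors = []
    for name, allowed in CODE_LISTS[partner].items():
        value = root.findtext(name)
        if value not in allowed:
            errors.append(f"{name}={value!r} not in code list for {partner}")
    return errors


def validate(xml_text, partner):
    """Run the structural layer first, then the value layer."""
    root = ET.fromstring(xml_text)
    errors = check_structure(root)
    if errors:
        return errors
    return check_values(root, partner)
```

With this split, adding a value to a code list changes only the second layer for the partners who agree to it, and the structural schema stays untouched.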