Re: [xml-dev] Caution using XML Schema backward- or forward-compatibility as a versioning strategy for data exchange
- From: "Fraser Goffin" <goffinf@googlemail.com>
- To: "Michael Kay" <mike@saxonica.com>
- Date: Thu, 27 Dec 2007 13:32:07 +0000
Some general thoughts :-
1. XML Schema, RELAX NG, or any other vocabulary for describing data do
not provide any mechanism for preventing people from abusing those
structures for purposes that differ from their original intent, in the
(imo) misguided belief that doing so will provide a less intrusive
opportunity to accommodate change (particularly in the case of the
semantic drift that Michael refers to above). More often than not this
leads to complexity and assumed knowledge which, taking Roger's scenario
of a business service with 'unknown' consumers, will typically just
store up a more difficult and possibly intractable problem for later
on. Once users of the service have that 'bad' design baked into their
applications (applications that you as the service provider have no
control over), they are not going to be motivated to make any changes
whilst those applications are still working. The only solution is then
to introduce a breaking change and, as a provider, support multiple
versions.
2. Using information items for purposes other than their original
intent is usually a very bad idea. It just provides a short-term fix
and long-term pain (for providers and consumers alike - vocabulary
owner and vocabulary user).
3. Semantic changes are, for all practical purposes, ALWAYS breaking
changes (notwithstanding that you might think you can get away without
one because the syntax doesn't change - in some ways this very lack of
visibility is the problem). Clearly this is even more the case when
talking about business services offered to 'unknown' consumers (or at
least consumers with applications whose development resource is not
under the control or influence of the service provider). Therefore
semantic changes are ALWAYS INcompatible, so
forwards/backwards/whatever compatibility doesn't apply.
4. A minor (excuse the pun) point. Many people use a pattern of
<Major>.<Minor> for indicating the version of something (schema,
service, whatever). An increment of the MINOR value can be used to
identify a [possibly] compatible change, depending on your
[advertised] versioning strategy, whereas a change to the MAJOR
version number is an EXPLICIT SIGNAL of an INcompatible (breaking)
change.
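As a rough illustration of point 4, here is a minimal Python sketch of
how a consumer might read a <Major>.<Minor> version. The names
(Version, parse_version, is_compatible, accept_newer_minor) are
hypothetical, and the rule that a newer MINOR is acceptable is only an
assumption standing in for whatever strategy the provider actually
advertises.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int


def parse_version(text: str) -> Version:
    """Parse a '<Major>.<Minor>' string such as '1.3'."""
    major, minor = text.strip().split(".")
    return Version(int(major), int(minor))


def is_compatible(consumer_built_for: str, message_version: str,
                  accept_newer_minor: bool = True) -> bool:
    """Decide whether a consumer should attempt to process a message.

    accept_newer_minor models the provider's advertised strategy
    (an assumption here): a MINOR increment is only *possibly* compatible.
    """
    built = parse_version(consumer_built_for)
    msg = parse_version(message_version)
    # A different MAJOR number is an explicit signal of a breaking change.
    if built.major != msg.major:
        return False
    # Same MAJOR and a MINOR the consumer already knows about.
    if msg.minor <= built.minor:
        return True
    # Newer MINOR: acceptable only if the advertised strategy says so.
    return accept_newer_minor
```

Note that a check like this only reflects what the version number
claims. It cannot detect a semantic change that was made without
changing the number, which is the problem described in point 3.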
5. Schema validation is typically NOT the same as validation in the
context of a business process (i.e. data can be valid according to a
data definition, but invalid when processing, since other contextual
rules apply). We may often use schema validation as an optimisation to
provide more immediate feedback to a caller about the possibility that
their request may be fulfilled (or not), but not IMO as a way of
guaranteeing that any upstream business process will be successful,
and definitely not as a way of protecting the integrity of a core
business application (now that *would* be a dangerous approach).
We also appreciate that most vocabularies have limitations in terms of
the data and business rules that they can express, so naturally we
often apply multiple techniques and apply them at different stages of
the processing chain (e.g. a bit of structural and data content
checking early on to weed out obviously invalid messages, up to state
machines and complex business rules deeper into the process). What is
more, the outcomes of validation failures often differ depending on
the stage at which they are detected. Of course we are loath to spread
our business data validation rules across the potentially many layers of
our application infrastructure, but we may choose to do this in a
controlled way to provide process optimisation, so even using a
solution such as DataPower might only extend to a certain category of
validation checking.
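To make the staging in point 5 concrete, here is a small Python sketch
under stated assumptions. The message shape (<payment>, <accountId>,
<amount>), the balances lookup and all function names are invented for
illustration, and the early stage is a simple structural check with
ElementTree rather than real schema validation.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Outcome:
    accepted: bool
    stage: str        # stage that produced the final decision
    reason: str = ""


REQUIRED = ("accountId", "amount")  # hypothetical required child elements


def structural_check(xml_text):
    """Early, cheap check: well-formed, required elements, numeric amount."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return None, f"not well-formed: {exc}"
    for name in REQUIRED:
        if root.find(name) is None:
            return None, f"missing <{name}>"
    try:
        amount = float(root.findtext("amount"))
    except ValueError:
        return None, "amount is not a number"
    return {"account": root.findtext("accountId"), "amount": amount}, ""


def business_check(request, balances):
    """Later, contextual check: depends on state the schema cannot see."""
    if request["account"] not in balances:
        return "unknown account"
    if request["amount"] > balances[request["account"]]:
        return "insufficient funds"
    return ""


def process(xml_text, balances):
    request, reason = structural_check(xml_text)
    if request is None:
        # Rejected at the edge: immediate feedback to the caller.
        return Outcome(False, "structural", reason)
    reason = business_check(request, balances)
    if reason:
        # Structurally valid, but invalid in the business context.
        return Outcome(False, "business", reason)
    return Outcome(True, "business")
```

A message with an amount of 500 against a balance of 100 passes the
structural stage and is only rejected at the business stage, which is
the sense in which data can be valid according to a data definition
but invalid when processing.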
6. There are occasions where we would either not locate ALL validation
processing inside the receiving application logic or where we would
choose to duplicate parts of it further out. In those cases we
wouldn't remove it after testing (let's face it, testing is often a
problematic area for complex integration scenarios and often is not as
comprehensive as it might be). I find that this is particularly the
case for EXTERNAL integration (i.e. you only control one side of the
exchange) and where the business service (external interface) needs an
SLA of 24/7 availability even when the actual processing application
does not have one.
Fraser