   Re: [xml-dev] Validation - Is it worth it ?



we do have an obligation to provide an implementation of the industry 
standard contract. Actually we could define our own, but the majority of 
traffic will be routed to us via an industry portal to which the vast 
majority of service consumers will connect (and receive their copy of WSDL, 
schema etc.). So, we could go private, but that would directly affect the 
potential 'reach' of our services when everyone else is using a standard. I 
also do not favour a separately negotiated point-to-point bespoke interface 
for every caller for obvious reasons. So, standards make sense and are 
[somewhat] useful (I guess at this point I could stop and agree with your 
entire argument that if everyone agrees to abide by the standard then 
validating against it seems perfectly reasonable !).

As for obligations going both ways, I also tend to agree, but at one level I'm 
trying to avoid the situation of 'it's my service contract, so I define 
the rules; either play by them or I'll reject you' :-). Anyway, I digress.

So, the service is advertised as billed - the industry standard. But 
standards developed by a bunch of 'competitors' tend to lead to definitions 
which attempt to appease everyone, and thus both to a high degree of 
'optionality' and to some things that you get 'arm-twisted' into swallowing. In 
these areas I have two opposing problems. First, something defined in the 
standard as optional may be absolutely mandatory for my specific business 
process, so I can't rely on schema validation for that (we typically use 
Schematron or leave it to the business process to reject with a business 
response - depends on what it is). Second, 'stuff' may appear in the message 
which we don't actually care about, but which in terms of XML schema could be 
invalid. Rejecting that message could result in us turning away business 
that we would have been happy to write.
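
A minimal sketch of those two problems, assuming Python's standard 
xml.etree.ElementTree in place of a Schematron engine. The element names, 
the REQUIRED_PATHS list and the KNOWN_TOP_LEVEL set are hypothetical and 
only there for illustration: values the standard leaves optional are 
checked as mandatory, while unrecognised content is noted and ignored 
rather than rejected.

```python
import xml.etree.ElementTree as ET

# Hypothetical business rules: paths the standard marks optional
# but which our own process cannot do without.
REQUIRED_PATHS = ["Customer/PostCode", "Policy/StartDate"]

# Hypothetical set of top-level elements we actually consume;
# anything else is ignored rather than treated as an error.
KNOWN_TOP_LEVEL = {"Customer", "Policy"}


def check_message(xml_text):
    """Return (problems, ignored) for one inbound message."""
    root = ET.fromstring(xml_text)  # ill-formed XML still raises ParseError
    problems = []
    for path in REQUIRED_PATHS:
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            problems.append("missing required value: " + path)
    # Content we do not care about is reported, not rejected.
    ignored = [child.tag for child in root if child.tag not in KNOWN_TOP_LEVEL]
    return problems, ignored


sample = """<Quote>
  <Customer><Name>A N Other</Name></Customer>
  <Policy><StartDate>2006-03-01</StartDate></Policy>
  <Marketing><Campaign>spring</Campaign></Marketing>
</Quote>"""

problems, ignored = check_message(sample)
print(problems)  # the missing PostCode is a business-level problem
print(ignored)   # the Marketing element is tolerated
```

Whether a non-empty problems list becomes a technical rejection or a 
business response is left to the caller, which matches the 'depends on 
what it is' point above.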

Your last two points I entirely agree with. It is feasible for us to operate 
our own set of modified schema and additional rules, but I am worried about 
the amount of effort required to maintain these over time in the face of 
change (they are definitely non-trivial). We do make attempts to 
maintain a delta, but it is hard work and hard to get funding for. I don't 
want to operate as a standards body (we contribute our annual fee to our 
particular sector's body for that). As for clients, I prefer to not make any 
assumptions at all about their capabilities.

All of this is exacerbated by a relatively weak versioning model for XML 
schema. We are giving this a lot of thought at the moment (all suggestions 
welcome :-), in combination with trying to protect our internal systems from 
the impacts of changes in the external standards (e.g. an internally managed 
canonical data model - blimey, I've lost count of the times we have tried 
things along the lines of an Enterprise Data Model - and failed !).
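
One way to picture the canonical-model idea is a small adapter per 
external schema version, each mapping into a single internal shape. This 
is only a sketch under assumptions: the version attribute, the element 
layouts and the CanonicalQuote class are all hypothetical, not taken from 
any real industry schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class CanonicalQuote:
    """Hypothetical internal model; internal systems see only this."""
    customer_name: str
    start_date: str


def from_v1(root):
    # Assumed v1 layout: flat child elements.
    return CanonicalQuote(root.findtext("CustomerName", ""),
                          root.findtext("StartDate", ""))


def from_v2(root):
    # Assumed v2 layout: the same data moved into nested elements.
    return CanonicalQuote(root.findtext("Customer/Name", ""),
                          root.findtext("Policy/StartDate", ""))


# One adapter per supported version of the external standard.
ADAPTERS = {"1": from_v1, "2": from_v2}


def to_canonical(xml_text):
    root = ET.fromstring(xml_text)
    version = root.get("version", "")
    adapter = ADAPTERS.get(version)
    if adapter is None:
        raise ValueError("unsupported schema version: %r" % version)
    return adapter(root)


v1 = ('<Quote version="1"><CustomerName>A N Other</CustomerName>'
      '<StartDate>2006-03-01</StartDate></Quote>')
v2 = ('<Quote version="2"><Customer><Name>A N Other</Name></Customer>'
      '<Policy><StartDate>2006-03-01</StartDate></Policy></Quote>')

# Both external versions end up as the same internal object.
print(to_canonical(v1) == to_canonical(v2))
```

A change in the external standard then costs one new adapter rather than 
a change to every internal system, although keeping the adapters current 
is itself the kind of delta maintenance described above.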


>From: Greg Hunt <greg@firmansyah.com>
>To: Fraser Goffin <goffinf@hotmail.com>,  xml-dev@lists.xml.org
>Subject: Re: [xml-dev] Validation - Is it worth it ?
>Date: Sun, 12 Feb 2006 03:43:46 +1100
>I think it's a matter of how the external interface contract is specified.
>If you advertise yourself as using the industry standard schema, then that 
>sort-of resolves the issue.  The obligations from the advertisement go both 
>ways.
>If you don't, how do you convey your flavour of the schema to someone else 
>without opening yourself up to endless tedious discussion of what the 
>element relationships are?  There are options, other people have suggested 
>them, but you need to look at costs and benefits and at your capacity to 
>impose the model.
>You could specify things in terms of additional technologies, such as Rick 
>Jelliffe's suggestion, but if the schema is non-trivial then that is a 
>large amount of work in a technology that is different to the standard 
>specification, which might be a hard sell.  A test-driven approach assumes 
>that you know something about the stability of your clients and that there 
>are not too many of them.  This might be the case.
>Making your own version of the standard schema would be feasible if you 
>have significant market power, but there is a lack of tools to maintain the 
>delta on the standard schema in a consistent way and your partners may not 
>be all that wild about having to support your variant as well as the 
>standard.  If you de-facto define the standard then that is different, your 
>market power will encourage or require the other participants in the market 
>to support your model.  There was a discussion some time ago about 
>profiling schemas in the same way that EDI messages are profiled, but the 
>idea was unexpectedly novel to a lot of people here.
>Fraser Goffin wrote:
>>Thanks Greg, some interesting points to consider.
>>I am mostly concerned with B2B. One of the issues I'm wrestling with is 
>>that :-
>>a. the service contract is defined by an external standards body (we are 
>>but one implementer).
>>b. the data model that underpins the service operations is defined using 
>>XML schema and reflects the broad business semantics for each 
>>operation (as agreed by a panel of contributors from our industry sector).
>>c. our business rules (in terms of which data content/structural 
>>constraints would be acceptable) are less strict than the XML schema 
>>specifies (for example we may be tolerant of missing data).
>>So I guess I was considering whether we should validate according to our 
>>internal business rules rather than that of the externally defined 
>>contract, even when this can mean that a message received could be schema 
>>invalid (according to the industry standard definition) ?
>>>From: Greg Hunt <greg@firmansyah.com>
>>>To: Fraser Goffin <goffinf@hotmail.com>,  xml-dev@lists.xml.org
>>>Subject: Re: [xml-dev] Validation - Is it worth it ?
>>>Date: Sun, 08 Jan 2006 10:31:11 +1100
>>>I know that "be liberal in what you accept and strict in what you emit" 
>>>is firmly embedded in the race memory now, but I am not sure that that 
>>>applies to the technical aspects of B2B type transactions.
>>>Why do you want to process messages that "may" be processable?  If you 
>>>keep in mind a distinction between technical processes and business 
>>>processes, there is less doubt.  Technically acceptable messages should 
>>>enable the business process.  Acceptance of the message should mean that 
>>>there is a business process that will handle the message.  Rejection of 
>>>the message should indicate technical issues.  Anything else is probably 
>>>not scalable in terms of effort  and in terms of guaranteed processing 
>>>time.  It seems to come down to whether the business rules for fixing the 
>>>message data can be defined and used.  We need to be careful that we are 
>>>not also fixing the semantics of the messages that we receive.
>>>If there are business rules that define the changes to data content that 
>>>are acceptable then there is no doubt about whether the message can be 
>>>processed, what is in doubt is whether in the end you get a business 
>>>transaction (money) out of it.
>>>If there are technical issues with a message, if the received structure 
>>>is wrong or unexpected for some reason, then the semantics are also in 
>>>doubt and passing what is probably machine generated data to a person to 
>>>resolve an issue involves asking that person to decide whether this is a 
>>>bug or something unexpected in the mind of a programmer.  It is easier 
>>>and more reliable to ask the originator what is going on than it is to 
>>>ask a human to interpret a broken XML structure for, for example, a 
>>>purchase order (meaning that I am not sure that you can always push the 
>>>message to a person to have them look at it in a scalable, 
>>>business-meaningful way).
>>>If this is B2B type traffic, who pays for getting the semantics of the 
>>>message wrong?  If you receive a value transaction from someone and "fix" 
>>>it in some technical and non-trivial way, who now pays for the 
>>>transaction?  Is the cost of sending and receiving the message so high 
>>>that the time and effort of a human and the associated risk of further 
>>>error is warranted?
>>>Fraser Goffin wrote:
>>>>Recently I've been involved in building a validation process using a 
>>>>combination of schema-based and rule-based (Schematron) checks, and it got me 
>>>>thinking about how much validation is the right amount.
>>>>The 'type' of validation processing I'm talking about is that which 
>>>>might be performed at a B2B gateway and is perhaps better categorized as 
>>>>'technical' validation (ie. basic structural conformance and some 
>>>>content) rather than business rules (although the distinction is pretty 
>>>>From the business perspective, it is undesirable to reject any message and 
>>>>thus lose an opportunity to complete a transaction. So from this point 
>>>>of view one might imagine that validation at this stage should be 
>>>>minimal, perhaps not even full schema (or perhaps a 'more relaxed' 
>>>>version of the published interface). This might be justified on the 
>>>>basis that rules, perhaps in a business process engine or application 
>>>>logic, are better at determining whether a message is business 
>>>>processable or not. Plus one can always push messages to a manual 
>>>>process and let a human decide !
>>>>On the flip side, we want to protect the integrity of our operational 
>>>>systems from erroneous data and, perhaps the most obvious reason, 
>>>>validation can provide an optimization of the process in the sense that, 
>>>>when the interaction is asynchronous (and possibly long running), it may 
>>>>be preferable to let a caller know right away that a message has some 
>>>>'bad data' rather than for them to find that out some time later after 
>>>>having received an initial acknowledgement of receipt.
>>>>To me this highlights the conundrum of a desire for strongly typed 
>>>>[service] interfaces versus the looser coupling and tolerance to change 
>>>>that we also typically seek. I am trying to find the 'sweet spot' that 
>>>>allows through messages that 'may' be processable, but rejects those 
>>>>where even if directed to a manual (human workflow) process would still 
>>>>not be worth the effort. I sometimes refer to this as 'compatible' 
>>>>messages versus enforcing strict adherence to a technical specification.
>>>>I also have noted that versioning service interfaces (or even just XML 
>>>>schemas) can be somewhat problematic and can exacerbate validation 
>>>>issues, and to some extent militates against using them for validation 
>>>>purposes, particularly if they haven't been designed with any 
>>>>extensibility mechanisms at all to accommodate 'non breaking' change 
>>>>(e.g. xs:any/anyAttribute).
>>>>Some of you may be thinking 'is there a question here anywhere ?', sorry 
>>>>I have meandered on somewhat. What I'm really after is finding out what 
>>>>others have found to be a good approach to message validation and 
>>>>whether there are views about how to achieve a balance between 
>>>>optimizing business opportunity and rejecting 'junk mail'.
>>>>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>>>>initiative of OASIS <http://www.oasis-open.org>
>>>>The list archives are at http://lists.xml.org/archives/xml-dev/
>>>>To subscribe or unsubscribe from this list use the subscription
>>>>manager: <http://www.oasis-open.org/mlmanage/index.php>

