   Re: [xml-dev] Validation - Is it worth it ?


Thanks Rick, it's nice to have a shared problem.

We do use Schematron to check for constraints beyond those that can be
expressed in XML Schema, and for items that we absolutely must have. This is
important because, when a schema is created by a number of industry vendors,
the compromises reached tend to result in quite a lot of 'optional' content
model parts. Despite what the schema says, those same data items are often
mandatory for our business process.
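
To make the 'optional in the schema, mandatory for us' idea concrete, here is a minimal sketch of that kind of check. It is not Schematron itself, just the same assert-style approach written in Python with the standard library. The element names (Message, Policy, HolderName, StartDate) and the rule table are hypothetical and are not taken from any real industry schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (context path, required child, failure message).
# These names are illustrative assumptions, not part of any real standard.
RULES = [
    ("./Policy", "HolderName", "Policy must carry a HolderName"),
    ("./Policy", "StartDate", "Policy must carry a StartDate"),
]


def check_required(xml_text, rules=RULES):
    """Return a list of failure messages; an empty list means every rule passed."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, required, message in rules:
        for node in root.findall(context):
            child = node.find(required)
            # Treat an absent element and an empty one the same way.
            if child is None or not (child.text or "").strip():
                failures.append(message)
    return failures
```

Calling check_required() on a message whose Policy has a HolderName but no StartDate would report only the StartDate rule, regardless of whether the standard schema marks that element as optional.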

You mention defining rules to test for data which we 'don't want'. I guess I
tend to think about the 'must ignore' pattern here: if I get sent
something that I'm not interested in, I either ignore it or explicitly filter
it (if I think it might be harmful to any 'up-stream' processing)?
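
The explicit-filter half of 'must ignore' could be sketched roughly as below. This is only an illustration under assumptions: the allow-list of element names is hypothetical, and namespaces are not handled (ElementTree reports a namespaced tag as '{uri}local', so a real list would need that form).

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list of element names that our processing understands.
KNOWN = {"Message", "Policy", "HolderName", "StartDate"}


def strip_unknown(element, known=KNOWN):
    """Remove, in place, every descendant whose tag is not in `known`."""
    for child in list(element):  # iterate over a copy: we mutate the parent
        if child.tag not in known:
            # Drops the whole unknown subtree, not just the one element.
            element.remove(child)
        else:
            strip_unknown(child, known)
    return element
```

Simply ignoring unknown content needs no code at all; the filter only matters where passing the extra content onward might do harm.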

On schema, yes, we could (and perhaps should) use a modified version as you
suggest. I guess some of our reluctance is in the area of change management
and maintenance. We already face some tough challenges around schema
versioning (the standards body doesn't have a great approach to this at
present, but who does!) and we are trying to protect our internal processes
from the impacts of external change. What this means is that we use (or are in
the process of developing) an internal data model which, while cognizant of
the external standard, is not the same. It acts as a buffer against changes
that we are not necessarily interested in. Of course, all of this comes at a
cost. It means that we have to create and maintain our own schema, XSLT,
validation rules (Schematron) and so on, as well as versioning all of
those artefacts in line with the outside changes that we can't ignore and the
inside changes that we want. I guess I want to have my cake and eat it (i.e.
get the standards body to do most of the work).
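
As a rough illustration of the buffering idea, the sketch below maps two versions of an external message onto one internal shape. In practice we do this step with XSLT; the Python here is only a stand-in, and the version labels, element paths and the InternalPolicy type are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class InternalPolicy:
    """Hypothetical internal model: only the fields our processes care about."""
    holder: str
    start_date: Optional[str]


# Hypothetical mappings from internal field to external path, one per version.
MAPPINGS = {
    "1.0": {"holder": "Policy/HolderName", "start_date": "Policy/StartDate"},
    "2.0": {"holder": "Policy/Holder/Name", "start_date": "Policy/Effective"},
}


def to_internal(xml_text, version):
    """Translate an external message of the given version to the internal model."""
    root = ET.fromstring(xml_text)
    paths = MAPPINGS[version]
    return InternalPolicy(
        holder=root.findtext(paths["holder"], default=""),
        start_date=root.findtext(paths["start_date"]),  # None when absent
    )
```

An external restructuring then costs one new entry in the mapping table, and nothing behind the internal model has to change. The price is exactly the maintenance burden described above: every mapping has to be versioned alongside the standard.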


>From: "Rick Jelliffe" <rjelliffe@allette.com.au>
>To: xml-dev@lists.xml.org
>Subject: Re: [xml-dev] Validation - Is it worth it ?
>Date: Sun, 12 Feb 2006 00:05:32 +1100 (EST)
>Yours is a very common position to be in.
>There are all sorts of intermediate kinds of partial validation
>possible and useful. The choice isn't between all or nothing.
>For example, you could make a version of the standard schema to
>redefine elements so that you only validate datatypes: complex
>content would just have some wildcarded anything-goes content.
>What this would give you is a system that is liberal in what it
>accepts. This is certainly better than no validation.
>Another way to look at the problem is from the perspective of
>test-driven development. You can validate everything initially,
>until your feeds have proven themselves, then reduce to sampling
>using the standard statistical practice. Or look at it as
>an opportunistic thing: even if your servers are too slow to
>cope with validation during peak period, you could enable it
>at off-peak times.
>Another approach entirely is to express your business rules
>in Schematron, and validate using that instead of the
>standard XML Schema. This allows you to only check the things
>you are interested in, cope with partial and incomplete
>documents-in-progress (compared to the standard schemas) but also to
>document what you are interested in
>and also to check for things you positively don't want in your
>data: this is a lot more powerful than type derivation in this
>Rick Jelliffe
>Fraser Goffin said:
> > Thanks Greg, some interesting points to consider.
> >
> > I am mostly concerned with B2B. One of the issues I'm wrestling with is
> > that:
> >
> > a. the service contract is defined by an external standards body (we are
> > but one implementer).
> > b. the data model that underpins the service operations are defined in
> > XML schema and these reflect the broad business semantics for each
> > operation (as agreed by a panel of contributors from our industry sector).
> > c. our business rules (in terms of what data content/structural
> > constraints that would be acceptable) are less strict than the XML schema
> > specifies (for example we may be tolerant of missing data).
> >
> > So I guess I was considering whether we should validate according to our
> > internal business rules rather than that of the externally defined
> > contract, even when this can mean that a message received could be schema
> > invalid (according to the industry standard definition) ?