Thanks Rick, it's nice to have a shared problem.
We do use Schematron to check for constraints beyond those that can be
expressed in XML Schema, and for items that we absolutely must have. This is
important because, when using schemas created by a number of industry
vendors, the compromises reached tend to result in quite a lot of
'optional' content model parts. Despite what the schema says, those same
data items are often mandatory for our business process.
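
A minimal sketch of that kind of "mandatory despite the schema" check,
written in Python with the standard library rather than in Schematron
itself. The element names (Message, Policy, HolderName, StartDate) and the
rule list are hypothetical, chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical business rules: (context path, required child, message).
# The standard schema may mark these children optional; here they are not.
REQUIRED = [
    (".//Policy", "HolderName", "Policy must carry a HolderName"),
    (".//Policy", "StartDate", "Policy must carry a StartDate"),
]


def check_required(xml_text):
    """Return a list of failure messages; an empty list means every rule passed."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, child, message in REQUIRED:
        # Note: ".//Policy" matches descendants of the root, not the root itself.
        for node in root.findall(context):
            found = node.find(child)
            # A missing element and an empty element are both treated as failures.
            if found is None or not (found.text or "").strip():
                failures.append(message)
    return failures
```

For example, a message whose Policy has a HolderName but no StartDate would
yield a single failure for the StartDate rule, regardless of whether the
standard schema considers the document valid.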
You mention defining rules to test for data which we 'don't want'. I guess I
tend to think about the 'must ignore' pattern here: if I get sent
something that I'm not interested in, I either ignore it or explicitly filter
it (if I think it might be harmful to any upstream processing)?
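
The explicit-filter variant of 'must ignore' could be sketched roughly as
follows, again in Python with the standard library. The KNOWN set and the
element names are assumptions for illustration; a real message would also
use namespaces, in which case the tags would take the "{namespace}name" form.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of element names the receiving process understands.
KNOWN = {"Message", "Policy", "HolderName", "StartDate"}


def strip_unknown(elem):
    """Remove, in place, every descendant element whose tag is not in KNOWN."""
    # Iterate over a copy, because children are removed during the loop.
    for child in list(elem):
        if child.tag not in KNOWN:
            elem.remove(child)    # drops the whole unknown subtree
        else:
            strip_unknown(child)  # keep it, but filter its children too
    return elem
```

The design choice here is to drop unknown content before it reaches later
processing, rather than rejecting the whole message because of it.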
On schema, yes we could (and perhaps should) use a modified version as you
suggest. I guess some of our reluctance is in the area of change management
and maintenance. We already face some tough challenges around schema
versioning (the standards body doesn't have a great approach to this at
present; who does!) and we are trying to protect our internal processes
from the impacts of external change. What this means is we use (or are in
the process of developing) an internal data model which, while cognizant of
the external standard, is not the same. It acts as a buffer against changes
that we are not necessarily interested in. Of course, all of this comes at a
cost. It means that we have to create and maintain our own schema, XSLT,
validation rules (Schematron) and so on, as well as all of the versioning of
those artefacts in line with changes outside that we can't ignore and those
inside that we want. I guess I want to have my cake and eat it (i.e. get the
standards body to do most of the work).
>From: "Rick Jelliffe" <email@example.com>
>Subject: Re: [xml-dev] Validation - Is it worth it ?
>Date: Sun, 12 Feb 2006 00:05:32 +1100 (EST)
>Yours is a very common position to be in.
>There are all sorts of intermediate kinds of partial validation
>possible and useful. The choice isn't between all or nothing.
>For example, you could make a version of the standard schema to
>redefine elements so that you only validate datatypes: complex
>content would just have some wildcarded anything-goes content.
>What this would give you is a system that is liberal in what it
>accepts. This is certainly better than no validation.
>Another way to look at the problem is from the perspective of
>test-driven development. You can validate everything initially,
>until your feeds have proven themselves, then reduce to sampling
>using the standard statistical practice. Or look at it as
>an opportunistic thing: even if your servers are too slow to
>cope with validation during peak period, you could enable it
>at off-peak times.
>Another approach entirely is to express your business rules
>in Schematron, and validate using that instead of the
>standard XML Schema. This allows you to only check the things
>you are interested in, cope with partial and incomplete
>documents-in-progress (compared to the standard schemas) but also to
>document what you are interested in
>and also to check for things you positively don't want in your
>data: this is a lot more powerful than type derivation in this
>Fraser Goffin said:
> > Thanks Greg, some interesting points to consider.
> > I am mostly concerned with B2B. One of the issues I'm wrestling with is
> > that:
> > a. the service contract is defined by an external standards body (we are
> > but one implementer).
> > b. the data model that underpins the service operations is defined in
> > XML schema and these reflect the broad business semantics for each
> > operation (as agreed by a panel of contributors from our industry
> > sector).
> > c. our business rules (in terms of what data content/structural
> > constraints would be acceptable) are less strict than the XML schema
> > specifies (for example we may be tolerant of missing data).
> > So I guess I was considering whether we should validate according to our
> > internal business rules rather than those of the externally defined
> > contract, even when this can mean that a message received could be
> > schema invalid (according to the industry standard definition)?