   RE: Fallacies of Validation ... RE: [xml-dev] Are people really using Id

  • To: "Roger L. Costello" <costello@mitre.org>, <xml-dev@lists.xml.org>
  • Subject: RE: Fallacies of Validation ... RE: [xml-dev] Are people really using Identity constraints specified in XML schema?
  • From: "Cox, Bruce" <Bruce.Cox@USPTO.GOV>
  • Date: Wed, 25 Aug 2004 17:22:39 -0400
  • Thread-index: AcSK6aYv2TJbNJCsS0+D/Z1fVMereA==
  • Thread-topic: Fallacies of Validation ... RE: [xml-dev] Are people really using Identity constraints specified in XML schema?

We haven't done it yet, but almost from the start of negotiations for a
DTD for patent applications we recognized the need for multiple schemas.
Getting a filing date is a key event for an applicant, and current rules
(paper or electronic) are geared to make that as easy as possible.  A
fee must be paid to get a filing date, but if not paid with the
application, then a notice is given to cure the deficiency by a
specified date, or the application will be considered abandoned.  If the
fee is paid by that date, the original filing date is secured, and if
not, the applicant has to start over.  There are several other
requirements for filing that follow the same pattern, so when validating
data on the client side at the time of submission, applicants are given
the opportunity to "override" error messages and send the (defective)
application anyway.  At later stages of processing, validation is more
stringent, and different versions of the DTD or schema will ensure that
human input can be properly processed by machine at later stages in
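
A minimal sketch of this staged approach, in Python. Everything in it is
an assumption made for illustration: the element names, the two rule
lists standing in for two versions of a DTD or schema, and the function
names. None of it is taken from an actual USPTO DTD. It only shows the
same document being checked leniently (with an override) at submission
and strictly at a later stage.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for two schemas over the same document type.
SUBMISSION_SCHEMA = ["title", "description", "fee"]
PROCESSING_SCHEMA = SUBMISSION_SCHEMA + ["claims"]


def missing_elements(xml_text, required):
    """Return the required child elements that are absent from the root."""
    root = ET.fromstring(xml_text)
    return [name for name in required if root.find(name) is None]


def submit(xml_text, override=False):
    """Submission stage: errors are reported, but the applicant may override."""
    errors = missing_elements(xml_text, SUBMISSION_SCHEMA)
    accepted = not errors or override
    return accepted, errors


def process(xml_text):
    """Later stage: stricter rules and no override."""
    errors = missing_elements(xml_text, PROCESSING_SCHEMA)
    if errors:
        raise ValueError("cannot process, missing: " + ", ".join(errors))
    return True


# An application filed without the fee (hypothetical content).
application = (
    "<application>"
    "<title>Widget</title>"
    "<description>A widget.</description>"
    "</application>"
)
```

Here `submit(application)` reports the missing fee and refuses the
filing, `submit(application, override=True)` accepts it with the
deficiency recorded, and `process(application)` raises until the
deficiencies are cured.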

If you are integrating legacy systems through an EAI hub, it may be that
the only possible place for schemas and schema validation is at the
outermost edge, that is, the boundary between the systems and the hub.
Migrating legacy systems to newer technology is at least 50% a social
issue, not a technical issue, so I'm not sure it's fair to characterize
this as a fallacy.

For the last one, I'm not sure the "fallacy" is "you must validate" so
much as "you must structure the data".  Markup costs money, but it also
adds value, and if there is no value extracted, then don't pay the price
up front (unless your future is really uncertain and you'd rather incur
a lower cost for structuring now than a potentially much higher cost in
the future).  If you do structure the data, in some cases it will pay to
validate as well.  For example, from the start of automation at the
Trademark Office of the US Patent & Trademark Office, correspondence
addresses were not structured, but captured only as line 1, line 2, etc.
This worked fine until about two years ago when the USPTO wanted to
decrease the cost of sending official notices to trademark applicants
and registrants via US mail.  The USPTO began using a service of the US
Post Office whereby we transmit address/message pairs to them in bulk,
and the Post Office prints postcards near their destination (domestic
only), at a substantial savings to us over the cost of printing,
postage, and moving around big bins of postcards.  To do this, however,
the Trademark Office had to provide addresses that were (minimally)
structured.  It was clearly a cost advantage to go to the expense of
parsing all the addresses (millions of them) to the necessary structure
(we're talking Department of Commerce gold medals, here).  In a case
like this, validating key components of the address (checking against an
address database, for example) before transmitting to USPS ensures
timely delivery of an official notice, and is judged worth the cost.
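
As a rough illustration of checking key address components before a bulk
transmission, here is a sketch in Python. The field names, the small set
of state codes standing in for an address database, and the function
names are all hypothetical; a real system would check against an actual
address database.

```python
import re
from dataclasses import dataclass


@dataclass
class Address:
    recipient: str
    street: str
    city: str
    state: str
    zip_code: str


# Hypothetical stand-in for an address database lookup.
KNOWN_STATES = {"VA", "MD", "DC"}
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def address_problems(addr):
    """Return a list of problems found in the key address components."""
    problems = []
    if not addr.recipient.strip():
        problems.append("missing recipient")
    if addr.state not in KNOWN_STATES:
        problems.append("unknown state: " + addr.state)
    if not ZIP_PATTERN.fullmatch(addr.zip_code):
        problems.append("malformed ZIP code: " + addr.zip_code)
    return problems


def split_for_bulk_mail(pairs):
    """Split (address, message) pairs into ready-to-send and held-for-review."""
    ready, held = [], []
    for addr, message in pairs:
        (held if address_problems(addr) else ready).append((addr, message))
    return ready, held
```

Only the `ready` pairs would be transmitted; the `held` ones go back for
correction, which is where the cost of validation is traded against the
cost of an undelivered notice.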

Bruce B. Cox

-----Original Message-----
From: Roger L. Costello [mailto:costello@mitre.org] 
Sent: Wednesday, August 25, 2004 9:05 AM
To: xml-dev@lists.xml.org
Subject: Fallacies of Validation ... RE: [xml-dev] Are people really
using Identity constraints specified in XML schema?

Hi Folks,

From reading yesterday's messages, I feel like the real issues are
coming out.  And the real issues, I perceive, are in the various
fallacies with validation.  Below I provide a start at listing the
fallacies.  Your help in elaborating these is needed.

Fallacies of Validation

1. Fallacy of "THE Schema"

2. Fallacy of Schema Locality

3. Fallacy of Requisite Validation

Let's examine each of these fallacies.

1. Fallacy of "THE Schema"

This fallacy was identified by Michael Kay last week:

> ... there's no harm in using XML Schema to check data against the 
> business rules, so long as you realize this is *an* XML Schema, not 
> *the* XML Schema. We need to stop thinking that there can only be one 
> schema.

Yesterday Len Bullard made a similar statement:

> ... most fundamental errors are ... to consider only a single schema.

and at another point Len states:

> ... fall into the trap of thinking of THE schema and not recognizing 
> the system as a declarative ecosystem of schemas and schema 
> components.

Both Michael and Len are stating that in a system there should be
numerous schemas.  This is a big mindshift for me.  I admit being
trapped into thinking that there should be a single schema.

It would be very useful if we could have a simple example that shows how
several schemas might be employed, rather than a single schema.  Could
someone provide an example?  

Len, I like the term you used, "declarative ecosystem".  Could you
elaborate upon what this means?

2. Fallacy of Schema Locality

Yesterday Len also identified this fallacy:

> ... most fundamental errors are to consider schemas only at the 
> external system junctions ...

Len notes that many people think that validation should occur at a
certain place in the system, namely, at the outermost edges of the
system.  (Len, I assume this to mean the user-interface?)  Len argues
that validation can rightfully be done at many locations in a system.
Len, perhaps some more words on this fallacy would be in order?

3. Fallacy of Requisite Validation

Yesterday Michael Kay made a very compelling statement with regard to
whether validation should be done at all in certain situations.  Michael
was responding to the example of an online service validating a user's
address.  Here's what Michael said about the online service's insistence
on validating the user's address:

> The strategy (validating the user's address) assumes that you know 
> better than your customers what constitutes a valid address. Let's 
> face it, you don't, and you never will. A much better strategy is to 
> let them (the user) express their address in their own terms. After 
> all, that's what they do in old-fashioned paper correspondence, and 
> it seems to work quite well.

Michael argues very effectively that in this situation it makes no sense
to do any validation at all!

I have not yet read all of yesterday's postings, so I may have missed
some other fallacies.  If you know of any fallacies that I missed, would
you please send them along?  

Also, if you have comments on the fallacies identified above, please
send them along.  Note: examples are much needed!


