XML.org, OASIS Mailing List Archives
RE: Run-Time Validation of Inbound XML Documents - Yea or Nay?

Roger, your paper neatly summarizes the dilemma we often face: some sources are highly reliable, and others are highly unreliable (unsophisticated).  Unfortunately, not all unreliable sources are outside the firewall.  Some systems engineers feel that once they've tested their output with validation, they can turn validation off in production.  While such testing does tend to confirm the correctness of a schema, it does nothing for the data.  When the data arrives at a rate of about 5,000 documents a day from PCs located around the world, using code pages just as varied, there is ample opportunity for garbage in and garbage out.

The laissez-faire browser approach won't do for us, but outside sources just can't be expected to produce perfectly valid documents, either.  Fortunately, there's another alternative, one that we're exploring, that we call "progressive" validation.  We're creating multiple versions of the same schema, each used at a different stage of internal processing.  The first validation needs to be very forgiving, in parallel with the business process; the second and third validations are progressively more stringent, just as the business process requires that any deficiencies in the original filing be corrected before the next steps can be taken.  The final validation, before we publish the final, negotiated content, is the most stringent and tolerates no errors.
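
To make the staged idea concrete, here is a minimal sketch in Python. It is only an illustration: the stage names, the element names, and the "required elements" rule sets are all hypothetical stand-ins for the progressively stricter schema versions, and a real system would validate against actual schemas rather than a hand-written check.

```python
import xml.etree.ElementTree as ET

# Hypothetical stages: each one requires everything the previous one did,
# plus more. These sets stand in for stricter versions of one schema.
STAGES = {
    "intake": {"title"},
    "examination": {"title", "applicant"},
    "publication": {"title", "applicant", "claims"},
}


def validate(xml_text, stage):
    """Return a list of problems for the given stage; an empty list means valid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Even the most forgiving stage needs a well-formed document.
        return [f"not well-formed: {exc}"]

    problems = []
    for name in sorted(STAGES[stage]):
        child = root.find(name)
        if child is None or not (child.text or "").strip():
            problems.append(f"missing or empty <{name}>")
    return problems


# A deficient filing: accepted at intake, flagged at the later stages.
filing = "<filing><title>Widget</title><applicant></applicant></filing>"

for stage in ("intake", "examination", "publication"):
    print(stage, validate(filing, stage))
```

The same document passes the forgiving first stage and accumulates errors as the stages tighten, which mirrors a business process in which deficiencies must be corrected before the next step.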

We did a study some years ago about how much value we'd get from validating certain key data types (patent numbers from various countries).  Although schemas provide some level of validation, and Schematron can expand on that, we found that there was always some residue that would require custom code to fully validate the data.  You might want to temper your recommendation to use COTS for validation with the recognition that it may need some supplementation: COTS will reduce custom validation code, but may not eliminate it.
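
As a rough sketch of what that residue can look like, the Python below layers a custom check on top of a pattern check of the kind a schema facet can express. Everything specific here is invented for illustration: the number format (two letters plus seven digits) and the check-digit rule are hypothetical and do not describe any real patent office's numbering.

```python
import re

# Layer 1: a lexical pattern, the kind of rule a schema pattern facet can state.
# Hypothetical format: two-letter country prefix followed by seven digits.
PATTERN = re.compile(r"[A-Z]{2}\d{7}")


def matches_pattern(number):
    return PATTERN.fullmatch(number) is not None


def check_digit_ok(number):
    """Layer 2: custom code. Hypothetical rule: the last digit must equal
    the sum of the preceding digits modulo 10."""
    digits = [int(c) for c in number[2:]]
    return sum(digits[:-1]) % 10 == digits[-1]


def validate_number(number):
    if not matches_pattern(number):
        return "rejected by pattern"
    if not check_digit_ok(number):
        return "rejected by custom check"
    return "ok"


for candidate in ("US1234561", "US1234564", "us123"):
    print(candidate, validate_number(candidate))
```

The second candidate is the interesting one: it satisfies the pattern and is caught only by the custom layer.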

All the best,
Bruce B Cox
Director, Policy and Standards Division, OCIO
U.S. Patent & Trademark Office

-----Original Message-----
From: Costello, Roger L. [mailto:costello@mitre.org] 
Sent: 2011 May 21, Saturday 10:13
To: xml-dev@lists.xml.org
Subject: Run-Time Validation of Inbound XML Documents - Yea or Nay?

Hi Folks,

Issue

An application receives an XML document. Should the application validate the XML document prior to processing it? That is, should applications perform run-time validation of inbound XML documents?

Discussion

There is no right or wrong answer to this question. There are only engineering tradeoffs. So, before making a decision for your particular application, it is important to understand the approaches, their advantages, and their disadvantages.

More ... http://www.xfront.com/Run-time-Validatation-of-Inbound-XML-documents.pdf 

Comments welcome.

/Roger

