xml-dev - Re: [xml-dev] How much run-time validation do you do?


To: "Roger L. Costello" <costello@mitre.org>,<xml-dev@lists.xml.org>
Subject: Re: [xml-dev] How much run-time validation do you do?
From: Frans Englich <frans.englich@telia.com>
Date: Mon, 20 Dec 2004 16:20:30 +0000
In-reply-to: <200412201420.iBKEK2K23784@smtp-bedford.mitre.org>
References: <200412201420.iBKEK2K23784@smtp-bedford.mitre.org>
User-agent: KMail/1.5.4


I don't know how much value I add to the discussion, but here are nevertheless 
the decisions I made concerning run-time validation. 

My project is a regression/automation framework[1], where a central 
"Dispatcher" takes files as input and executes tests out-of-process, using 
the files' characteristics to decide which tests are run. The data that 
flows (stdin/stdout) between the Dispatcher and the tests is all in XML 
formats, and so is all meta-data, such as the "Test Descriptors", which 
provide information about, for example, what types of files a particular 
test is relevant to.
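
To make the data flow concrete, here is a minimal sketch of a dispatcher 
running one test out-of-process and exchanging XML over stdin/stdout. It uses 
only the Python standard library. The names `CHILD_TEST` and `run_test`, and 
the `request`/`result` element names, are hypothetical and chosen for 
illustration; they are not the framework's actual formats.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a test program: it reads an XML request on stdin
# and writes an XML result on stdout.
CHILD_TEST = r"""
import sys
import xml.etree.ElementTree as ET
request = ET.fromstring(sys.stdin.read())
result = ET.Element("result", {"file": request.get("file"), "status": "pass"})
sys.stdout.write(ET.tostring(result, encoding="unicode"))
"""


def run_test(file_name):
    """Run one test in a separate process and return its parsed XML result."""
    request = ET.Element("request", {"file": file_name})
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_TEST],
        input=ET.tostring(request, encoding="unicode"),
        capture_output=True,
        text=True,
        check=True,   # a non-zero exit status raises CalledProcessError
        timeout=30,   # a hung test must not block the dispatcher forever
    )
    return ET.fromstring(completed.stdout)
```

Calling `run_test("index.docbook")` would return the `result` element 
produced by the child process. A real dispatcher would also validate both the 
request and the result before trusting them.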

All data, outgoing and incoming, is validated. The answer to why that is the 
best approach (at least I hope so!) is found in how the framework is used 
and what its goals are.

(Since it hasn't yet been put to use, the discussion is based on how it is 
supposed to be used.)

The tests are written by different people, and added on a regular basis. 
Hence, there are tests which inevitably are buggy because they are under 
development.

The framework's mission is to be user friendly (not require manual 
intervention, for example), and to provide stable, exact, and correct 
results. Its output cannot be nondeterministic.

Since one of the goals is to be robust, and a large part of the whole (the 
tests) is constantly in a potentially unstable state of development, the only 
option is to validate in order not to compromise the robustness. There is at 
least a theoretical performance impact, but I would rather have that than 
buggy software that has a high maintenance burden.

I use libxml2 via the Python bindings; the schemata are compiled once, the 
data is serialized anyway, and libxml2 is very fast, so I think the cost of 
validation is close to statistical noise, next to the context switches for 
example.
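
The compile-once idea can be sketched without libxml2 at all. What follows is 
a deliberately small stand-in using only the Python standard library: the 
class `CompiledSchema` and its rule format are hypothetical, they are not 
libxml2's API, and they check far less than a real schema language would. 
The shape is what matters: prepare the schema once at start-up, then reuse it 
for every message.

```python
import xml.etree.ElementTree as ET


class CompiledSchema:
    """Hypothetical stand-in for a compiled schema: built once, reused often."""

    def __init__(self, root_tag, required_attrs):
        # The preparation work happens once, not once per message.
        self.root_tag = root_tag
        self.required_attrs = frozenset(required_attrs)

    def errors(self, xml_text):
        """Return a list of problems; an empty list means the message passed."""
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError as exc:
            return [f"not well-formed: {exc}"]
        problems = []
        if root.tag != self.root_tag:
            problems.append(f"expected <{self.root_tag}>, got <{root.tag}>")
        missing = self.required_attrs - set(root.attrib)
        problems.extend(f"missing attribute: {name}" for name in sorted(missing))
        return problems


# Built once, then applied to every message that crosses the process boundary.
RESULT_SCHEMA = CompiledSchema("result", ["file", "status"])
```

With this in place, each incoming or outgoing message costs one call to 
`RESULT_SCHEMA.errors(...)`, with no per-message schema setup.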

Since the tests create the "uncontrolled environment", validating their data 
is perhaps understandable, but why do I validate output data from the 
"Dispatcher"? It is, after all, under controlled development, with a finite 
development period. Again, it's because I would rather sacrifice performance 
than risk potential instability.

Since validation adds the possibility of graceful error control, it makes a 
system much more robust. I don't see how a system could become stable in 
real-world conditions without validation (useful validation, that is), unless 
it is absolutely _guaranteed_ that the data is correct.
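
As an illustration of what graceful error control can look like, here is a 
minimal sketch in which a misbehaving test is reported as an error instead of 
letting bad data travel further. The function `checked_result` and the 
`result`/`status` vocabulary are hypothetical, and the check is a simple 
structural one from the standard library, standing in for real schema 
validation.

```python
import xml.etree.ElementTree as ET


def checked_result(test_name, raw_output):
    """Turn a test's raw output into a trusted result, or a structured error."""
    try:
        root = ET.fromstring(raw_output)
        if root.tag != "result" or root.get("status") not in {"pass", "fail"}:
            raise ValueError("output does not match the expected result format")
    except (ET.ParseError, ValueError) as exc:
        # The test misbehaved: record that fact rather than crash or pass
        # unchecked data on to the rest of the system.
        return {"test": test_name, "status": "error", "reason": str(exc)}
    return {"test": test_name, "status": root.get("status"), "reason": None}
```

A buggy test that prints garbage then shows up in the final report with 
status "error" and a reason, while the other tests are unaffected.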


Regarding validator performance, here is a benchmark:
http://xmlbench.sourceforge.net/results/benchmark200402/index.html


Cheers,

		Frans

1.
It is in-house software for KDE, www.kde.org, developed privately by me, but 
it will be published under the GNU GPL in a project-neutral way once it has 
reached a state suitable for open source development (post-alpha, basically).

References:
- How much run-time validation do you do?
  - From: "Roger L. Costello" <costello@mitre.org>
