Hi Folks,

Thank you for your excellent responses! If I may, a few follow-on questions please:

1. Round-tripping does not prove correctness, right? That is, (e) in the following is false, right?

(a) Translate an instance of Format1 to an instance of Format2.
(b) Translate that instance of Format2 back to an instance of Format1.
(c) If the round-tripped instance of Format1 is identical to the original instance of Format1, then this is evidence that the Format1 to Format2 conversion is correct.
(d) Repeat (a) to (c) with many other instances.
(e) If (c) is successful for all instances, then the Format1 -> Format2 mapping is correct for those parts of Format1 that were tested.

I believe that round-tripping proves nothing: Suppose I map field A in Format1 to field B in Format2 and then I map field B in Format2 to field A in Format1. That doesn’t prove that A -> B is a correct mapping. Nor does it prove that B -> A is a correct mapping. Do you agree?

2. Suppose there are two applications, App1 and App2, which are designed to perform the same task. The input to App1 is instances of Format1 and the input to App2 is
instances of Format2. One way of ascertaining the correctness of the Format1 -> Format2 conversion is to compare the outputs of App1 and App2. That is, for each instance of Format1, input the instance into App1 and map the instance to Format2 to generate instance2; input instance2 into App2 and then compare the outputs of App1 and App2.

Do you agree with the following assertions?

Assertion #1: Comparing the outputs of App1 and App2 is equivalent to comparing the behavior of App1 and App2.
Assertion #2: Comparing application behaviors is a very hard problem.
Assertion #3: Comparing syntaxes is a relatively easy problem.
Assertion #4: When trying to ascertain the correctness of a Format1 -> Format2 mapping, do everything you possibly can to turn the problem into one of comparing syntaxes, not one of comparing application behaviors.

3. Here are two approaches to generating XML from an instance of Format1:

Approach #1: Write a program that parses an instance of Format1 and then serializes the in-memory parse tree to the desired XML format.
Approach #2: Using a standard data format specification language (i.e., DFDL), write a specification of the Format1 data format. Then input that specification into a standard DFDL processor along with an instance. The processor parses the instance in accordance with the specification and automatically serializes the in-memory infoset to XML.

Which approach is more likely to correctly generate XML documents from Format1 instances?

I assert that the DFDL specification approach is guaranteed to correctly generate XML documents from Format1 instances. (Well, guaranteed might be a bit strong, but I am pretty darn certain that correct XML is generated.) The hand-crafted approach is much less likely to correctly generate XML documents from Format1 instances.

So there is no uncertainty in whether the conversion from Format1 to XML is correct. Likewise, there is no uncertainty in whether the conversion from Format2 to XML is correct. The only uncertainty is whether the conversion from Format1 to Format2 is correct. This is huge.

Michael Sperberg-McQueen wrote:
Eek! That is a serious problem. I have no idea how to deal with that.

/Roger
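
P.S. To make question 1 concrete, here is a minimal Python sketch of a mapping that is wrong and yet round-trips perfectly. Everything in it is hypothetical and invented for illustration: the field names (surname, family_name, given_name), the two mapping functions, and the "expected" Format2 record are assumptions, not taken from any real Format1 or Format2. The forward mapping puts the value in the wrong Format2 field, and the reverse mapping makes the mirror-image mistake, so the two errors cancel.

```python
# Hypothetical formats: a Format1 record has a "surname" field, and the
# corresponding Format2 record is supposed to carry it in "family_name".

def format1_to_format2(rec1: dict) -> dict:
    # Deliberate bug: the surname should go to "family_name", not "given_name".
    return {"given_name": rec1["surname"]}

def format2_to_format1(rec2: dict) -> dict:
    # The reverse mapping carries the mirror-image bug, so the errors cancel.
    return {"surname": rec2["given_name"]}

def round_trips(rec1: dict) -> bool:
    # Steps (a) to (c): translate, translate back, compare with the original.
    return format2_to_format1(format1_to_format2(rec1)) == rec1

# Step (d): repeat with many instances.
samples = [{"surname": "Smith"}, {"surname": "Nakamura"}, {"surname": ""}]
all_round_trip = all(round_trips(r) for r in samples)

# An independent statement of what the Format2 instance should look like
# (also hypothetical). Only a check like this exposes the wrong mapping.
expected = {"family_name": "Smith"}
mapping_is_correct = format1_to_format2({"surname": "Smith"}) == expected

print(all_round_trip)      # True
print(mapping_is_correct)  # False
```

As far as I can tell, what a successful round trip does show is that no information was lost for the tested instances; it says nothing about whether the information landed in the right place in Format2. That seems to need some independent statement of what the Format2 instance ought to be.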