Re: [xml-dev] defining correctness for an XML transformation

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]

Re: [xml-dev] defining correctness for an XML transformation - how?

From: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
To: Rick Jelliffe <rjelliffe@allette.com.au>
Date: Fri, 05 Jul 2024 08:16:08 -0600

Rick Jelliffe <rjelliffe@allette.com.au> writes:

> I haven't read all the thread so apologies if I am re-making some
> point, or saying something dumb that has been pointed out.
>
> Three quick points.
>
> 1. If your schema is recursive, then the problem of determining
> whether it is always correct is NP, isn't it? Infinite number of
> inputs.

Certainly most programs are designed to handle an infinite number of
inputs; that is one reason testing cannot provide full assurance of
correctness.

I don't think that recursive properties in the specification of
correctness necessarily render proofs intractable.  Recursion and proof
by induction go hand in hand.

> Which, if so, probably needs to limit or condition our
> expectations or approaches.

Probably, yes.  There are many things we would like to do that we cannot
do; sometimes we can come close, or we can do them sometimes but not
always.  (Sometimes for example we can prove that a procedure
terminates, even though that problem is in the general case
undecidable.)  There are some things we would like to express that we do
not know how to express in a completely satisfactory way.  There are
some things we would like to be able to prove that we cannot prove.
Sometimes we can find an expression that is not quite what we want but
seems to be better than nothing; sometimes we can prove a property
related to the property we would ideally like to prove.

> 2. Obviously people use Schematron to validate pre and post
> conditions, including co-constraints between input and output all the
> time. it is a common use.

Yes, it seems likely that assertions expressed with XPath are currently
as close as anything we have to the kind of language needed.  And
Schematron provides a convenient way to package such assertions and
check them.  I'm not sure whether assertions in XPath can express the
kinds of properties needed to specify the functional correctness of
arbitrary transforms.  Some of them, yes.  Take the property that for
every 'name' element in document O there is some 'Airport_Name' element
in document I with a related string.  We can say

  every $o in $O//name satisfies
  some $i in $I//Airport_Name satisfies
  string($o) eq normalize-space($i)

Can we also express the property that the space-normalized form of the
string value of every distinct 'Airport_Name' element appears exactly
once in the output?  Maybe; I'll have to think about it.

Can we express the property that the input document I and the ougput
document O define grammars G and G' such that L(G) = L(G')?  I don't
know.

I'll have to read up on how to use Schematron for co-constraints on
input and output.

> ...

> 3. The problem might be considered as a category error: we do not
> actually want to validate the Inputs and outputs, we want to validate
> the XSLT.

I'm not sure what problem you refer to here.  The thread started with
the question

    Forget *showing* that an XML transformation is correct -- how would
    you define correctness formally, if you wanted to be able in
    principle to provide a machine-checkable proof of correctness?

I think there is a relation -- at least in some cases -- between
specifying the constraints on input and output (and their relation to
each other), on the one hand, and specifying correctness of the
transform on the other.  In some cases, at least, the correctness of the
transform can be described generically as: for any input satisfying the
pre-conditions, the transform produces output which satisfies the
post-conditions (including conditions on the relation of output to
input).

This situation has the advantage that even when we do not know how to
prove the correctness of the transformation in general, we can still
check the correctness of individual applications of the transform.

Whether correctness can always be specified that way, I do not know --
one reason I asked the list.

> I came up with an approach for a bank years ago where I converted an
> XSLT into just the possible element branches it could generate. This
> then could be stitched in with other similar data from subsequent
> stages in the pipeline, so that the result was an abstract document
> (on a spreadsheet) that showed for each input what elements could
> appear in the final output.  This helped them feel on top of the
> pipeline, to reason about flaws and behaviour, without needing XSLT
> expertise.

Excellent.

Michael

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Follow-Ups:
- Re: [xml-dev] defining correctness for an XML transformation - how?
  - From: Roger L Costello <costello@mitre.org>

References:
- defining correctness for an XML transformation - how?
  - From: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Re: [xml-dev] defining correctness for an XML transformation - how?
  - From: Tony Graham <tgraham@antenna.co.jp>
- Re: [xml-dev] defining correctness for an XML transformation - how?
  - From: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Re: [xml-dev] defining correctness for an XML transformation - how?
  - From: Roger L Costello <costello@mitre.org>
- Re: [xml-dev] defining correctness for an XML transformation - how?
  - From: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Re: [xml-dev] defining correctness for an XML transformation - how?
  - From: Rick Jelliffe <rjelliffe@allette.com.au>

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]