Re: [xml-dev] xs:assert and Schematron

From: Rick Jelliffe <rjelliffe@allette.com.au>
To: Jesper Tverskov <jesper.tverskov@gmail.com>
Date: Tue, 16 Dec 2008 14:56:56 +1100
Jesper Tverskov wrote:
> I have tested xs:assert in the working draft of XML Schema 1.1 as
> implemented in Saxon.
>
> It surprises me that we now can do anything in XML Schema directly,
> that we can do with Schematron. Except for Schematron's user-defined
> error messages and a nice report format.
>   

I think xs:assert solves the kind of problem faced by people who find
XSD useful in the first place (which is reasonable). However, there is a
large world of people with documents and problems that don't fit that,
and Schematron often provides a good fit for them. (I don't see that XSD
can ever be made to fill Schematron's shoes, nor vice versa.)
> So this is my question. Is there any testing we can do with Schematron
> that we can't do in XML Schema 1.1 directly?
>   
It is not at all true that you can do anything with draft XML Schema 1.1
that you can do with Schematron. The text of the draft is so hard to
follow that I am loath to make any firm statements, but here is how it
seems to me:

1) XSD 1.1 draft allows implementers to provide only a subset of XPath 
for the assertion tests. For example,
no parent:: predicates, no addition, etc. Note that the draft says "It 
is a consequence of this construction that attempts to refer, in an 
assertion, to the siblings or ancestors of E, or to any part of the 
input document outside of E itself, will be unsuccessful."
1a) XSD 1.1 draft does not allow the document() function (it seems) so 
it cannot do external code list validation
or inter-document testing. "The available documents 
<http://www.w3.org/TR/2007/REC-xpath20-20070123/#dt-available-docs> is 
the empty set."
1b) The rules for the validity and PSVI state of the branch being
validated are typically complex: we now have a "partial post-schema
validation infoset".
1c) These restrictions also limit the ability to do complex integrity
constraint tests. (However, XSD has a separate set of declarations
that cover some good kinds of unique/keyref constraints, of course.)

2) XSD 1.1 draft only has asserts but has no idea of reports.

3) XSD 1.1 draft does not use XPaths to provide context. The assertions 
are bound to types.

4) XSD 1.1 draft does not support any equivalent notion of patterns. 
Grammars are the model for selection,
which are less powerful than paths.

5) XSD 1.1 draft does not support any idea of phases. So it cannot be 
used for progressive validation,
nor does it support inline declaration of variants.

6) XSD 1.1 draft does not support any idea of variables for intermediate
results. (The only variable is $value.)

7) XSD 1.1 draft does not support any idea of parameterized patterns.

8) XSD 1.1 draft does not support any idea of external parameters.

9) XSD 1.1 draft does not give any guidance for the use of annotations 
such as xs:documentation. Schematron
has a clear distinction between assertion text (a positive statement 
about what should be and why) and
diagnostic text (what was found, what could be done).

I would suppose there are others too.
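
To make some of the points above concrete, here is a minimal ISO
Schematron sketch touching on points 1a, 2, 3, 5, 6 and 9. It is only
an illustration: the document vocabulary (order, item, @quantity,
@unitPrice, @limit, @currency) and the external file currencies.xml
with its code elements are invented for the example and come from no
real schema.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            defaultPhase="quick">

  <!-- Point 5: phases select which patterns are active, so the same
       schema can be run progressively (quick check first, full later). -->
  <sch:phase id="quick">
    <sch:active pattern="structure"/>
  </sch:phase>
  <sch:phase id="full">
    <sch:active pattern="structure"/>
    <sch:active pattern="codes"/>
  </sch:phase>

  <sch:pattern id="structure">
    <!-- Point 3: the context is chosen by an XPath, not by a type. -->
    <sch:rule context="order/item">
      <!-- Point 6: a variable holding an intermediate result. -->
      <sch:let name="total" value="@quantity * @unitPrice"/>
      <!-- The test looks at the parent, outside the item itself. -->
      <sch:assert test="$total &lt;= ../@limit" diagnostics="d-total">
        The total cost of an item should not exceed the order limit.
      </sch:assert>
      <!-- Point 2: a report fires when its test is true. -->
      <sch:report test="@quantity = 0">
        This item has a quantity of zero.
      </sch:report>
    </sch:rule>
  </sch:pattern>

  <sch:pattern id="codes">
    <sch:rule context="order/item">
      <!-- Point 1a: code list validation against another document
           (hypothetical file name and structure). -->
      <sch:assert test="@currency = document('currencies.xml')//code">
        The currency of an item should be one of the agreed codes.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

  <!-- Point 9: diagnostic text (what was found) is kept apart from
       the assertion text (what should be). -->
  <sch:diagnostics>
    <sch:diagnostic id="d-total">
      The total found was
      <sch:value-of select="@quantity * @unitPrice"/>.
    </sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```

Run with the "quick" phase, only the structure pattern is checked;
selecting "full" adds the code list check as well.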
> what would then be the raisons d'être for Schematron except for a nice
> validation report format and as an alternative to cumbersome XML
> Schema in some situations?
>   

The same as before. XSD is a schema language that stabs itself in its 
own foot:
 
 1) It tries to replace DTDs, yet without an entity mechanism (even as
an external language) it cannot entirely replace them.

 2) It uses an XML syntax, yet it is so complex and in any case defined 
in terms of components that we end
up being in the same boat with the XML syntax as with the DTD syntax: 
systems can write it easily, but it has
proved itself impractical for writing casual XSD-reading applications 
without severely subsetting it.

 3) It has an idea that there should be no subsets, yet this just
results in circumlocutions like the data modeling guidelines document
from the W3C I referred to recently. And being so heavyweight, it makes
implementers unnecessarily conservative and stifles change and progress.

 4) It attempts to justify its girth by being some kind of universal
schema language, and yet it is incapable of representing many important
classes of documents.  And its complexity forces the use of XML-hiding
tools, which have the natural effect of creating an unskilled operator
class who have great difficulty in tracking down and resolving
compatibility problems when different parties use different tools.

 5) etc etc etc

There are many, many major classes of documents which XSD is utterly 
useless at. Look at SVG. Look at
the trend to use XML-in-ZIP instead of large unitary XML documents.  
Schematron is fine for these.

I welcome the addition of xs:assert.  But I think the whole basis of
schema languages that don't make the human data capture and human
message communication issues central is wrong-headed. XML's design is
largely based on taking the human factor very seriously, and it would be
great if XSD provided ways for xs:annotation/xs:appinfo to be used to
generate dynamic messages in different languages.  There is no reason
why XSD could not be upgraded to invert its priorities and put humans
first.

Now even if it did, it wouldn't obviate the need for Schematron (or
languages like it). Putting aside the human factors, if we judge a
schema language on how effectively it allows traceable constraints and
how successfully it prevents the introduction of spurious constraints,
then grammars in general have a problem. Take two consecutive elements:
in grammars as we have them today there is no direct place to document
why one element should follow another. The documentation must go on one
element or the other or on the parent. (Sometimes xs:group might be
used, at a pinch.)  In Schematron, you are not documenting elements but
patterns: you can be constraining links and relations between nodes just
as much as the nodes themselves.
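
As a small sketch of what I mean, here is a Schematron pattern whose
subject is the relation between two consecutive elements rather than
either element alone. The names article, title and abstract are made up
for the illustration, and so is the stated rationale.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron"
             id="title-then-abstract">
  <!-- The documentation belongs to the pattern, that is, to the
       relation itself, not to title, abstract, or their parent. -->
  <sch:p>Why the abstract comes straight after the title.</sch:p>
  <sch:rule context="article/title">
    <!-- True only when the next sibling element is an abstract. -->
    <sch:assert test="following-sibling::*[1][self::abstract]">
      A title should be followed immediately by an abstract, because
      readers scanning a listing see only these two elements.
    </sch:assert>
  </sch:rule>
</sch:pattern>
```

The reason for the ordering sits in the assertion text, next to the
constraint it explains, which is what keeps the constraint traceable.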

If your use of Schematron was to provide a few simple tests of immediate
elements or attributes on top of an XSD, such as simple co-occurrence
constraints, to get a binary result, then xs:assert is a good little
addition to XSD that can simplify your process and should have our
support.
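
For that simple case, a co-occurrence test with xs:assert might look
roughly like the sketch below. The element range and its attributes min
and max are invented for the example. The placement of xs:assert after
the attribute declarations follows my reading of XSD 1.1, and the exact
draft syntax may differ.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="range">
    <xs:complexType>
      <xs:attribute name="min" type="xs:integer"/>
      <xs:attribute name="max" type="xs:integer"/>
      <!-- Passes when either attribute is absent; when both are
           present, min must not be greater than max. The test sees
           only the range element and what is inside it. -->
      <xs:assert test="not(@min) or not(@max) or @min le @max"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The result is the binary valid or invalid outcome mentioned above:
there is no place here for a user-defined message or a report.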

However, that still leaves XSD as a language of little power, enormous 
complexity, interoperability problems,
no consideration of human factors, an unreadable standard, no support 
for progressive validation, no
support for workflows, no support for external code-lists, and which is 
based on a single-document
model of XML that is evaporating in the modern age of ODF, OOXML, SCORM, 
MARS and the other
XML-in-ZIP consumer formats.


Cheers
Rick
Follow-Ups:
- Re: [xml-dev] xs:assert and Schematron
  - From: "James Clark" <jjc@jclark.com>
References:
- xs:assert and Schematron
  - From: "Jesper Tverskov" <jesper.tverskov@gmail.com>