Re: [xml-dev] xs:assert and Schematron
- From: Rick Jelliffe <rjelliffe@allette.com.au>
- To: Jesper Tverskov <jesper.tverskov@gmail.com>
- Date: Tue, 16 Dec 2008 14:56:56 +1100
Jesper Tverskov wrote:
> I have tested xs:assert in the working draft of XML Schema 1.1 as
> implemented in Saxon.
>
> It surprises me that we now can do anything in XML Schema directly,
> that we can do with Schematron. Except for Schematron's user-defined
> error messages and a nice report format.
>
I think xs:assert solves the kind of problem faced by people who
find XSD useful
in the first place (which is reasonable), however there is a large world
of people with
documents and problems that don't fit that, and Schematron often
provides a good fit for them.
(I don't see that XSD can ever be made to fill Schematron's shoes, nor
vice versa.)
> So this is my question. Is there any testing we can do with Schematron
> that we can't do in XML Schema 1.1 directly?
>
It is not at all true that you can do anything with draft XML Schema 1.1
that you can do with
Schematron. The text is so hard to read that I am loath to make any
firm statements, but here is how it seems to me:
1) XSD 1.1 draft allows implementers to only provide a subset of XPath
for the assertion tests. For example,
no parent:: predicates, no addition, etc. Note that the draft says "It
is a consequence of this construction that attempts to refer, in an
assertion, to the siblings or ancestors of E, or to any part of the
input document outside of E itself, will be unsuccessful."
1a) XSD 1.1 draft does not allow the document() function (it seems) so
it cannot do external code list validation
or inter-document testing. "The available documents
<http://www.w3.org/TR/2007/REC-xpath20-20070123/#dt-available-docs> is
the empty set."
1b) The rules for the validity and PSVI state of the branch being
validated are typically complex: we now have a "partial post-schema
validation infoset".
1c) This also impacts the ability to do complex integrity constraint
tests. (However, XSD has a separate set of declarations
that cover some good kinds of unique/keyref constraints, of course.)
2) XSD 1.1 draft only has asserts but has no idea of reports.
3) XSD 1.1 draft does not use XPaths to provide context. The assertions
are bound to types.
4) XSD 1.1 draft does not support any equivalent notion of patterns.
Grammars are the model for selection,
which are less powerful than paths.
5) XSD 1.1 draft does not support any idea of phases. So it cannot be
used for progressive validation,
nor does it support inline declaration of variants.
6) XSD 1.1 draft does not support any idea of variables for intermediate
results. (The only variable is $value.)
7) XSD 1.1 draft does not support any idea of parameterized patterns.
8) XSD 1.1 draft does not support any idea of external parameters.
9) XSD 1.1 draft does not give any guidance for the use of annotations
such as xs:documentation. Schematron
has a clear distinction between assertion text (a positive statement
about what should be and why) and
diagnostic text (what was found, what could be done).
I would suppose there are others too.
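To make some of the items above concrete (1a, 2, 3, 5, 6, 7 and 9), here
is a minimal, untested sketch in ISO Schematron. The element and
attribute names (order, item, qty, price, status, currency) and the
codes.xml file are invented for illustration only, and the reason given
in the assertion text is made up.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:title>Hypothetical order checks</sch:title>

  <!-- (6) A schema-level variable holding an intermediate result.
       (1a) document() reads an external code list; codes.xml is assumed. -->
  <sch:let name="codes" value="document('codes.xml')//code"/>

  <!-- (5) Phases: run the cheap checks first, everything later. -->
  <sch:phase id="draft">
    <sch:active pattern="structure"/>
  </sch:phase>
  <sch:phase id="final">
    <sch:active pattern="structure"/>
    <sch:active pattern="order-codes"/>
  </sch:phase>

  <!-- (3) The rule context is an XPath, not a type. -->
  <sch:pattern id="structure">
    <sch:rule context="order/item">
      <sch:assert test="@qty * @price &gt;= 0" diagnostics="d-total">
        An item total should not be negative, because refunds are
        recorded separately.
      </sch:assert>
      <!-- (2) A report fires when its test is true. -->
      <sch:report test="parent::order/@status = 'closed'">
        This item belongs to a closed order.
      </sch:report>
    </sch:rule>
  </sch:pattern>

  <!-- (7) An abstract pattern, instantiated below with parameters. -->
  <sch:pattern abstract="true" id="coded">
    <sch:rule context="$holder">
      <sch:assert test="$field = $codes">
        The code should appear in the external code list.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:pattern is-a="coded" id="order-codes">
    <sch:param name="holder" value="order"/>
    <sch:param name="field" value="@currency"/>
  </sch:pattern>

  <!-- (9) Diagnostic text (what was found) is kept apart from the
       assertion text (what should be, and why). -->
  <sch:diagnostics>
    <sch:diagnostic id="d-total">
      Found a total of <sch:value-of select="@qty * @price"/>.
    </sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```

Running only the "draft" phase would skip the code-list lookup, which is
the kind of progressive validation meant in item 5.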
> what would then be the raisons d'être for Schematron except for a nice
> validation report format and as an alternative to cumbersome XML
> Schema in some situations?
>
The same as before. XSD is a schema language that stabs itself in its
own foot:
1) It tries to replace DTDs, yet without an entity mechanism (even as
an external language) it cannot entirely
replace them.
2) It uses an XML syntax, yet it is so complex and in any case defined
in terms of components that we end
up being in the same boat with the XML syntax as with the DTD syntax:
systems can write it easily, but it has
proved itself impractical for writing casual XSD-reading applications
without severely subsetting it.
3) It has an idea that there should be no subsets, yet this just
results in circumlocutions like the data modeling
guidelines document from the W3C I referred to recently. And being so
heavyweight, it makes implementers
unnecessarily conservative and stifles change and progress.
4) It attempts to justify its girth by being some kind of universal schema
language, and yet it is incapable of
representing many important classes of documents. And its complexity
forces the use of XML-hiding tools,
which have the natural effect of creating an unskilled operator class
who have great difficulty in tracking
down and resolving compatibility problems when different parties use
different tools.
5) etc etc etc
There are many, many major classes of documents which XSD is utterly
useless at. Look at SVG. Look at
the trend to use XML-in-ZIP instead of large unitary XML documents.
Schematron is fine for these.
I welcome the addition of xs:assert. But I think the whole basis of
schema languages that don't make the human
data capture and human message communication issues central to be
wrong-headed: XML's design
is largely based on taking the human factor very seriously and it would
be great if XSD provided ways
where the xs:annotation/xs:appinfo could be used to generate dynamic
messages in different languages. There
is no reason why XSD could not be upgraded to invert it and put humans
first.
Now even if it did, it wouldn't obviate the need for Schematron (or
languages like it.) Putting aside the
human factors, if we judge a schema language on how effectively it
allows traceable constraints and how
successfully it prevents the introduction of spurious constraints, then
grammars in general have a problem.
Take two consecutive elements: in grammars as we have today there is no
direct place to document why
one element should follow another: the documentation must go on one
element or the other or on the parent.
(Sometimes xs:group might be used, at a pinch.) In Schematron, you are
not documenting elements but
patterns: you can be constraining links and relations between nodes just
as much as the nodes themselves.
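As a rough illustration of documenting a relation rather than an
element, a Schematron pattern for two consecutive elements might look
like the sketch below. The element names (title, abstract) and the
stated rationale are hypothetical.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron"
             id="title-then-abstract">
  <sch:title>Title is followed by abstract</sch:title>
  <!-- The documentation belongs to the pattern, i.e. to the relation
       between the two elements, not to either element alone. -->
  <sch:p>The abstract comes straight after the title so that readers
    skimming a listing see both together (an invented rationale).</sch:p>
  <sch:rule context="title">
    <sch:assert test="following-sibling::*[1][self::abstract]">
      A title should be followed immediately by an abstract.
    </sch:assert>
  </sch:rule>
</sch:pattern>
```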
If your use of Schematron was to provide a few simple tests of immediate
elements or attributes on top
of an XSD, such as simple co-occurrence constraints, to get a binary
result, then xs:assert is a good
little addition to XSD that can simplify your process and should have
our support.
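For that simple case, a co-occurrence constraint with xs:assert could be
sketched roughly as follows. This is untested and assumes the XSD 1.1
syntax as I understand it; the element and attribute names (shipment,
address, method) are made up for the example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shipment">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="address" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="method" type="xs:string"/>
      <!-- Co-occurrence: if method is 'post', an address child must be
           present. The test only looks inside the shipment element. -->
      <xs:assert test="not(@method = 'post') or address"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The assertion is bound to the type, it yields only a pass or fail, and
it carries no message text of its own.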
However, that still leaves XSD as a language of little power, enormous
complexity, interoperability problems,
no consideration of human factors, an unreadable standard, no support
for progressive validation, no
support for workflows, no support for external code-lists, and which is
based on a single-document
model of XML that is evaporating in the modern age of ODF, OOXML, SCORM,
MARS and the other
XML-in-ZIP consumer formats.
Cheers
Rick