Re: [xml-dev] Victory has been declared in the schema wars ...
- From: Rick Jelliffe <rjelliffe@allette.com.au>
- To: mc@xegesis.org
- Date: Wed, 29 Nov 2006 16:34:47 +1100
Michael Champion wrote:
>
> Speaking of XSD 1.1 and Schematron, what do others think about their
> approach of defining their own constraint language based on Schematron
> concepts rather than taking an external dependency on Schematron?
Well, I think the idea of an XPath-based constraint language in general
is a no-brainer.
People who won't adopt a standard because it comes from ISO (or from W3C,
for that matter) have disappeared up their own arseholes; they are too far
from rational argument to worry about. But do such toroidal people actually
exist? Let's assume they don't :-)
People who want to have an XPath-based constraint language that has
significantly different semantics or operations from Schematron certainly
shouldn't use Schematron.
For example, the W3C XSD WG did the right thing by not adopting Schematron
(indeed, this was a point I made to them), since one of the essences of
Schematron is the natural-language assertion. The grammar-based schema
languages all have the fundamental problem that they have no mechanism for
effectively communicating to humans diagnostics expressed in terms of the
problem domain and data graph: they can only give generic messages in terms
of grammar theory, the XML tree and the specific element names. One
consequence is that as soon as the XML is hidden by some interface, the
canned validation messages (which are given in terms of the XML and grammar)
become incomprehensible.
"Hiding the XML" often has the unintended consequence of making validation
messages incomprehensible too. People then cope by remapping diagnostics or,
often, by re-inventing the validation wheel in the user interface code.
Extra work, double handling.
Schematron is really the only standard schema language that has actually
made these issues its core: how do you go from analyst-specified
bullet-point specs of the rules to executable code, and how can that
executable code generate information expressed in domain terms rather than
markup terms, with dynamic content, icons, etc., suitable for display in a
user interface yet user-interface neutral?
I regularly see re-inventions of Schematron. I saw another new one just
yesterday. The only one I have seen that was technically superior in some
aspect was XCSL (XML Constraint Specification Language) from Portugal, so
ISO Schematron adopted their <let> variable (with their blessing), which is
of course trivially implementable in XSLT.
People inevitably miss out on the two-part context/test split that
Schematron has. The split allows grouping of constraints (did anyone say
"type"?) and potential implementation improvements, makes XPaths easier to
understand, and removes the need for a for-each construct (in XPath 1 at
least) for all real cases I have seen.
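To make the context/test split concrete, here is a minimal sketch in Python.
It is not Schematron and not any real validator's API: the Rule and Assertion
classes, the validate function and the sample order document are all
hypothetical, and the tests are plain Python functions rather than XPath
expressions, because the standard library's ElementTree supports only a
small XPath subset. The point is only the shape: the context picks the
nodes once, each test is then written relative to that node, and each
failure carries a message in domain terms.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Assertion:
    test: Callable[[ET.Element], bool]  # must hold for each context node
    message: str                        # worded for the user, not the markup


@dataclass
class Rule:
    context: str                        # ElementTree path selecting the nodes
    assertions: List[Assertion]         # constraints grouped under one context


def has_items(order: ET.Element) -> bool:
    return len(order.findall("item")) > 0


def total_matches(order: ET.Element) -> bool:
    prices = sum(int(i.get("price")) for i in order.findall("item"))
    return prices == int(order.get("total"))


def validate(root: ET.Element, rules: List[Rule]) -> List[str]:
    """Report every failed assertion; do not stop at the first one."""
    failures = []
    for rule in rules:
        for node in root.findall(rule.context):
            for assertion in rule.assertions:
                if not assertion.test(node):
                    failures.append(assertion.message.format(id=node.get("id", "?")))
    return failures


# Hypothetical sample data, invented for this illustration.
DOC = """
<orders>
  <order id="A1" total="30"><item price="10"/><item price="20"/></order>
  <order id="A2" total="50"><item price="10"/></order>
  <order id="A3" total="0"/>
</orders>
"""

RULES = [
    Rule(
        context=".//order",
        assertions=[
            Assertion(has_items, "Order {id} has no items."),
            Assertion(total_matches, "The total on order {id} does not match its items."),
        ],
    )
]

failures = validate(ET.fromstring(DOC.strip()), RULES)
for line in failures:
    print(line)
```

Because the context selects every order up front, neither test needs a loop
over orders of its own, which is the for-each that the split makes
unnecessary.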
Another thing is phases. One issue with path-based constraint languages is
that you can easily end up with a storm of information, because in effect
validation happens in parallel: it doesn't stop at the first error. So
Schematron allows grouping of patterns into phases, to allow progressive
validation: let's validate all tables only, or let's validate that metadata
exists as the last step even though it comes first in the document, or
let's check for typos in namespaces first before we try to validate the
elements, or let's now download the XML data retrieved from a link in the
document from a DBMS URL.
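A rough sketch of the phase idea, again in Python and again hypothetical
throughout (the PATTERNS and PHASES tables, the run_phase function, the
pattern names and the sample document are invented for illustration and are
not Schematron syntax): patterns are named groups of checks, and a phase is
just a list of the patterns that are active for one validation pass.

```python
import xml.etree.ElementTree as ET

# Each pattern is a named list of (context path, test, message) triples.
PATTERNS = {
    "tables": [
        (".//table", lambda t: len(t.findall("row")) > 0,
         "A table has no rows."),
    ],
    "metadata": [
        (".", lambda d: d.find("meta") is not None,
         "The document has no metadata section."),
    ],
}

# A phase names the patterns to run, in the order they should be reported.
PHASES = {
    "tables-only": ["tables"],
    "final": ["tables", "metadata"],  # metadata checked last, though it comes first
}


def run_phase(root, phase):
    """Run only the patterns that are active in the chosen phase."""
    messages = []
    for name in PHASES[phase]:
        for context, test, message in PATTERNS[name]:
            for node in root.findall(context):
                if not test(node):
                    messages.append(message)
    return messages


# Hypothetical document: one good table, one empty table, no metadata.
root = ET.fromstring("<doc><table><row/></table><table/></doc>")

print(run_phase(root, "tables-only"))
print(run_phase(root, "final"))
```

Running the narrow phase first keeps the report short; the wider phase can
wait until the earlier problems are fixed.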
There is one XPath-based constraint language that goes beyond Schematron:
XLinkIt, a commercial product that did this by using a much more advanced
logic. I worked on expert systems in the early 90s, and I appreciate that
higher-order logics can be used to build really powerful systems, but I
also appreciate that for sheer implementability and problem-solving, simple
if-then style predicate logic covers so many bases that it is hard to
justify not erring on the side of simplicity.
So I am happy to defend the design of Schematron: I think it (largely under
the influence of its users and developers) has matured into a design and
standard that has not been bettered by any of the dozens of imitators. And,
more than that, I think it addresses incredibly important issues
(diagnostics, phases, etc.) that expose the other schema languages and
designs as being obese or underfed toys designed without consideration for
the central position of humans in the chain.
Here's my quick test. I was one of the ones who pushed for XSD to have
determinate outcomes: for example, the well-enumerated list of errors. It is
a good thing. However, anyone who thinks that these kinds of messages are
remotely suitable for end-users, especially after being mediated through a
user interface, is fooling themselves.
So that is one reason why I think the RELAX NG versus XSD debate is largely
flummery. Of course XSD should be refactored into a RELAX NG-equivalent core
and a type-annotating outside layer: the RELAX NG people are correct in
saying that grammar-based schema languages can be refactored without
removing any capability (or necessarily changing syntax). But just adding
XPath assertions to XSD (or RELAX NG), though good and better for modelling,
misses the fundamental diagnostic inadequacy of current schema languages.
Adopting RELAX NG will help people with problems relating to XSD's lack of
power in several areas. XSD 1.1 does indeed take a few steps towards RELAX
NG, but still not nearly enough; indeed, it takes some steps back. For
example, it is crazy for XSD 1.1 to just add weeny, weedy and weakie (veni,
vidi, vici) XPath subset constraints instead of allowing attributes in
content models as RELAX NG does: it is a hack, designed to be grafted onto
existing products with minimum pain.
But adopting RELAX NG won't alter the fundamental diagnostics issue. Nor,
necessarily, will taking on your own whizzbang home-made XPath-based
constraint language, because engineers typically get caught up with the
issue of how to add XPaths to types, rather than the issue of how to make it
easy and direct to express constraints and get diagnostics.
The silicon- or character-focused engineers are the problem, not the
solution. It's not datahead versus dochead that drives Schematron: it's the
user experience, usability, user-friendliness, user-centricism,
interface-ability (is that a word?), non-technical user control,
minimization of concepts, consolidation of skills (lay people can more
easily learn paths than grammars, let alone UPA).
If you're looking at schema languages, I think user-friendly diagnostics is
the big-picture issue that provides a different way to judge both XSD and
RELAX NG. The other nice thing about Schematron is that it can be marketed
as solving a different set of problems than "schema" languages; XSD can fall
back to being a niche technology for WS-* and data-binding, hidden from
users and integrators.
Cheers
Rick Jelliffe