email@example.com (Elliotte Rusty Harold) writes:
>>So are you against any schema technology on principle? If not, why
>>not? How exactly does the thing described by a schema differ from
>>a data model?
>That's a non-sequitur. Why would you think Simon is opposed to any
>schema technology on principle just because he's opposed to sharing
>data models? Allow me to rephrase:
>Simon: It's a bad idea for school children to share combs. It spreads
>Tyler: So are you opposed to all hair care products in principle?
Thanks - I think Elliotte's rephrasing is strikingly accurate, provided
we ignore that apparently public health people are re-examining how lice
Looking slightly beyond that, however, I think the syntax/model issue
has a lot to do with how I evaluate different schema technologies. (This
may also help answer Tyler's followup, "How exactly does the thing
described by a schema differ from a data model?")
W3C XML Schema feels to me like an effort to create a new data model
description language that happens to be connected to XML syntax. The
emphasis on types throughout, the notion that an XML tree could/should
be annotated with additional information about what the 'real' type is,
and type machinery that means adding an attribute moves a declaration
from a simple to a complex type are all strong signals that data models
are the focus here. While W3C XML Schema has lots of problems, I
suspect this choice was the founding catastrophe for the rest of the
RELAX NG does operate on a data model to some degree, but it doesn't
feel obsessed with creating data models. RELAX NG feels to me like it
is exclusively about creating patterns which can be tested against XML
documents. Does a document fit this pattern, or not? Where does it
fail? To me, that's much more coherent with the notion that XML
documents contain structure expressed syntactically. The math is
crucial, but doesn't interfere - much the way that the math behind
regular expressions doesn't abstract them away from their work of
matching patterns in text.
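To make the regular-expression comparison concrete, here is a minimal sketch in Python. It is not RELAX NG and uses no RELAX NG library. It only illustrates the idea of testing a document against a pattern, using a hypothetical rule ("a title, then one or more para elements") and hypothetical names (CONTENT_PATTERN, fits_pattern). It answers "does it fit?" but does not locate the point of failure, which a real validator would do.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content pattern (an assumption for illustration):
# one title element followed by one or more para elements.
CONTENT_PATTERN = re.compile(r"(title )(para )+")


def fits_pattern(xml_text):
    """Return (ok, detail) for the child elements of the root."""
    root = ET.fromstring(xml_text)
    # Flatten the child element names into a string the regex can test.
    names = "".join(child.tag + " " for child in root)
    if CONTENT_PATTERN.fullmatch(names):
        return True, "children match: " + names.strip()
    return False, "children do not match: " + names.strip()


if __name__ == "__main__":
    print(fits_pattern("<doc><title/><para/><para/></doc>"))
    print(fits_pattern("<doc><para/><title/></doc>"))
```

The document is only ever asked whether it fits. No types are assigned and nothing is added to the tree.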
Schematron, similarly, operates on the XPath data model, and does some
very nifty things with the operators XPath provides, but it doesn't
create new models. I wish Schematron were a bit more syntactic - it
would be very nice to have Schematron schemas issue warnings and reports
about things like entity usage - but I also recognize where it came
from, and think it's a pretty coherent tool for analyzing structures
expressed through syntax.
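As a rough illustration of that rule-and-assertion style, the sketch below mimics it in Python with the limited XPath subset in xml.etree.ElementTree. It is not Schematron. The rules, element names (item, name) and the check function are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (context path, failure message, test on each context element).
RULES = [
    (".//item", "item must have an id attribute",
     lambda e: "id" in e.attrib),
    (".//item", "item must contain a name child",
     lambda e: e.find("name") is not None),
]


def check(xml_text):
    """Return the failure messages for every rule that does not hold."""
    root = ET.fromstring(xml_text)
    failures = []
    for path, message, test in RULES:
        # Apply the test to every element selected by the context path.
        for element in root.findall(path):
            if not test(element):
                failures.append(message)
    return failures


if __name__ == "__main__":
    print(check('<list><item id="a"><name>x</name></item></list>'))
    print(check('<list><item><name>x</name></item><item id="b"/></list>'))
```

Each rule reads like a small unit test against the document, and no model is built beyond the parsed tree.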
I guess it's fair to say that to me:
* The things described by W3C XML Schemas are in fact data models and
only tangentially XML documents. That's as good a cause as any for
discarding WXS and recommending that people avoid it.
* The things described by RELAX NG schemas are XML documents or
simulacra thereof, and the combination of that with a sane mathematical
foundation and a convenient compact syntax is as good a cause as any for
using RELAX NG to describe XML documents.
* Schematron can probably be applied to nearly any kind of data, and its
similarity to unit testing had appeal at a presentation I gave on
Tuesday night, but it happens to be a great way to test for certain
conditions in XML documents without creating much data model overhead.
So yes, some of these things create or use data models intended to be
shared. The less data model involved, the less cumbersome the solution,
at least in my experience. The deeper the shared data model, the
stronger the poison.
 - Yes, it's possible to use only anonymous types, per Tim Ewald.
While I applaud his work, it really goes against the grain of WXS.
There are also features in WXS which feel like they apply more directly
to XML documents in a RELAX NG or Schematron way, notably keys and maybe
substitution groups, but in the WXS context they seem mostly to muddy
the waters further.