xml-dev - XML Schema considered harmful?

To: <xml-dev@lists.xml.org>
Subject: XML Schema considered harmful?
From: "Michael Leditschke" <mike@ammd.com.au>
Date: Wed, 5 Jun 2002 14:32:53 +1000
Importance: Normal
In-reply-to: <3CFD1523.8000703@textuality.com>

> Also, it's not often that James opens the ports, rolls out all the
> cannons, and lets go with a full broadside, but you don't want to miss
> it when it happens:
>
>   http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html
>

James states a number of reasons to prefer RELAX NG. I was somewhat
hesitant to write given his standing, but in the areas of XML Schema
use on which he touches, my experience leads me to different conclusions.


From point 2:

"I often hear people say: "It doesn't really matter that the spec W3C
XML Schema Rec is so hard to understand; only W3C XML Schema
implementors need to do this". I think this is misguided.  People who
want to be sure they have understood exactly what a particular W3C XML
schema means also have to understand the W3C XML Schema Rec."


On the complexity of the specs: yes, XML Schema Part 1 makes difficult
reading, but the Primer, Part 0, is quite readable and, to the authors'
credit, was updated with each release of the spec. It covers the ground,
and I have only occasionally had to refer to Part 1, despite designing
schemas that use a large percentage of the supported constructs.

For the people James alludes to, who don't want to write validators but
want to know what's valid, section 6 of the RELAX NG spec is, to me,
pretty daunting. It may be a more solid theoretical foundation, but is it
a lot more readable to the audience he is targeting?



From point 4:

"There is no way to say [in XML Schemas]
that either attribute X or attribute Y is allowed or that either
attribute X or element Y is allowed.  In my experience, this sort of
constraint is extremely common in XML grammars."


The level of co-constraint checking offered by RELAX NG is an improvement
on its absence in XML Schema, but is this more than leap-frogging? I would
also add that the situations where I needed co-constraints were often
content based, e.g. if element X contains "5", attribute Y should be
present. I may have missed it, but the support in RELAX NG seems, by the
nature of RELAX NG, purely structural. I assume I will need to add
Schematron to the mix, which is the same situation as with XML Schema
currently.
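
As an illustration only, the content-based rule mentioned above could be
sketched in Python with the standard library. This is not how RELAX NG,
XML Schema or Schematron would express it; the wrapper element "record"
and the choice to look for attribute Y on X itself are assumptions made
for the example.

```python
import xml.etree.ElementTree as ET


def check_co_constraint(xml_text):
    """Return a list of error messages; an empty list means the rule holds."""
    root = ET.fromstring(xml_text)
    errors = []
    # Assumption: Y is an attribute of X itself. iter() also visits the root.
    for x in root.iter("X"):
        if (x.text or "").strip() == "5" and "Y" not in x.attrib:
            errors.append("X contains '5' but attribute Y is missing")
    return errors


# The rule depends on the *content* of X, not only on document structure.
print(check_co_constraint('<record><X Y="a">5</X></record>'))
print(check_co_constraint('<record><X>5</X></record>'))
```

The check has to read the text of X before it can decide whether Y is
required, which is what a purely structural grammar cannot express.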



From point 7:

"In W3C XML Schema there is no way to specify what is allowed as the
root element."


I've probably completely missed the point here, but doesn't an XML Schema
that has only one global element achieve the above? Maybe it's a matter of
semantics, but that's how it has panned out in practice for me thus far.

Where more than one element is valid as the root of a document, I have
declared the corresponding global elements.
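
A rough sketch of the practice described above, in Python with the
standard library: it treats the global element declarations of a schema
as the set of permitted roots. The schema, the element names "order" and
"item", and both helper functions are invented for the example, and it
assumes a schema with no targetNamespace. It is not a validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def allowed_roots(schema_text):
    """Names of the global (top-level) element declarations in a schema."""
    schema = ET.fromstring(schema_text)
    # A bare tag in findall() matches direct children only, so locally
    # declared elements such as "item" are not picked up.
    return {el.get("name") for el in schema.findall(XS + "element")}


def root_is_allowed(schema_text, instance_text):
    # Assumption: no targetNamespace, so comparing local names is enough.
    return ET.fromstring(instance_text).tag in allowed_roots(schema_text)


print(allowed_roots(SCHEMA))
print(root_is_allowed(SCHEMA, "<order><item>a</item></order>"))
print(root_is_allowed(SCHEMA, "<item>a</item>"))
```

With a single global declaration the set of roots has one member; each
additional global element widens it.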




From point 8:

"There is no way in a W3C XML Schema to prohibit the instance from
containing xsi:schemaLocation attributes.  Indeed, this is also the
case for other xsi attributes: there is no way to prevent the document
containing xsi:type attributes.  The use of W3C XML Schema infects the
grammar you are defining. If you want a closed grammar that only
allows specific attributes not including the xsi attributes, you
cannot express that in W3C XML Schema.  RELAX NG has no such magic
attributes."


The language used suggests that the xsi:type mechanism of XML Schema is
a bad thing. I don't see it that way. xsi:type can provide a modular way
to validate alternate structures carried by a common container without
recourse to ungainly choice expressions. This is particularly important
if control of the container schema lies with a different party from
those defining new content models. This was the situation in a project
I recently worked on.

In my reading of the RELAX NG specification and tutorial, I didn't see
anything that offered an equivalent to this. I'd be interested in
seeing what RELAX NG can offer to allow a comparison.
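
To show the kind of modularity meant here, the following Python sketch
(standard library only) mimics the idea by analogy: the container code
stays fixed and each content owner registers a check under the type name
carried in xsi:type. It is not what an XML Schema processor does
internally. The element names, the type names "InvoiceType" and
"ReceiptType", the CHECKERS registry and the use of unprefixed type names
are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical registry: each content owner adds a checker for its own type
# without touching the container logic below.
CHECKERS = {
    "InvoiceType": lambda el: el.find("total") is not None,
    "ReceiptType": lambda el: el.find("paid") is not None,
}

DOC = """<envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <payload xsi:type="InvoiceType"><total>10</total></payload>
  <payload xsi:type="ReceiptType"><total>10</total></payload>
</envelope>"""


def check_container(xml_text):
    """Check each child of the container with the checker its xsi:type names."""
    results = []
    for child in ET.fromstring(xml_text):
        # Assumption: type names are unprefixed, so no QName resolution.
        checker = CHECKERS.get(child.get(XSI_TYPE))
        results.append(checker is not None and checker(child))
    return results


print(check_container(DOC))
```

The second payload fails because it claims ReceiptType but carries
invoice content; the container never needed a choice over all known
content models.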



And again from point 8:

"This use of schemas is easily undermined by documents that use
xsi:schemaLocation.

Another reason is that this leads to interoperability problems.  Its
use is not mandated by the XML Rec: it's just a hint.  Yet, in some
implementations, this is the only way to specify the schema to use to
validate the document."


This is a two-sided coin. Because it is only a hint, the attribute can be
used where the situation lends itself and ignored where it does not. I
don't see a large difference between RELAX NG demanding that the
validator be given a schema and an instance, and XML Schema allowing, as
an option, the validator to use the schemaLocation attribute. The
processors I have used that initially had the constraint mentioned have
removed it in more recent versions.
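
As a sketch of treating the attribute as a hint, the Python below
(standard library only) reads xsi:schemaLocation from the root element
and lets a schema supplied by the caller take precedence. The function
names, the "explicit" parameter and the precedence policy are
assumptions for the example, not the behaviour of any particular
processor; it also looks only at the root element.

```python
import xml.etree.ElementTree as ET

XSI_LOC = "{http://www.w3.org/2001/XMLSchema-instance}schemaLocation"

DOC = (
    '<a:doc xmlns:a="urn:example:a" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="urn:example:a a.xsd"/>'
)


def schema_hints(xml_text):
    """Map namespace URI -> schema location from the root's xsi:schemaLocation."""
    tokens = ET.fromstring(xml_text).get(XSI_LOC, "").split()
    # The value is a whitespace-separated list of namespace/location pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


def choose_schema(xml_text, namespace, explicit=None):
    # Hypothetical policy: a schema named by the caller always wins; the
    # hint in the document is consulted only when none was given.
    return explicit or schema_hints(xml_text).get(namespace)


print(choose_schema(DOC, "urn:example:a"))
print(choose_schema(DOC, "urn:example:a", explicit="trusted.xsd"))
```

Under this policy a document cannot redirect validation away from a
schema the receiver has chosen, while the hint remains usable when the
receiver has no preference.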




Don't get me wrong - I don't receive regular brown paper envelopes with
W3C in the return address, and I'm not saying XML Schema hasn't got warts,
but it's there and supported and, to me, it's not the **HUGE** conceptual
and learning leap it seems to be painted as in this newsgroup. It achieved
my 80% and got the project in on time. In the process a number of other
organisations had to climb the same learning curve and got there.

James is emphatic, and that is only natural, but his arguments paint
issues as black and white (XML Schema = bad, RELAX NG = good) and my
experience with XML Schema suggests shades of grey.

To my mind, the bigger issue to decide is how many schema languages
the IETF want appearing in RFCs. Simply allowing both means that RFC
readers have to learn both. And since RELAX NG focusses on structure,
what will be used to express content-based co-constraints? Perhaps it
would be better to be arguing for DSDL.


Regards
Michael

References:
- Interesting mailing list & a rare broadside
  - From: Tim Bray <tbray@textuality.com>
