
   Re: [xml-dev] XML Schema considered harmful?


From: "Michael Leditschke" <mike@ammd.com.au>


> In terms of the complexities of specs, yes XML Schema Part 1 makes
> difficult reading but the Primer, Part 0, is quite readable and, to
> their credit, was updated with each release of the spec. It covers the
> ground and I have only occasionally had to refer to Part 1, despite
> designing schemas using a large percentage of the supported constructs.

I had the experience of being *very* familiar with the XML Schema specs,
then going away for a few months.  When I returned, I found them quite
difficult to fathom.  There have been several times when I have not been
able to answer questions from users of our validator and have had to rely
on another Schema expert here.

The issue is not whether it is possible to become a full-time expert in XML Schemas;
the issue is how much protocol designers should be required to cope with,
and whether IETF should support plurality or be exclusive.

IETF has so far been built on making layers to support plurality, allowing
protocols to thrive on their own.  XML Schemas is monolithic and badly
architected, and because of this it will be very difficult to upgrade the
bits that are incomplete (keys and datatypes).
  
> I may have missed it, but the support in RELAX NG seems, by the nature of
> RELAX NG, purely structural. I assume I will need to add Schematron to the
> mix, which is the same situation as with XML Schema currently. 

Thanks for the plug! However there is (at least) one significant difference:
Schematron has not been designed with streaming implementations in mind
(and I am not aware of any streaming implementations):  a schema language 
that requires a DOM be built is not suitable for high-speed transaction validation 
over the Web, which is what we are talking about.  
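
To make the streaming/DOM distinction concrete, here is a rough sketch in
Python. It is not Schematron and not any real validator: the rule ("every
item element carries an id attribute"), the sample document and all the
names are made up for illustration. The first check sees one start tag at
a time and keeps only a counter; the second has to build the whole tree
before it can ask anything.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample message; the rule below is invented for illustration.
DOC = b"<order><item id='a'/><item/><item id='c'/></order>"


class MissingIdHandler(xml.sax.ContentHandler):
    """Streaming check: handles one start tag at a time, keeps only a counter."""

    def __init__(self):
        super().__init__()
        self.missing = 0

    def startElement(self, name, attrs):
        # Count <item> elements that lack an id attribute.
        if name == "item" and "id" not in attrs:
            self.missing += 1


def count_missing_streaming(data: bytes) -> int:
    """Apply the rule while parsing; memory use does not grow with the document."""
    handler = MissingIdHandler()
    xml.sax.parseString(data, handler)
    return handler.missing


def count_missing_tree(data: bytes) -> int:
    """Apply the same rule after building the whole document in memory."""
    doc = minidom.parseString(data)
    items = doc.getElementsByTagName("item")
    return sum(1 for el in items if not el.hasAttribute("id"))
```

Both functions give the same answer for this rule. The difference is that
the second one needs the complete document in memory first, which is the
cost that matters when validating a high volume of messages.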

Now, I am aware of people who have used Schematron for testing incoming
pages and generating custom pages to return to the user to ask them
for missing or incorrect information. But that is a different area. 

> I've probably completely missed the point here, but doesn't an XML Schema
> that only has one global element achieve the above? Maybe its a matter of
> semantics but that's how it's panned out in practice for me thus far.

But then you cannot use substitution groups:  this is the kind of complexity
that James is talking about, I think--the complexity where using one
feature makes another disappear arbitrarily.
  
> Don't get me wrong - I don't receive regular brown paper envelopes with
> W3C in the return address, and I'm not saying XML Schema hasn't got warts,
> but its there and supported and to me, its not the **HUGE** conceptual
> and learning leap it seems to be painted as in this newsgroup. It achieved
> my 80% and got the project in on time. In the process a number of other
> organisations had to climb the same learning curve and got there.

Were these projects IETF protocols? And are you an XML or schema
expert or, as we can expect IETF people to be, are you a non-expert
who is only using XML because it will be more convenient than rolling
your own syntax?  If I were developing a protocol, I would
take some convincing that XML Schemas was not overkill for my
requirements.

> James is emphatic, and that is only natural, but his arguments paint
> issues as black and white (XML Schema = bad, RELAX NG = good) and my
> experience with XML Schema suggests shades of grey.

But it is not James who is being black and white: it is the draft RFC wanting to
ban the use of RELAX NG! (And Schematron or the DSDL effort, for that matter!)
 
> To my mind, the bigger issue to decide is how many schema langauges
> the IETF want appearing in RFCs. Simply allowing both means that RFC
> readers have to learn both. And since RELAX NG focusses on structure,
> what will be used to express content based co-constraints? Perhaps it
> would be better to be arguing for DSDL.

DSDL is an ISO standard in several parts, and I think the ISO WG involved
is very keen not to repeat the mistakes of XML Schemas w.r.t. premature
standardization.  So the technologies that are mature (now RELAX NG,
shortly Schematron) are being standardized.

In any case, it seems that many people who are cowed by XML Schemas 
actually write their Schemas as DTDs then convert them using an automated
tool.  I used James' dtdinst program last night for the first time 
(to convert the EAD DTD into RELAX NG) and I found it was
excellent. If there is a large class of users who just learn XML and
are content to automatically convert, they have no requirement that
a single schema language be mandated. I don't think the argument
that people will be confused by multiple schema languages holds water: 
some people will be confused by XML Schemas anyway and 
turn to simplifying tools (e.g. writing in DTDs) or different interfaces.

The best way is to try both schema languages and to get a feel for their
different capabilities.

Clearly XML Schemas has innumerable nice features for transferring data
between the backend database systems of big business.  Clearly RELAX NG
has nice features for multimedia languages and documents.  But are
IETF protocols more like big-business data transfers or like multimedia
languages?

It would make more sense for the RFC merely to say something like
this:

"Standard schema languages (e.g. ISO RELAX NG or W3C XML Schemas)
should be used in preference to proprietary or non-standard languages.
Schema languages should be used conservatively: exotic, difficult or
badly-described features may be badly implemented, used incorrectly,
or difficult to diagnose."

Cheers
Rick Jelliffe




 
