OASIS Mailing List Archives

 




Re: Namespaces, W3C XML Schema (was Re: ANN: SAX Filters for Namespace Processing)



From: "Simon St.Laurent" <simonstl@simonstl.com>

> On 21 Aug 2001 12:48:12 -0500, Bullard, Claude L (Len) wrote:
> > Has anyone written a concise description of the limits 
> > of the validation power of XML Schemas, say one that 
> > considers the Schematron assertions?  

> Murata Makoto, Dongwon Lee, and Murali Mani.  "Taxonomy of XML Schema
> Languages using Formal Language Theory."  in Conference Proceedings,
> Extreme Markup Languages 2001, p. 153-166. (Abstract:
> http://www.extrememarkup.com/extreme/2001/wednesday.htm#3)
> 
> It examines DTD, W3C XML Schema, DSD, XDuce, RELAX Core, and TREX, not
> Schematron per se, but I would expect Schematron to have expressive
> powers in the RELAX/TREX zone, perhaps even more.  Schematron is
> intriguingly off the charts for a lot of this.

"Expressive power" is not always an important criterion.

For example, even if Schematron and RELAX NG were of equal expressive
power (i.e. if every document accepted as valid or rejected as invalid by a
Schematron schema could be treated the same way by a RELAX NG schema, and
vice versa) but a small schema in one explodes into a large schema in
the other, who cares that they are equally expressive?

The important thing is that we have different major ways to cut the cake (e.g.,
grammars, path rules, ordering rules, lexical typing, datatyping, microparsing,
exemplars, etc.).

These different ways may have properties which
are more relevant to adopters than the expressive power.  For example,
Schematron lets you express that different constraints have different statuses
(they may be important during different phases, or they may belong to different
patterns).  RELAX may let us perform set operations on schemas. 
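
To make the point about phases and patterns concrete, here is a rough Python
sketch. It is not Schematron itself: the pattern names, phase names, element
names and the `validate` helper are all made up for illustration. The only
idea taken from Schematron is that assertions are grouped into patterns and
that a phase activates just some of those patterns.

```python
import xml.etree.ElementTree as ET

# Hypothetical patterns: each is a named list of (path, test, message) assertions.
PATTERNS = {
    "structure": [
        (".//section", lambda e: e.find("title") is not None,
         "section needs a title"),
    ],
    "publication": [
        (".//section", lambda e: e.get("reviewed") == "yes",
         "section not yet reviewed"),
    ],
}

# Hypothetical phases: each one activates a subset of the patterns.
PHASES = {
    "drafting": ["structure"],
    "release": ["structure", "publication"],
}

def validate(xml_text, phase):
    """Run only the patterns active in the given phase; return (pattern, message) pairs."""
    root = ET.fromstring(xml_text)
    messages = []
    for pattern_name in PHASES[phase]:
        for path, test, message in PATTERNS[pattern_name]:
            for elem in root.iterfind(path):
                if not test(elem):
                    messages.append((pattern_name, message))
    return messages

DOC = "<doc><section><title>Intro</title></section><section reviewed='yes'/></doc>"

print(validate(DOC, "drafting"))
print(validate(DOC, "release"))
```

The same document gets a shorter report while drafting than at release, because
the "publication" constraints are simply not switched on in the earlier phase.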

Or the different ways may have performance implications: Schematron would be
difficult to implement on a stream, so it might be inappropriate for high-volume
servers (unless the server already builds each document as a tree along the
way, e.g. a webserver using XSLT!), but it does not suffer combinatorial
explosions (though there are XPath constructs that could be avoided for optimal
performance).
  
Another criterion might be whether the schema language's validation model
allows users to detect systematic errors (e.g. ones which might be fixed
using a global operation) rather than just one error at a time. E.g. rather
than saying "this person needs a name", can it say "100 persons need names"?
Does the validation model force you to work through the document in
document order, or is a division of labour possible, where you only
work on certain elements regardless of whether previous elements in
the document are invalid? 
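
As a rough illustration of that last criterion, here is a small Python sketch
of a path-rule style check. It is an assumption-laden toy, not how any
particular schema language is implemented: the `check_rules` helper, the rule
table and the sample document are invented for the example. Each rule is
applied to every matching element wherever it occurs, and failures are counted
per message rather than reported one at a time.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def check_rules(xml_text, rules):
    """Apply each (path, test, message) rule to all matching elements; count failures."""
    root = ET.fromstring(xml_text)
    failures = Counter()
    for path, test, message in rules:
        for elem in root.iterfind(path):
            if not test(elem):
                failures[message] += 1
    return failures

# Hypothetical rule: every person element must have a name child.
RULES = [
    (".//person", lambda e: e.find("name") is not None, "person needs a name"),
]

# Hypothetical document; the unexpected <bogus/> element does not stop
# the check from reaching the person elements that follow it.
SAMPLE = """<people>
  <person><name>Ann</name></person>
  <person/>
  <bogus/>
  <person><age>40</age></person>
</people>"""

report = check_rules(SAMPLE, RULES)
for message, count in report.items():
    print(f"{count} x {message}")
```

Because the rule selects elements by path rather than walking a grammar in
document order, someone could work on the person elements alone and ignore
whatever else is wrong earlier in the document.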

Cheers
Rick Jelliffe