OASIS Mailing List Archives

 



 


 

   Re: [xml-dev] W3C Schema: Resistance is Futile, says Don Box



> > > In XML, typing specifies validation algorithms.
> >
> > Hmm... well it also allows software to unpack XML instances into native
> > storage in a sensible way without being savvy to the internals of the
> > applications that generated or will use the data.

> SOAP toolkits wouldn't be anywhere near as appealing to the average
> developer if it wasn't for this "feature".

I agree this is an important application.  (I'll refer to this as
"data-binding".)

> As much as I love the simplicity
> and elegance of RELAX NG, it's simply not suitable for generating
> language-specific bindings that can both generate and parse XML.

Could you expand on this?  I believe Relaxer [1] supports data-binding for
RELAX and has some experimental support for RELAX NG.

I can think of a couple of reasons why people might think RELAX NG
unsuitable for data-binding.

1. Type assignment.  The only semantics that RELAX NG defines for a schema
is whether or not it matches an instance.  Unlike XSD, it doesn't define a
mapping between elements and attributes in the instance and particular
element and attribute patterns in the schema.

There are a number of reasons that the RELAX NG spec doesn't address this.

(a) Although there are many applications that need type assignment, there
are also many applications that don't.

(b) There's no one single right way to do type assignment.  You can choose
to impose various different constraints on the schema to make ambiguities
impossible, or you can specify rules to be used to resolve ambiguities.  XSD
uses a rather ad hoc mixture.  It imposes constraints in the form of the
Element Declarations Consistent and Unique Particle Attribution constraints,
but for unions of simple types, it resolves ambiguities by preferring the
first match.

(c) Type assignment has a cost.  If you impose constraints, then those
constraints have to be specified (which increases the complexity of the spec)
and enforced (which increases the complexity of the implementation); users
have to learn them (which decreases ease of learning); users have to write
their schemas so as to satisfy the constraints (which decreases ease of use)
and certain things are no longer expressible.
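
As a rough illustration of the "prefer the first match" rule mentioned in (b),
here is a minimal Python sketch.  The member types are deliberately simplified
stand-ins for XSD simple types, and every function name here is hypothetical,
not part of any XSD or RELAX NG implementation.  The point is only that a
value can validly match several members of a union, and that the assigned
type then depends on the order in which the members are listed.

```python
import re

# Simplified lexical checks; these only approximate the XSD simple types.
def is_integer(text):
    return re.fullmatch(r"[+-]?[0-9]+", text) is not None

def is_boolean(text):
    return text in ("true", "false", "1", "0")

def is_string(text):
    return True  # every value is a valid string

# A union is an ordered list of (type name, check) pairs.
UNION = [("integer", is_integer), ("boolean", is_boolean), ("string", is_string)]

def matching_members(union, text):
    """All member types the value matches: validation needs only 'any'."""
    return [name for name, check in union if check(text)]

def assign_first_match(union, text):
    """Resolve the ambiguity by taking the first matching member, or None."""
    for name, check in union:
        if check(text):
            return name
    return None

if __name__ == "__main__":
    print(matching_members(UNION, "1"))
    print(assign_first_match(UNION, "1"))
    # Reordering the members changes the assigned type but not validity.
    print(assign_first_match(list(reversed(UNION)), "1"))
```

Whether the value is valid does not depend on the ordering; only the type
assignment does, which is why the two questions can be specified separately.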

These considerations lead me to the conclusion that it is better to deal
with type assignment in a modular way as a separate specification. I believe
this is possible and practical. Concretely, I envisage having a "RELAX NG
Type Assignment" spec that standardizes one or more annotation attributes
that can assert that the schema satisfies particular constraints which
facilitate type assignment.  A processor that implemented this spec would be
able to report whether or not the schema satisfied the asserted
constraint.

Dealing with type assignment in this way allows those who do not care about
type assignment not to have to pay for it. It also allows there to be multiple
ways of doing type assignment that make different tradeoffs between
expressiveness and runtime performance.
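
To make the annotation idea concrete, here is a minimal Python sketch of a
processor that reads an asserted constraint from a schema and reports whether
the schema satisfies it.  No such spec exists, so everything specific is an
assumption made for illustration: the namespace
http://example.org/ns/type-assignment, the attribute name "constraint", the
constraint name "distinct-choice-names", and the toy rule it stands for (no
<choice> may have two direct <element> children with the same name).

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
TA = "http://example.org/ns/type-assignment"  # hypothetical namespace

# A schema that asserts the (hypothetical) constraint but violates it:
# both branches of the choice are elements named "amount".
SCHEMA = """\
<element name="price" xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:ta="http://example.org/ns/type-assignment"
         ta:constraint="distinct-choice-names">
  <choice>
    <element name="amount"><text/></element>
    <element name="amount"><attribute name="currency"/><text/></element>
  </choice>
</element>
"""

def distinct_choice_names(root):
    """Toy rule: no <choice> has two direct <element> children sharing a name.

    Only the name attribute is examined; name classes are ignored.
    """
    for choice in root.iter(f"{{{RNG}}}choice"):
        names = [e.get("name") for e in choice.findall(f"{{{RNG}}}element")]
        if len(names) != len(set(names)):
            return False
    return True

# Registry of the constraints this sketch knows how to check.
CHECKS = {"distinct-choice-names": distinct_choice_names}

def check_asserted_constraint(schema_text):
    """Return (constraint name, satisfied?) or None if nothing is asserted."""
    root = ET.fromstring(schema_text)
    asserted = root.get(f"{{{TA}}}constraint")
    if asserted is None:
        return None
    return asserted, CHECKS[asserted](root)

if __name__ == "__main__":
    print(check_asserted_constraint(SCHEMA))
```

A schema without the annotation is simply not checked, so users who do not
need type assignment are unaffected by the rule.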

2. Inheritance.  XSD provides a complex type hierarchy whereas RELAX NG
doesn't. Now, using inheritance in XSD isn't compulsory.  I can write
schemas using model groups and attribute groups without any use of
restriction or extension (except for the trivial, implicit extension of the
ur-type).  If XSD data-binding implementations can handle such schemas which
don't make any use of inheritance, the absence of a complex type hierarchy
cannot be an insuperable barrier to data-binding implementation.  It seems
to me that the benefit of having the complex type hierarchy is that it
allows a data-binding implementation to produce better, more natural class
definitions.

It also seems to me that any production data-binding implementation is going
to want to provide a way for the user to control the way the schema gets
mapped into classes, for example, to control the package name, the class
name or the way non-trivial content models get handled.  Typically, I guess
annotations would be used for this.  So, here's my question: why can't
annotations be used in the same way to allow the user to control the
inheritance hierarchy in the generated classes?  It doesn't seem to me that
such annotations need to be very complicated; for example, you could have an
annotation on <ref> that said this <ref> was a reference to a base class.
As in the type-assignment case, if there's a need, such annotations can be
standardized in a separate spec.
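
The <ref> annotation suggested above could be consumed along the following
lines.  This is a minimal Python sketch under stated assumptions: the
namespace http://example.org/ns/data-binding and the attribute db:role="base"
are invented for illustration and are not defined by RELAX NG, Relaxer, or
any other tool, and the generator handles only <define> patterns whose direct
children are <ref> and <element>.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
DB = "http://example.org/ns/data-binding"  # hypothetical namespace

SCHEMA = """\
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:db="http://example.org/ns/data-binding">
  <start>
    <element name="employee"><ref name="Employee"/></element>
  </start>
  <define name="Person">
    <element name="name"><text/></element>
  </define>
  <define name="Employee">
    <ref name="Person" db:role="base"/>
    <element name="salary"><text/></element>
  </define>
</grammar>
"""

def class_plan(schema_text):
    """Map each <define> to (base class or None, list of field names)."""
    root = ET.fromstring(schema_text)
    plan = {}
    for define in root.findall(f"{{{RNG}}}define"):
        base, fields = None, []
        for child in define:
            is_ref = child.tag == f"{{{RNG}}}ref"
            if is_ref and child.get(f"{{{DB}}}role") == "base":
                base = child.get("name")  # annotated ref becomes the base class
            elif child.tag == f"{{{RNG}}}element":
                fields.append(child.get("name"))
        plan[define.get("name")] = (base, fields)
    return plan

def render(plan):
    """Render the plan as class skeletons; every field is typed as str."""
    lines = []
    for name, (base, fields) in plan.items():
        lines.append(f"class {name}({base}):" if base else f"class {name}:")
        lines.extend(f"    {field}: str" for field in fields)
        if not fields:
            lines.append("    pass")
    return "\n".join(lines)

if __name__ == "__main__":
    print(render(class_plan(SCHEMA)))
```

Validation is unaffected by the annotation: a validator that ignores the
foreign attribute sees an ordinary <ref>, while the binding tool uses it to
choose inheritance instead of inlining the referenced content.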

In conclusion, my view is that although XSD out of the box provides much
more support for data-binding than RELAX NG, nonetheless RELAX NG provides a
suitable basis on which to build support for data-binding.  The RELAX NG
approach gives a lot of flexibility and avoids imposing costs on those who
do not use XML just as a serialization format for C# and Java. However, I
have to admit that until such time as the kinds of annotation I mentioned
above get standardized, RELAX NG provides less interoperability than XSD for
data-binding.

James

[1] http://www.asahi-net.or.jp/~dp8t-asm/java/tools/Relaxer/







 


Copyright 2001 XML.org. This site is hosted by OASIS