Re: [xml-dev] Will the next version of XML Schema have a schema-for-schemas that is standalone (no English prose needed to describe constraints in the language)?

On Thu, 2007-10-18 at 11:29 +0100, Anthony B. Coates (XML-Dev) wrote:
> Rick, is that historically correct?  Was anyone in the XML Schema Working  
> Group ever arguing *for* content models that depend on attribute values or  
> similar?  I wasn't there, but W3C XML Schema (to me) has always looked  
> like a schema language which takes a lot of inspiration from the way data is  
> structured in object-oriented languages (and to a lesser extent,  
> relational databases).

Yes. Me. See
http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999AprJun/0061.html

Eight and a half years ago, and I still don't know if it is on their radar.

One of the troubles then was that no implemented schema language used
grammars that did this, and the WG was implementation-friendly rather
than theory-friendly: its method was to consolidate the ideas from the
various trial balloons that members had implemented, if you will pardon
my euphemistic usage of "consolidate".

Schematron didn't count, and many of the experts on the WG did have a
lot of treasured database and OO experience to contribute. This is why
XSD is used for everything except actual validation nowadays. It has
lots of extra machinery for bundling up constraints into higher-level
components and naming the kinds of things that parameter entities can
do, but basically no attention was paid to being more powerful than DTDs
at the grammar level: this is why people feel it has disappointing bang
per buck. (I am not referring to the issue of what class of TCS grammar
is used, btw.)

My little 1999 paper "Validate This: Content Models on Different
Targets" has the basic idea: you should be able to include any type of
node in the DOM in the content model. RELAX NG ended up adopting a
simplified version where they allowed elements, attributes or particular
token values to be particles. I doubt if anyone on the XSD WG bothered
to read or understand the paper (1999 was not a good year for my prose)
and they certainly weren't listening to Murata-san by that stage. (An
interesting idea in it is that you can use the same system to validate
IDREF, since the presence of an IDREF requires an ID elsewhere.)
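The IDREF constraint mentioned in the parenthesis can be stated very
compactly. What follows is a minimal sketch of the constraint itself, not
of the content-model mechanism from the paper. It assumes, purely for
illustration, that IDs live in an attribute named "id" and references in
an attribute named "ref"; both names and the function name are
hypothetical, since without a DTD the parser does not know attribute types.

```python
import xml.etree.ElementTree as ET

def dangling_idrefs(root, id_attr="id", ref_attr="ref"):
    """Return every reference value that has no matching ID in the document.

    The attribute names are assumptions made for this sketch.
    """
    ids = {el.get(id_attr) for el in root.iter() if el.get(id_attr) is not None}
    return [el.get(ref_attr) for el in root.iter()
            if el.get(ref_attr) is not None and el.get(ref_attr) not in ids]

doc = ET.fromstring('<doc><sec id="s1"/><link ref="s1"/><link ref="s9"/></doc>')
print(dangling_idrefs(doc))  # the reference to "s9" has no target
```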

I extended the idea in the discussion proposal "Axis Models & Path
Models: Extending DTDs with XPaths" at
http://xml.ascc.net/en/utf-8/validaxis.html. There I suggested that the
particles in a grammar could be any XPath, not just an element particle
or an attribute particle: "path models".
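To make the path-model idea concrete, here is a minimal sketch under
stated assumptions, not the notation or algorithm from the paper. Each
particle is a path expression (limited to the XPath subset that Python's
ElementTree supports), each child is labelled by the first particle that
selects it, and the resulting label string is checked against an ordinary
regular expression standing in for the content model. The particle table,
the element and attribute names, and the function names are all
hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical particle table: one-letter label -> path selecting children.
PARTICLES = {
    "h": "item[@type='header']",
    "l": "item[@type='line']",
}

def label_children(parent, particles):
    """Label each child with the first particle whose path selects it."""
    matches = {label: parent.findall(path) for label, path in particles.items()}
    labels = []
    for child in parent:
        label = next((lab for lab, hits in matches.items()
                      if any(child is hit for hit in hits)), "?")
        labels.append(label)  # "?" marks a child no particle selects
    return "".join(labels)

def satisfies(parent, particles, model):
    """Check the children's label string against a regex content model."""
    return re.fullmatch(model, label_children(parent, particles)) is not None

good = ET.fromstring(
    "<order><item type='header'/><item type='line'/><item type='line'/></order>")
bad = ET.fromstring("<order><item type='line'/><item type='header'/></order>")
print(satisfies(good, PARTICLES, "hl+"), satisfies(bad, PARTICLES, "hl+"))
```

Here the content model "one header item, then one or more line items"
depends on an attribute value, which a DTD content model cannot express.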

That paper also has "axis models", which use grammars, but along any
axis, not just the following-sibling axis.  So a simplified schema on the
child axis for HTML might be:
	(html, 
		(head, 
			(meta|object|script)) | 
		(body, 
			((p, 
				(b|i))|
			(ul, 
				li, 
					(b|i))| 
			(ol, 
				li, 
					(b|i)))))
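One possible reading of the production above, offered as an assumption
rather than as the semantics defined in the paper, is that it only says
which element may appear as a child of which. Under that reading, and
ignoring order and occurrence counts, a check can be sketched as a walk
down nested tables. The table layout and the function name are
hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed encoding of the child-axis production: tag -> allowed child tags.
INLINE = {"b": {}, "i": {}}
HTML_MODEL = {
    "html": {
        "head": {"meta": {}, "object": {}, "script": {}},
        "body": {
            "p": INLINE,
            "ul": {"li": INLINE},
            "ol": {"li": INLINE},
        },
    }
}

def violations(element, model, path=""):
    """Return the paths of elements that the model does not allow here."""
    here = f"{path}/{element.tag}"
    if element.tag not in model:
        return [here]
    found = []
    for child in element:
        found.extend(violations(child, model[element.tag], here))
    return found

doc = ET.fromstring(
    "<html><head><meta/></head><body><p><b/></p><ul><p/></ul></body></html>")
print(violations(doc, HTML_MODEL))  # the p directly inside ul is reported
```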

For a database dump kind of schema, where there are no positional
dependencies, the axis model can give the entire schema in a single
production, along the child axis. 

(For anyone interested, I also developed a third approach, using the
document-order axis, using a simplified grammar, and allowing partial
ordering and missing sections, as the "Hook" schema language, a toy
language that was the smallest schema language until Eric's Examplotron.
http://xml.ascc.net/hook/)

I also argued that we needed to support arbitrary syntaxes for simple
datatypes: how to map notations to standard types was the key. This is
because the job of markup is to annotate data in the form that the user
is familiar with, not always just for machine-to-machine transfer in
inhuman formats. This is the approach that ISO DTLL has taken. (I
remember making the point at XML 200? at the town-hall meeting on
Schemas where Schematron was able to specify many more of the document
constraints than others but was not declared the winner despite the
objective evidence, hrmmmph.)

My earliest suggestion was "Notation Schemas"
http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999AprJun/0030.html 
now archived at http://xml.coverpages.org/jelliffe-notation19990513.html


> That is to say, I didn't think it was a case of "they did it this way  
> because they were forced to", I always thought it was more "they did it  
> this way because they chose to", which is a different kettle of fish.

No, the simpleContent and complexContent elements were quite late
additions made as hacks, IIRC.

Cheers
Rick Jelliffe


