
Re: [xml-dev] Who can implement W3C XML Schema ?

Hi Simon,

Simon St.Laurent wrote:

> On Wed, 2001-10-24 at 03:31, Rick Jelliffe wrote:
>>As for the complexity in the area of datatypes, I have not heard anyone say that
>>the facet-based approach is not the most elegant way to treat the problem.
>>What is the alternative:  Only simple types? No specification of 
>>type restrictions on an instance element?  Using little languages
>>rather than facets (Schematron's approach, b.t.w., and powerful but
> I've suggested that the entire descriptive type system and the PSVI it
> seems to expect on the other end is broken.  
> Regular Fragmentations[1] suggests a rather different processing model. 
> You still could have simple types and facets for the fragments, but
> they'd have a lot fewer parts and require less understanding than the
> Datatypes spec currently offers.

I've been thinking about all this, and regular expressions could 
definitely play a more important role in the definition of so-called 
simple types.

The general framework of W3C XML Schema Part 2 is, IMO, good. The main 
limitation, and probably the main source of complexity, is that instead 
of defining a few simple generic mechanisms, the rec defines 10+ opaque 
primitive datatypes, each of them using a different algorithm which is 
only described in prose.

If you look at the processing <disclaimer>as I understand it 
now</disclaimer>, you get a chain of several transformations between 
several "spaces". These transformations, which are currently "hard 
coded", could have been defined using a "datatype toolbox" specified in 
a language-independent fashion, as XPath has been.
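To make the chain concrete, here is a minimal sketch of what such a 
toolbox might look like. All the function names here (parser_to_lexical, 
lexical_to_value) are my own illustrative assumptions, not anything from 
the spec:

```python
# Hypothetical sketch of the transformation chain: parser space ->
# lexical space -> value space. Names are illustrative assumptions.

def parser_to_lexical(raw: str) -> str:
    """Parser space -> lexical space: collapse surrounding/internal
    whitespace, as the whitespace facet does today."""
    return " ".join(raw.split())

def lexical_to_value(lexical: str) -> int:
    """Lexical space -> value space: bind the lexical form to a
    typed value (here, an integer)."""
    return int(lexical)

# The whole chain, applied to what a parser would hand us:
value = lexical_to_value(parser_to_lexical("  42\n"))
```

Each step is an ordinary function, so the chain itself can be composed 
and inspected rather than being baked into each primitive type.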

The first transformation is between what I call the "parser space" (what 
an XML 1.0 parser sends to an application) and what W3C XML Schema calls 
the lexical space.

This transformation currently deals only with additional whitespace 
processing, and regular fragmentations would have been, IMO, really 
beneficial at this level.
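For instance, a regular expression could fragment a lexical token into 
named parts at this stage, in the spirit of Simon's Regular 
Fragmentations. The pattern and field names below are my own assumptions, 
purely for illustration:

```python
import re

# Sketch: fragmenting a date token into named parts with a regular
# expression. Pattern and field names are illustrative assumptions.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

def fragment(raw: str) -> dict:
    """Parser space -> fragmented lexical parts."""
    m = DATE.fullmatch(raw.strip())
    if m is None:
        raise ValueError("token does not match the date fragmentation")
    return {name: int(text) for name, text in m.groupdict().items()}
```

The fragments could then each carry their own (much simpler) simple type 
and facets.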

There is then a second transformation which converts the lexical space 
into value spaces. Here again, instead of defining a dozen "opaque" 
transformations, these could have been defined using a library of 
binding functions (such as toInt, toDate, toList, toArray, namespaceURI, 
localName, ...) which could have been combined.
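A sketch of such a binding library, mirroring the toInt/toDate/toList 
names above (the implementations are my own assumptions, not anything 
defined by the rec):

```python
import datetime

# Hypothetical "binding function" library. to_int/to_date/to_list
# mirror the toInt/toDate/toList names in the text; implementations
# are illustrative assumptions.

def to_int(lexical: str) -> int:
    return int(lexical)

def to_date(lexical: str) -> datetime.date:
    return datetime.date.fromisoformat(lexical)

def to_list(item_binding):
    """Combinator: lift an item binding to a binding for a
    whitespace-separated list, like XML Schema list types."""
    def bind(lexical: str):
        return [item_binding(token) for token in lexical.split()]
    return bind

# Combined: a list-of-integers binding built from two generic pieces.
int_list = to_list(to_int)
```

The point is that to_list is generic: the same combinator works for 
dates, QNames or anything else, where the rec describes each list type 
in prose.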

A common mechanism might be used for the two transformations if the 
regular expressions are defined through "regexp" functions, and these 
two steps might even be defined in a coherent way.

With such a framework, a single primitive type would be enough 
(anySimpleType), and all the predefined datatypes could be defined 
programmatically using the function library.
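In other words, the predefined datatypes would just be entries in a 
library built on top of the one primitive. A sketch, where every name 
and binding is an assumption of mine:

```python
# Sketch: a single primitive (anySimpleType, the identity binding)
# plus predefined datatypes composed from generic functions. All
# names and bindings here are illustrative assumptions.

def any_simple_type(lexical: str) -> str:
    return lexical  # the only primitive: identity on the lexical space

datatypes = {
    "anySimpleType": any_simple_type,
    "integer": lambda lex: int(lex),
    "boolean": lambda lex: {"true": True, "1": True,
                            "false": False, "0": False}[lex],
}
```

Adding a new datatype (an array type, a localized number format, ...) 
would then mean adding an entry, not amending the rec's prose.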

This would have made it possible to meet the needs of 
internationalization and of new types, such as arrays, which are not 
covered by W3C XML Schema.

This might be seen as more complex, but I think, on the contrary, that 
it's always simpler to describe a few generic concepts and use them to 
build complex structures than to describe those complex structures in 
plain text.

Once these types have been defined (either predefined or user-defined), 
restricting by facets is OK. And having this library of functions at 
hand would have made it possible to define the facets more generically, 
as assertions.
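A sketch of facets as assertions over the value space; the restrict() 
helper and the predicate style are my assumptions, not W3C XML Schema 
syntax:

```python
# Sketch: derivation by restriction as assertion predicates attached
# to a base binding. restrict() and the lambdas are illustrative
# assumptions, not spec syntax.

def restrict(base, *assertions):
    """Derive a type: bind with the base, then check each assertion."""
    def bind(lexical: str):
        value = base(lexical)
        for check in assertions:
            if not check(value):
                raise ValueError(f"assertion failed for {value!r}")
        return value
    return bind

# An integer restricted to 0..100, playing the role of the
# minInclusive/maxInclusive facets.
percent = restrict(int, lambda v: 0 <= v <= 100)
```

A facet such as minInclusive then stops being a special case of each 
primitive type and becomes one assertion among others.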

Sorry for the length of this email!


> I'll be talking more about this at XML 2001, and hope to have more to
> show by then.  My presentation from the Extreme Markup Language
> conference [2] covers some of it. 
> [1] - http://simonstl.com/projects/fragment
> [2] - http://simonstl.com/articles/regfrag/

See you in Paris for the Forum XML.
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
http://xsltunit.org      http://4xt.org           http://examplotron.org