   Re: [xml-dev] Can XML Schema be compiled to Schematron?


 From: "Doug Ransom" <Doug.Ransom@pwrm.com>
 
> I have been thinking that it would be an interesting exercise to implement a
> compiler that transforms XML Schema into Schematron (probably in XSLT).  
 
You might start by processing your instance document with Francis Norton's
TypeTagger XSLT scripts, which add the type names (of elements) to
an instance. That will simplify life a lot, because then the Schematron
schema has access to type information from its XPath 1.0 expressions.

For translating an XML Schema into Schematron, you need to implement
three parts:
  1) Keyref and uniqueness. I think this would be very simple to implement.

  2) Datatypes.  Because datatypes in XML Schema derive from each
  other only by restriction, you can simply add an extra set of assertions for every
  restriction. For example, if you have a string type X restricted to 15 characters,
  you can make an assertion to test that. If there is also a derived type X'
  restricted to 10 characters, you can add an extra assertion for that too.
  The difficulty here is that converting Perl-style regular expressions to
  the simple string operations available in XSLT (XPath 1.0) expressions
  might be a little challenging.

  3) Structures.  The lion's share of content models can be expressed
  in any of the schema languages; the edge cases are important,
  but may be difficult.

    One interesting approach is to generate every possible path in the
    document. Instead of slicing the structure up vertically, you are slicing
   it horizontally.
   
   So for HTML you might make rules for all these paths
     /html
     /html/head
     /html/head/title[count(../title)=1]
     /html/head/meta
     /html/body[count(../body)=1]
     /html/body/p
     table/tr
     table/tr/td
     table/tr/th
     ul/li
     ol/li
     dl/dt
     dl/dd[preceding-sibling::dt]
     ...
   and then a rule that reports any element that does not fit one of the rules,
   or other patterns such as that a/../a is an error.

   The interesting part, good for a bit of thought or a research project, would be
    to figure out how to automatically derive the correct tests (in the brackets)
    which give various kinds of position information. Also, the optimal length
    of each path is interesting:  presumably there would be efficiency
    considerations for different policies. 

   It is perhaps a distinctive enough approach that it could have a schema
   language all of its own ("chain" would be a good name), but I don't have 
   the energy. 
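
A rough sketch of the path approach in point 3, in Python, may make it
more concrete. It turns a list of allowed paths into one Schematron
pattern and ends with a catch-all rule. It relies on the Schematron
behaviour that, within a pattern, a node is handled only by the first
rule whose context matches it, so any element reaching the final "*"
rule fits none of the listed paths. The function name build_pattern,
the message texts, and the choice of the ISO Schematron namespace are
assumptions made for illustration; the trivially true assert is only
there because a rule needs at least one assert or report.

```python
import xml.etree.ElementTree as ET

# Assumption: ISO Schematron namespace; older Schematron versions use another URI.
SCH_NS = "http://purl.oclc.org/dsdl/schematron"

# Allowed paths, copied from the list in the message (the "..." is not filled in).
PATHS = [
    "/html",
    "/html/head",
    "/html/head/title[count(../title)=1]",
    "/html/head/meta",
    "/html/body[count(../body)=1]",
    "/html/body/p",
    "table/tr",
    "table/tr/td",
    "table/tr/th",
    "ul/li",
    "ol/li",
    "dl/dt",
    "dl/dd[preceding-sibling::dt]",
]


def build_pattern(paths):
    """Build one Schematron pattern: a rule per allowed path, then a catch-all."""
    ET.register_namespace("sch", SCH_NS)
    pattern = ET.Element(f"{{{SCH_NS}}}pattern")
    for path in paths:
        # Each allowed path becomes a rule context; the assert always passes.
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", context=path)
        ok = ET.SubElement(rule, f"{{{SCH_NS}}}assert", test="true()")
        ok.text = "Element is on an allowed path."
    # Elements not claimed by an earlier rule fall through to this one.
    catch_all = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", context="*")
    report = ET.SubElement(catch_all, f"{{{SCH_NS}}}report", test="true()")
    report.text = "Element does not fit any of the allowed paths."
    return pattern


if __name__ == "__main__":
    print(ET.tostring(build_pattern(PATHS), encoding="unicode"))
```

Note that a second title under head fails the predicate on its path, so
it falls through to the catch-all rule and is reported there. Deriving
those bracketed tests automatically from a content model is the part
this sketch does not attempt.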

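For the datatypes in point 2, the idea of one extra assertion per
restriction step can be sketched the same way. The following Python
walks a chain of restrictions and emits a length assertion for every
type in the chain. The TYPES table, the name XPrime (standing in for
X'), and the function names are hypothetical, and only a maxLength
facet is handled; pattern facets (regular expressions) are exactly the
hard part and are left out.

```python
from xml.sax.saxutils import escape

# Hypothetical type table: each user-defined type names its base and a maxLength.
TYPES = {
    "X": {"base": "xs:string", "maxLength": 15},
    "XPrime": {"base": "X", "maxLength": 10},
}


def restriction_chain(name, types):
    """Return user-defined type names from the most general ancestor down to name."""
    chain = []
    while name in types:
        chain.append(name)
        name = types[name]["base"]
    return list(reversed(chain))


def assertions_for(name, types):
    """Emit one Schematron assert per restriction step in the chain."""
    asserts = []
    for type_name in restriction_chain(name, types):
        limit = types[type_name]["maxLength"]
        # escape() turns "<" into "&lt;" so the XPath is safe inside an attribute.
        test = escape(f"string-length(.) <= {limit}")
        asserts.append(
            f'<assert test="{test}">'
            f"Value exceeds the {limit}-character limit of type {type_name}."
            f"</assert>"
        )
    return asserts


if __name__ == "__main__":
    for line in assertions_for("XPrime", TYPES):
        print(line)
```

A value of type XPrime is then checked against both the 15-character
and the 10-character assertion, which is redundant but mirrors the
derivation directly. The asserts would sit in a rule whose context
selects elements by the type name that TypeTagger adds to the instance.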
Cheers
Rick Jelliffe





 


Copyright 2001 XML.org. This site is hosted by OASIS