Modeling ER schemas using Schematron

From: "Rick Jelliffe" <rjelliffe@allette.com.au>
To: xml-dev@lists.xml.org
Date: Thu, 30 Nov 2006 12:00:58 +1100 (EST)

[Roger asked why I think paths/Schematron is better than grammars/XSD.
Here is a more concrete example of how it can be more declarative and so
better for retargeting, more flexible for modeling, and how it doesn't
necessarily impose a different conceptual step the way that grammars can.]

ISO Schematron introduces a macro layer, "abstract patterns", that allows
higher-level specification of constraints in Schematron. (There is a
pre-processor available that works with Schematron 1.6 for this.)

This allows us to directly convert from, say, an ER diagram into
Schematron. You don't need to go through grammars; you can avoid the
ridiculous situation where you have one set of ER diagrams for your data
model, then you have to make another set of diagrams for the XML using
your XML schema IDE.

Plus, you can have a schema that doesn't care which serialization
strategy was used: it can support several at once. For example, in the
following mini-example, entities in a one-to-one relation can be nested
or they can be linked using an ID-like mechanism. XSD cannot express this
kind of alternative, so it forces people to pick one serialization
strategy, and that creates incompatibility because different people make
different choices.
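
To make the two strategies concrete, here is a rough sketch of an instance
document that uses both. Only the person and address names and the address
fields come from the mini-example; the people wrapper, the a1 value and the
sample data are invented for illustration.

```xml
<people>
  <!-- Strategy 1: the related address is nested as a subelement -->
  <person>
    <address>
      <street>1 Example Street</street>
      <town>Sydney</town>
    </address>
  </person>

  <!-- Strategy 2: the related address is linked by an ID-like attribute.
       The person carries an attribute named after the target entity, and
       the target carries an attribute named after the source entity. -->
  <person address="a1"/>
  <address person="a1">
    <street>2 Example Road</street>
    <town>Melbourne</town>
  </address>
</people>
```

Both person elements are intended to satisfy the same one-to-one relation
declaration, which is the point: the schema does not have to pick one form.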

Using this mechanism, you get a complete separation between the
declarative portions (which can carry as much additional declarative
information as you like) and the operational/implementation code.

<sch:pattern is-a="ENTITY">
   <sch:param name="name" value="address"/>
</sch:pattern>

<sch:pattern is-a="FIELD">
   <sch:param name="entity" value="address"/>
   <sch:param name="name" value="street"/>
   <sch:param name="type" value="xs:string"/>
   <sch:param name="required" value="true" />
</sch:pattern>

<sch:pattern is-a="FIELD">
   <sch:param name="entity" value="address"/>
   <sch:param name="name" value="town"/>
   <sch:param name="type" value="xs:string"/>
   <sch:param name="required" value="true" />
</sch:pattern>

<sch:pattern is-a="FIELD">
   <sch:param name="entity" value="address"/>
   <sch:param name="name" value="postcode"/>
   <sch:param name="type" value="xs:short"/>
   <sch:param name="required" value="false" />
</sch:pattern>

<sch:pattern is-a="ONE-TO-ONE-RELATION">
   <sch:param name="from" value="person"/>
   <sch:param name="to" value="address"/>
</sch:pattern>

How easy is that?  And, in particular, in what way is that more
complicated than XML Schemas?

This kind of declaration is very declarative, IYSWIM. Very easy to use for
other purposes.

--------------------

The implementation can be really complex, because it is not something
that ordinary users need to understand. They can just fill in the forms
for the various kinds of pattern, like above.

The implementation of the abstract patterns might be something like this
(there is probably some casting required for strings and names, but this
is enough to give the idea):

<sch:pattern id="ENTITY" abstract="true">
  <sch:rule context="/">
    <sch:assert test="true()">
      (We don't make any assertions about an entity.)
    </sch:assert>
  </sch:rule>
</sch:pattern>


<sch:pattern id="FIELD" abstract="true">
  <sch:rule context=" $entity ">
    <!-- $required and $name are quoted where they are used as strings,
         because parameter substitution is textual. -->
    <sch:assert test=" '$required' = 'false' or $name ">
    A <sch:name /> has a field <sch:value-of select=" '$name' "/>.
    (Fields are always serialized to XML as subelements.)
    </sch:assert>
  </sch:rule>
</sch:pattern>


<sch:pattern id="ONE-TO-ONE-RELATION" abstract="true">
  <sch:rule context=" $from ">
    <!-- $to and $from are quoted wherever they are compared with name(),
         because parameter substitution is textual. -->
    <sch:assert test=" $to or attribute::*[name() = '$to' ] ">
    There is a one-to-one relation from <sch:name /> to
    <sch:value-of select=" '$to' "/>. (This may be expressed in
    XML by using a subelement or by using an ID with the same
    name as the entity pointed to.)
    </sch:assert>

    <sch:assert test="count( $to | attribute::*[name() = '$to' ]) &lt;= 1 ">
    A one-to-one relation only allows a single child element or attribute.
    </sch:assert>

    <sch:assert test=" not(attribute::*[name() = '$to' ]) or
        //*[name() = '$to']/attribute::*[name() = '$from']
                     = current()/attribute::*[name() = '$to' ] ">
    If a one-to-one relation is serialized in XML using a link, then
    there should be an element somewhere in the document with the name
    of <sch:value-of select=" '$to' "/> which has an attribute called
    <sch:value-of select=" '$from' "/> which has the same value (e.g. an ID)
    as the value of the <sch:value-of select=" '$to' "/> attribute on the
    <sch:value-of select=" '$from' "/> element.
    </sch:assert>

  </sch:rule>

</sch:pattern>

As I mentioned before, providing the definitions is a guru/vendor task.
Using the abstract patterns is trivial form-filling.
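
For comparison, this is roughly what one of the form-filling declarations
stands for when it is written out by hand as an ordinary concrete pattern.
It is only a sketch of the intent of the street field declaration, not the
literal output of any pre-processor, and the address-street id is invented
for the example.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- Hand-written equivalent of the FIELD declaration for address/street -->
  <sch:pattern id="address-street">
    <sch:rule context="address">
      <!-- required is "true", so the subelement must be present -->
      <sch:assert test="street">
        A <sch:name/> has a field street.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

An optional field such as postcode would have nothing to assert about
presence, which is why the declaration form is the more useful place to
record it.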

----------------------------------

Also, note that because we have been so declarative, we could actually
convert the top definitions of our address schema to XSD even, at a pinch,
by simple transformation.
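
As a rough sketch of what such a transformation might emit for the address
entity: this is one possible mapping, not a defined one. It assumes that
fields become child elements in any order (hence xs:all) and that
required="false" maps to minOccurs="0".

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- One element declaration per ENTITY -->
  <xs:element name="address">
    <xs:complexType>
      <!-- One child element per FIELD; xs:all leaves the order free -->
      <xs:all>
        <xs:element name="street" type="xs:string"/>
        <xs:element name="town" type="xs:string"/>
        <!-- required="false" becomes an optional element -->
        <xs:element name="postcode" type="xs:short" minOccurs="0"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Note that this content model is closed, whereas the Schematron patterns
only check what they mention, and the one-to-one relation would have to be
committed to either the nested or the linked form.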

Cheers
Rick Jelliffe

Follow-Ups:
- Modeling ER schemas using Schematron [corrected]
  - From: "Rick Jelliffe" <rjelliffe@allette.com.au>

References:
- Re: [xml-dev] Victory has been declared in the schema wars ...
  - From: Rick Jelliffe <rjelliffe@allette.com.au>
- RE: [xml-dev] Victory has been declared in the schema wars ...
  - From: "Costello, Roger L." <costello@mitre.org>
- RE: [xml-dev] Victory has been declared in the schema wars ...
  - From: "Rick Jelliffe" <rjelliffe@allette.com.au>
- Re: [xml-dev] Victory has been declared in the schema wars ...
  - From: Philippe Poulard <Philippe.Poulard@sophia.inria.fr>