   Re: XML Schemas: Best Practices

  • From: "Roger L. Costello" <costello@mitre.org>
  • To: xml-dev@lists.xml.org
  • Date: Sun, 22 Oct 2000 15:37:19 -0500

Hi Folks,

I have been thinking for the last couple days about Toivo's message
regarding the Object Oriented (OO) principles:

- minimize dependencies between programs (i.e., minimize "coupling"
between programs)

- group together related pieces of information (i.e., maximize
"cohesiveness" of programs).  

I can certainly understand how these principles would help to create
programs which can be independently tested and reasoned about.

For our discussions, the concern is: "how do these principles apply to
XML and schema design?"

The first thing that occurred to me was, "to what do these OO principles
apply - to the instance document, or to the schema, or to both?"  That
is, should we write schemas to: 

   - minimize coupling of components within instance documents, or 
   - minimize coupling of the schema components themselves, or
   - minimize coupling in both?

(similarly for maximizing cohesion)

Let's consider an example:

   <Book>
      <Title>Illusions</Title>
      <Author>Richard Bach</Author>
      <Cost>12.85</Cost>
   </Book>

   <Person>
      <Name>Richard Bach</Name>
   </Person>

Here we see some data, structured using XML syntax.  If we think of Book
and Person as "objects", then these objects are very loosely coupled
(not coupled at all in fact), and the relevant data is nicely bundled
within each object (i.e., high cohesiveness).

Thus the instance document has minimal coupling and maximal cohesiveness
of its components. 

Let's turn to defining Book and Person in a schema. Here's how to
express them using the designs we created last week:

Book/Person Schema using the Russian Doll Design:

   <element name="Book">
      <complexType>
         <sequence>
            <element name="Title" type="string"/>
            <element name="Author" type="string"/>
            <element name="Cost">
               <simpleType>
                  <restriction base="decimal">
                     <scale value="2"/>
                  </restriction>
               </simpleType>
            </element>
         </sequence>
      </complexType>
   </element>

   <element name="Person">
      <complexType>
         <sequence>
            <element name="Name" type="string"/>
         </sequence>
      </complexType>
   </element>
   
Recall that the Russian Doll design mirrors the instance document
structure; thus in our schema we see two self-contained element
declarations, which mirror the two self-contained elements in the
instance document.

Book/Person Schema using the Venetian Blind Design:

   <simpleType name="money">
      <restriction base="decimal">
         <scale value="2"/>
      </restriction>
   </simpleType>

   <complexType name="Book">
      <sequence>
         <element name="Title" type="string"/>
         <element name="Author" type="string"/>
         <element name="Cost" type="cat:money"/>
      </sequence>
   </complexType>

   <complexType name="Person">
      <sequence>
         <element name="Name" type="string"/>
      </sequence>
   </complexType>

   <element name="Book" type="cat:Book"/>
   <element name="Person" type="cat:Person"/>

Recall that the Venetian Blind design spreads out the components into
type definitions and then reuses the types.

The above instance document validates against both of these schema
designs. Thus both designs enable us to create instance documents which
exhibit minimal coupling and maximal cohesiveness. 
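
To try this with a validator, the Venetian Blind fragments need a schema wrapper that binds the cat: prefix. The following is only a sketch under assumptions: the namespace URI is a made-up placeholder, and it uses the final 2001 Recommendation's namespace and facet name (fractionDigits), which replaced the draft's scale facet used in this message.

```xml
<?xml version="1.0"?>
<!-- Sketch only. The catalogue namespace URI is a hypothetical
     placeholder, chosen so that the cat: prefix resolves. -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cat="http://www.example.org/catalogue"
        targetNamespace="http://www.example.org/catalogue"
        elementFormDefault="qualified">

   <simpleType name="money">
      <restriction base="decimal">
         <!-- "scale" in the 2000 draft, "fractionDigits" in the Recommendation -->
         <fractionDigits value="2"/>
      </restriction>
   </simpleType>

   <complexType name="Book">
      <sequence>
         <element name="Title" type="string"/>
         <element name="Author" type="string"/>
         <element name="Cost" type="cat:money"/>
      </sequence>
   </complexType>

   <complexType name="Person">
      <sequence>
         <element name="Name" type="string"/>
      </sequence>
   </complexType>

   <!-- Global elements: either may be the root of an instance document -->
   <element name="Book" type="cat:Book"/>
   <element name="Person" type="cat:Person"/>
</schema>
```

With this wrapper an instance must sit in the same namespace, for example by opening with <Book xmlns="http://www.example.org/catalogue">. Book and Person can then each be validated as the root of its own document.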

Now let's consider how the two schemas measure up with respect to
coupling and cohesiveness:

- Both designs bundle together Title, Author, and Cost into a Book
  object.  Both bundle Name into a Person object.  The Venetian
  Blind has an additional object - money.  The money object 
  bundles together the features of (U.S.) money. Thus, both 
  schemas exhibit a high degree of cohesiveness.

- The Russian Doll design creates standalone Book and Person 
  objects. The Venetian Blind design creates a Book type 
  which depends upon the money type.  The Book element 
  depends upon the Book type.  The Person element depends 
  upon the Person type.  Clearly, the Venetian Blind design
  results in a more interdependent set of components.  That 
  is, it results in a more coupled design.
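
The coupling becomes visible when a shared type changes. Here is a sketch, assuming a hypothetical new rule that money may not be negative. The minInclusive facet is not part of the original example, the namespace URI is a placeholder, and fractionDigits is the 2001 Recommendation's name for the draft's scale facet.

```xml
<?xml version="1.0"?>
<!-- Sketch only: hypothetical namespace URI and a hypothetical extra rule -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cat="http://www.example.org/catalogue"
        targetNamespace="http://www.example.org/catalogue"
        elementFormDefault="qualified">

   <simpleType name="money">
      <restriction base="decimal">
         <fractionDigits value="2"/>
         <!-- The only edit: forbid negative amounts (assumed rule) -->
         <minInclusive value="0"/>
      </restriction>
   </simpleType>

   <!-- Not edited, yet its Cost child now rejects negative values,
        because Book depends on money -->
   <complexType name="Book">
      <sequence>
         <element name="Title" type="string"/>
         <element name="Author" type="string"/>
         <element name="Cost" type="cat:money"/>
      </sequence>
   </complexType>

   <element name="Book" type="cat:Book"/>
</schema>
```

In the Russian Doll version the same rule would be added inside the Cost declaration itself, and no other component would be affected.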

Consider the two designs in terms of reusable components:

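- Because of the Russian Doll's self-contained nature it has
  just two big elements - Book and Person.  Elements are
  by nature not very reusable: a global element can only be
  referenced whole, under its own name, and the anonymous
  types nested inside it cannot be referenced at all.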

- The Venetian Blind design creates three highly reusable 
  components - money, Book (type), and Person (type).  
  As well, it has the Book and Person elements.
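
As a sketch of what that reuse looks like, here are the money and Person types used again in a declaration that is not part of the original example. Magazine, Editor, and Price are hypothetical names, the namespace URI is a placeholder, and fractionDigits is the 2001 Recommendation's name for the draft's scale facet.

```xml
<?xml version="1.0"?>
<!-- Sketch only: Magazine, Editor and Price are invented for illustration -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cat="http://www.example.org/catalogue"
        targetNamespace="http://www.example.org/catalogue"
        elementFormDefault="qualified">

   <simpleType name="money">
      <restriction base="decimal">
         <fractionDigits value="2"/>
      </restriction>
   </simpleType>

   <complexType name="Person">
      <sequence>
         <element name="Name" type="string"/>
      </sequence>
   </complexType>

   <!-- New component built from the existing named types -->
   <complexType name="Magazine">
      <sequence>
         <element name="Title" type="string"/>
         <element name="Editor" type="cat:Person"/>
         <element name="Price" type="cat:money"/>
      </sequence>
   </complexType>

   <element name="Magazine" type="cat:Magazine"/>
</schema>
```

With the Russian Doll design the two-decimal restriction is an anonymous type inside Cost, so a Magazine declaration would have to repeat it.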

Let's create a table to summarize what we have seen in our example:

OO Ideal     Instance doc   Russian Doll   Venetian Blind
---------------------------------------------------------
Cohesion     High           High           High

Coupling     Low            Low            High

Reusable     n/a            Low            High
components

Toivo noted that with the Russian Doll design components are highly
cohesive with minimal coupling.  With the Venetian Blind design the
components are also highly cohesive but have high coupling.  For that
reason, Toivo prefers that the Russian Doll be the default ("when in
doubt use the Russian Doll design").

Based upon our discussions, taking into account Cohesion, Coupling, and
Reuse, which would you make the default design - the Russian Doll
design, the Salami Slice design, or the Venetian Blind design?

Any other thoughts on coupling/cohesion and how they apply to schema
design?
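
The Salami Slice design named above is not shown in this message. For comparison, here is a sketch of the same Book/Person vocabulary in that style, where every element is declared globally and assembled with ref. The namespace URI is a placeholder, and fractionDigits is the 2001 Recommendation's name for the draft's scale facet.

```xml
<?xml version="1.0"?>
<!-- Sketch only: hypothetical namespace URI -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cat="http://www.example.org/catalogue"
        targetNamespace="http://www.example.org/catalogue"
        elementFormDefault="qualified">

   <!-- Each "slice" is a global element declaration -->
   <element name="Title" type="string"/>
   <element name="Author" type="string"/>
   <element name="Name" type="string"/>
   <element name="Cost">
      <simpleType>
         <restriction base="decimal">
            <fractionDigits value="2"/>
         </restriction>
      </simpleType>
   </element>

   <!-- The slices are assembled by reference -->
   <element name="Book">
      <complexType>
         <sequence>
            <element ref="cat:Title"/>
            <element ref="cat:Author"/>
            <element ref="cat:Cost"/>
         </sequence>
      </complexType>
   </element>

   <element name="Person">
      <complexType>
         <sequence>
            <element ref="cat:Name"/>
         </sequence>
      </complexType>
   </element>
</schema>
```

Here Book depends on three global elements and Person on one, so the components are coupled through element references rather than through named types.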

/Roger





 
