Re: XML Schemas: Best Practices

From: "Roger L. Costello" <costello@mitre.org>
To: xml-dev@lists.xml.org
Date: Mon, 26 Feb 2001 15:24:21 -0500
Hi Folks,

Finally back from lots of travel...

Before I left we were starting discussions on techniques for creating
extensible schemas.  I would like to get back to that, but before doing
so I would like to discuss the following important issue:

Issue: When creating a schema should XMLSchema (i.e.,
http://www.w3.org/2000/10/XMLSchema) be the default namespace, or should
the targetNamespace be the default?

I have recently learned something about no-namespace components
(Chameleon components) that has convinced me it is Best Practice to
make the targetNamespace the default.  (Below I explain how this all
fits together.)

First, let's review the two approaches:

Approach 1: Default XMLSchema, Qualify targetNamespace

Here's an example schema that shows this approach:

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        targetNamespace="http://www.library.org"
        xmlns:lib="http://www.library.org" 
        elementFormDefault="qualified">
    <include schemaLocation="BookCatalogue.xsd"/>
    <element name="Library">
        <complexType>
             <sequence>
                 <element name="BookCatalogue">
                     <complexType>
                         <sequence>
                             <element ref="lib:Book"
                                minOccurs="0" maxOccurs="unbounded"/>
                         </sequence>
                     </complexType>
                 </element>
             </sequence>
        </complexType>
    </element>
</schema>

Note that XMLSchema is the default namespace.  Consequently, all the
elements used to construct a schema (element, complexType, sequence,
schema, etc.) are written without a namespace prefix.

There is a namespace prefix, lib, which is associated with the
targetNamespace.  Any references (using the "ref" attribute) use lib as
the namespace qualifier, to indicate that we are referencing an element
in the targetNamespace (in this example there is a ref to lib:Book).

Approach 2: Qualify XMLSchema, Default targetNamespace

Here's the mirror of the above schema:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
        targetNamespace="http://www.library.org"
        xmlns="http://www.library.org"
        elementFormDefault="qualified">
    <xsd:include schemaLocation="BookCatalogue.xsd"/>
    <xsd:element name="Library">
        <xsd:complexType>
             <xsd:sequence>
                 <xsd:element name="BookCatalogue">
                     <xsd:complexType>
                         <xsd:sequence>
                             <xsd:element ref="Book"
                                minOccurs="0" maxOccurs="unbounded"/>
                         </xsd:sequence>
                     </xsd:complexType>
                 </xsd:element>
             </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

With this second approach all the components used to create a schema are
namespace qualified (with xsd).  

There is a default namespace declaration that declares the
targetNamespace to be the default namespace.  Any references to elements
in the schema do not need to be namespace qualified (note that the ref
to Book is not namespace qualified).

Which approach is better?  Up until a few days ago I thought that the
first approach was better, as it seemed to make for cleaner looking
schemas.  However, I have since learned something which now leads me to
believe that approach 2 is Best Practice.

Before describing why I believe that approach 2 is better, we need to
first revisit Chameleon components. 

Components that are in a schema with no targetNamespace have a very
interesting property: when they are <include>d in a schema that has a
targetNamespace then those no-namespace components take on the namespace
of the <include>ing schema.  They "blend into the background" just like
a Chameleon.  Thus, the no-namespace components are called Chameleon
components.

In the above schemas there is an <include> element to bring in the
components in BookCatalogue.xsd.  Let's consider how the above
approaches behave when the schema being <include>d (BookCatalogue.xsd)
has no namespace.  Here is BookCatalogue.xsd:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
            elementFormDefault="qualified">
    <xsd:complexType name="CardCatalogueEntry">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string" />
        </xsd:sequence>
    </xsd:complexType>
    <xsd:element name="Book" type="CardCatalogueEntry"/>
</xsd:schema>

Note that the Book element references CardCatalogueEntry (by means of
the type attribute).  We say that "the Book element is coupled to the
CardCatalogueEntry type".  This coupling is very significant with
respect to the above approaches.  Let's take each approach in turn and
see how it behaves with this no-namespace schema.

Approach 1:  Default XMLSchema, Qualify targetNamespace

The include element in the Library schema:

    <include schemaLocation="BookCatalogue.xsd"/>

results in Book and CardCatalogueEntry being namespace-coerced into the
http://www.library.org namespace.  Consider what happens to Book's
reference to CardCatalogueEntry:

   <xsd:element name="Book" type="CardCatalogueEntry"/>

Since there is no namespace qualifier on CardCatalogueEntry it is
referencing the default namespace.  What is the default namespace? 
XMLSchema is the default namespace.  There is, of course, no
CardCatalogueEntry in the XMLSchema namespace and this is an error. 

Thus, with approach 1 an error will be generated when this schema is
validated.

Now let's see how approach 2 behaves with the no-namespace schema.

Approach 2: Qualify XMLSchema, Default targetNamespace

Again, the include element:

    <xsd:include schemaLocation="BookCatalogue.xsd"/>

results in Book and CardCatalogueEntry being namespace-coerced into the
http://www.library.org namespace (which in this approach is the default
namespace).  Consider what happens to Book's reference to
CardCatalogueEntry:

   <xsd:element name="Book" type="CardCatalogueEntry"/>

Since there is no namespace qualifier on CardCatalogueEntry it is
referencing the default namespace.  What is the default namespace?  The
targetNamespace is the default namespace. Thus, this reference is to
CardCatalogueEntry in the default namespace, which is exactly where it's
at!  
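
To make the coercion concrete, here is a sketch of what the <include>d
components amount to under approach 2.  This is only an illustration of
the effect: no such file exists, and a schema processor does not
literally rewrite BookCatalogue.xsd.  The targetNamespace and default
namespace declarations shown here are the ones supplied by the
<include>ing Library schema.

```xml
<?xml version="1.0"?>
<!-- Illustration only: BookCatalogue.xsd as it behaves after being
     <include>d into the approach 2 Library schema. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
            targetNamespace="http://www.library.org"
            xmlns="http://www.library.org"
            elementFormDefault="qualified">
    <!-- CardCatalogueEntry now lives in http://www.library.org -->
    <xsd:complexType name="CardCatalogueEntry">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
    <!-- The unprefixed type reference resolves to the default
         namespace, which is the targetNamespace. -->
    <xsd:element name="Book" type="CardCatalogueEntry"/>
</xsd:schema>
```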

With this approach a schema validator says everything is fine.
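
As a further illustration, an instance document that the approach 2
Library schema is intended to accept might look like the following.  The
book data is made up for the example.  Every element is in the
http://www.library.org namespace, since elementFormDefault is
"qualified" in both schemas.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance document; the values are placeholders. -->
<Library xmlns="http://www.library.org">
    <BookCatalogue>
        <Book>
            <Title>Example Title</Title>
            <Author>A. N. Author</Author>
            <Date>2001</Date>
            <ISBN>0-000-00000-0</ISBN>
            <Publisher>Example Press</Publisher>
        </Book>
    </BookCatalogue>
</Library>
```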

Summary

Approach 1 breaks when <include>ing coupled Chameleon components.
Approach 2 supports and enables coupled Chameleon components.

For these reasons, I believe that it is Best Practice to always write
your schemas using approach 2.

What are your thoughts on this?  /Roger