xml-dev - Re: XML Schemas: Best Practices

From: "Roger L. Costello" <costello@mitre.org>
To: xml-dev@lists.xml.org
Date: Thu, 02 Nov 2000 06:56:29 -0500

Hi Folks,

Thanks a lot, Curt and Mary, for your comments.  I have incorporated them
into the online document[1].  In addition, I have added a section to the
online document on the <redefine> element, describing how it is
applicable only to the Homogeneous and Chameleon Namespace designs. (For
those of you who are not familiar with the new <redefine> element, take
a look at:

http://www.xfront.com/ZeroOneOrManyNamespaces.html#redefine

It has a description of <redefine> and gives an example.  <redefine> is
very powerful and is definitely something that you want to include in
your arsenal as a schema designer.)

Below are the updated guidelines for the namespaces issue. The online
version is at:

http://www.xfront.com/ZeroOneOrManyNamespaces.html#tradeoffs

I need feedback on the guidelines:

(a) Do you agree with the guidelines?  I have made some bold statements
which you may not agree with.  Please let me know where you disagree.
(b) Do they make sense?  Are they understandable? 

After the guidelines I pose an intriguing question that I would like
your thoughts on.

Guidelines

.. we explored the "design space" for this issue. We looked at the three
design approaches "in action", both schemas and instance documents. So
which design is better? Under what circumstances?

When you are reusing schemas that someone else created (e.g., the XHTML,
SVG schemas) you should <import> the components in those schemas, i.e.,
use the Heterogeneous Namespace design. It is a bad idea to copy those
components into your namespace, for two reasons: (1) soon your local
copies would get out of sync with the other schema, and (2) you lose
interoperability with any existing applications that process the other
schema's components (e.g., an SVG engine would be able to process
svg:line, but not company:line).
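
To make the <import> approach concrete, here is a minimal sketch of a
schema that reuses an SVG component in place rather than copying it.
Several things here are my assumptions for illustration: the
http://www.company.org namespace, the Diagram element, the svg.xsd
location (assumed to be a schema that declares a global "line" element
in the SVG namespace), and the use of the 2001 XML Schema namespace URI
rather than the draft one current as of this posting.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            xmlns:svg="http://www.w3.org/2000/svg"
            elementFormDefault="qualified">

    <!-- Bring in the SVG components; they stay in the SVG namespace.
         svg.xsd is a hypothetical location. -->
    <xsd:import namespace="http://www.w3.org/2000/svg"
                schemaLocation="svg.xsd"/>

    <!-- Diagram is a hypothetical element that reuses svg:line by
         reference instead of defining a local company:line copy. -->
    <xsd:element name="Diagram">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="svg:line" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>

</xsd:schema>
```

Because the instance documents still contain svg:line, an SVG engine can
recognize the element, and there is no local copy to fall out of sync.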

The interesting case (the case we have been considering throughout this
discussion) is how to deal with namespaces in a collection of schemas
that you created. Here are our guidelines for this case:

The Chameleon Namespace design is the preferred approach, as it is the
most flexible of the three designs: 

- The components in the schemas with no targetNamespace (the
"no-namespace" components) are infinitely malleable - they are able to
take on the namespace of any schema that <include>s or <redefine>s them
(the Chameleon effect) 

- The no-namespace components can be reused by any schema. 

- The no-namespace components can be <redefine>d by any schema,
regardless of the schema's targetNamespace. Note that neither of the
other designs supports this capability. (As we saw above, the
Homogeneous Namespace design also enables use of the <redefine> element.
However, with that design approach the <redefine> capability is only
applicable where the components are in the same namespace. Thus, a
schema in, say, the auto namespace wouldn't be able to <redefine> a
component in the company namespace.) 

- The no-namespace components are not "fenced in" by a namespace. They
are free, independent, and with no boundaries. They owe their allegiance
to no namespace! 
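
The Chameleon effect described in the list above can be sketched with
two small schema files. The file names (Product.xsd, Company.xsd), the
ProductType component and the http://www.company.org namespace are
hypothetical names chosen for illustration, and I am assuming the 2001
XML Schema namespace URI.

```xml
<!-- File: Product.xsd (hypothetical). Note: no targetNamespace. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">

    <xsd:complexType name="ProductType">
        <xsd:sequence>
            <xsd:element name="Type" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>

</xsd:schema>

<!-- File: Company.xsd (hypothetical). It has a targetNamespace, so the
     included no-namespace components take on that namespace. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">

    <xsd:include schemaLocation="Product.xsd"/>

    <!-- ProductType is referenced through the default namespace,
         i.e. as a component of http://www.company.org -->
    <xsd:element name="Product" type="ProductType"/>

</xsd:schema>
```

A second schema with a different targetNamespace could <include> the
same Product.xsd, and ProductType would then take on that namespace
instead.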

To get the most benefit from the Chameleon Namespace design follow these
practices: 

- Minimize the functionality in the "has-namespace" schema, i.e., a
"skinny" main schema 

- Maximize the functionality in the "no-namespace" schemas, i.e., "fat"
supporting schemas 

... the Chameleon Namespace design approach has restrictions on how the
no-namespace components must be designed for them to be usable by other
schemas. Namely, they must not reference one another.  The components
must be decoupled (which is a desirable trait).

If you are not able to use the Chameleon Namespace design (perhaps
because the components reference one another) then we recommend that you
use the Homogeneous Namespace design. Within your own namespace, this
design affords your schemas the same reuse and <redefine> capabilities
that the Chameleon Namespace design provides. What it does not allow is
for schemas in other namespaces to <redefine> your components.
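
As a rough sketch of <redefine> under the Homogeneous Namespace design,
the two hypothetical files below share one targetNamespace, and the
second extends a type defined in the first. The file names, PersonType,
the SSN element and the namespace URI are assumptions for illustration,
as is the 2001 XML Schema namespace URI.

```xml
<!-- File: Person.xsd (hypothetical). Same targetNamespace as the
     schema that redefines it. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">

    <xsd:complexType name="PersonType">
        <xsd:sequence>
            <xsd:element name="Name" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>

</xsd:schema>

<!-- File: Company.xsd (hypothetical). Redefines PersonType by
     extending it with one more element. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">

    <xsd:redefine schemaLocation="Person.xsd">
        <!-- Inside redefine, the base refers to the original
             definition of the same name -->
        <xsd:complexType name="PersonType">
            <xsd:complexContent>
                <xsd:extension base="PersonType">
                    <xsd:sequence>
                        <xsd:element name="SSN" type="xsd:string"/>
                    </xsd:sequence>
                </xsd:extension>
            </xsd:complexContent>
        </xsd:complexType>
    </xsd:redefine>

    <!-- Uses the redefined PersonType (Name followed by SSN) -->
    <xsd:element name="Person" type="PersonType"/>

</xsd:schema>
```

If Person.xsd declared a different targetNamespace, Company.xsd could
not <redefine> it; if it declared none at all, this would be the
Chameleon case.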

The Heterogeneous Namespace design should be considered a last resort.
The benefit of the Heterogeneous Namespace design is that it enables you
to organize your schemas within different namespaces, e.g., "schemas A,
B, and C are related so we will put them in namespace 1, schemas D and E
are related so we will put them in namespace 2, etc." This is useful
when there are multiple elements with the same name. By placing them in
different namespaces the instance documents will be able to distinguish
them by namespace (although in many cases the context is sufficient to
distinguish different elements with the same name).

There are several disadvantages to the Heterogeneous Namespace design: 

- You will not be able to use the <redefine> capability across namespace
boundaries. That is, if schema A is in namespace 1 and schema D is in
namespace 2, then schema A will not be able to <redefine> components in
schema D. 

- It's more effort to work with multiple namespaces than with a single
namespace, not only in the schema but also in instance documents.
(Observe the instance document listed earlier, where we had to create
namespace declarations for all the namespaces and then qualify each
element. Contrast that with the uni-namespace version, where we simply
created a default namespace.) 
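
To illustrate the extra effort in instance documents, here is a small
contrast. These are not the instance documents from the online
document; the element names and namespace URIs are hypothetical, and I
am assuming schemas that use elementFormDefault="qualified".

```xml
<!-- Heterogeneous: one declaration per namespace, and every element
     carries a prefix -->
<c:Company xmlns:c="http://www.company.org"
           xmlns:per="http://www.person.org"
           xmlns:pro="http://www.product.org">
    <c:Person>
        <per:Name>John Doe</per:Name>
    </c:Person>
    <c:Product>
        <pro:Type>Widget</pro:Type>
    </c:Product>
</c:Company>

<!-- Single namespace: one default namespace declaration, no prefixes -->
<Company xmlns="http://www.company.org">
    <Person>
        <Name>John Doe</Name>
    </Person>
    <Product>
        <Type>Widget</Type>
    </Product>
</Company>
```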

Editor's Note: it is interesting to observe that the Chameleon Namespace
design and the Heterogeneous Namespace design are at opposite ends of
the spectrum, in terms of namespaces. The Chameleon design espouses "no
(namespace) fence", free, independent components. The Heterogeneous
design, on the other hand, espouses organizing and structuring
components via namespaces.

Okay, now for my "intriguing question":

How would you rate the following capabilities in terms of importance for
schema design:

[1] the grouping and name disambiguation provided by namespaces
[2] the ability to reuse components in other schemas
[3] the ability to redefine components in other schemas (using
<redefine>)

For instance, would you trade off the ability to redefine components in
other schemas (using <redefine>) for the grouping and name
disambiguation provided by namespaces?  That is, is [1] more important
than [3]?

How important is the grouping and name disambiguation provided by
namespaces?  Would you say that it is better to "free" your schemas of
namespace boundaries, allowing components to go anywhere and morph to
any namespace?

Thanks!  /Roger

[1] http://www.xfront.com/ZeroOneOrManyNamespaces.html