xml-dev - RE: XML Schemas: Best Practices

From: "Arnold, Curt" <Curt.Arnold@hyprotech.com>
To: "'Roger L. Costello'" <costello@mitre.org>
Date: Thu, 14 Dec 2000 13:21:54 -0700

I'd start a new topic so that the element substitution discussion
can be distinguished from the previous discussions.

I've been tempted to start a "Worst Practices" thread that discusses either legitimate requirements that can be accomplished by kludges (such as using <xsd:unique> to express co-occurrence constraints,
which should be preempted by modifications to XML Schema) or overly aggressive use of new features in XML Schema.

Element substitution is primarily (at least in my opinion) one mechanism for expressing generalization (aka inheritance).  It really shouldn't be considered apart from a general discussion of patterns
for generalization.

If a "Metro" is a type of "Subway" but has some additional features or constraints, then there are a couple of ways that you could express this in an XML structure.

1) Projection

The specialized element has all the attributes and content of the more generalized element plus, optionally, some additional information.

<!ELEMENT subway (#PCDATA)>
<!ELEMENT metro (#PCDATA)>
<!ATTLIST metro xml:lang CDATA #FIXED "FR">

The advantage of this pattern is that there is only one element for the "object".  One disadvantage is that every processing application must know that metro is a subtype of subway if it wants to
extract all subway names.  XPath queries also get complicated, since they now have to look like "//*[local-name() = 'subway' or local-name() = 'metro' or local-name() = 'tube']/text()" to
get the names of all the "subways" in a document.
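
To make the cost concrete, here is a minimal Python sketch of a consumer of the projection pattern. The instance document, the SUBWAY_LIKE set and the subway_names helper are all made up for illustration; the point is only that the list of subtypes has to be hard-coded in every consumer.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document using the projection pattern:
# every subtype is its own element and carries the name as text.
DOC = """<lines>
  <subway>Red Line</subway>
  <metro xml:lang="FR">Ligne 1</metro>
  <tube>Central</tube>
  <bus>42</bus>
</lines>"""

# The consumer has to know every subtype of subway in advance.
SUBWAY_LIKE = {"subway", "metro", "tube"}


def subway_names(xml_text):
    """Return the text of every element whose name is a known subway subtype."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter() if el.tag in SUBWAY_LIKE]


print(subway_names(DOC))
```

Adding a new subtype to the vocabulary means touching SUBWAY_LIKE (or the equivalent XPath) in every application that reads these documents.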

In a future world of schema-aware technologies, you may be able to query based on the type of an element.  However, even that would require every participant to have access to the schema
information, which might be undesirable.

Equivalence groups (or named choice groups) can be used to implement this pattern in the schema.

2) Aggregation

The general element contains child elements that carry the information specific to each subtype.

<!ELEMENT subway ((metro|tube)?,name)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT metro EMPTY>
<!ATTLIST metro xml:lang CDATA #FIXED "FR">

The advantage of this pattern is that generic "subway" information can be extracted in a uniform manner without concern over which "specialization" of "subway" appears.  Only if you want to get
information specific to a subtype do you need to know anything about the subtypes.  The disadvantage is that the DTD or schema that defines the containing element must be changed when a new subtype is
added.
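
As a rough Python sketch of the same idea (the instance document and both helper names are hypothetical), generic extraction needs one fixed path, and subtype knowledge shows up only in the subtype-specific query:

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document using the aggregation pattern:
# the subtype marker, if any, is a child of the general element.
DOC = """<lines>
  <subway><metro xml:lang="FR"/><name>Ligne 1</name></subway>
  <subway><tube/><name>Central</name></subway>
  <subway><name>Red Line</name></subway>
</lines>"""


def subway_names(xml_text):
    """Generic query: one path covers every specialization."""
    root = ET.fromstring(xml_text)
    return [name.text for name in root.findall(".//subway/name")]


def metro_names(xml_text):
    """Subtype-specific query: only here does 'metro' need to be known."""
    root = ET.fromstring(xml_text)
    return [
        subway.findtext("name")
        for subway in root.iter("subway")
        if subway.find("metro") is not None
    ]


print(subway_names(DOC))
print(metro_names(DOC))
```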

3) Decoration

A specific subtype wrapper contains the general element.

<!ELEMENT metro (subway)>
<!ATTLIST metro xml:lang CDATA #FIXED "FR">
<!ELEMENT subway (#PCDATA)>

This pattern is useful in the case where the schema that defines "subway" is unalterable.  Unfortunately, it means that applications that process the document must be aware of any subtypes (or at
least assume that any element that contains a subway is a type of subway).
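
A small Python sketch of the "assume any containing element is a subtype" fallback follows. The instance document and the subways_with_wrapper helper are hypothetical, and treating every non-root parent as a subtype wrapper is exactly the loose assumption described above, not something the DTD guarantees.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document using the decoration pattern:
# the subtype element wraps an unmodified subway element.
DOC = """<lines>
  <subway>Red Line</subway>
  <metro xml:lang="FR"><subway>Ligne 1</subway></metro>
</lines>"""


def subways_with_wrapper(xml_text):
    """Return (name, wrapper element name or None) for every subway."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = []
    for subway in root.iter("subway"):
        parent = parent_of[subway]
        # Assumption: any parent other than the document root is a subtype wrapper.
        wrapper = parent.tag if parent is not root else None
        result.append((subway.text, wrapper))
    return result


print(subways_with_wrapper(DOC))
```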

I think that the use of alternative-language tags is bad practice, because every implementation then has to be aware of all the localized names:

*[local-name() = 'subway' or local-name() = 'Fußgängerunterführung' or
local-name() = 'sottopassaggio' or local-name() = 'subterráneo' or ...]

Localization is appropriate on display; however, the infrastructure should work on a consistent set of locale-insensitive elements.

Again, a schema-aware XPath might be able to determine that all these are equivalent from the schema information, but that is a lot of unnecessary processing for a dubious benefit.  Anybody who was
trying to debug the transform, for example, would have to be fluent in all the languages.
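
For contrast, here is a minimal Python sketch of the locale-insensitive alternative: the document uses one canonical element name, and localization happens only when rendering. The DISPLAY_LABELS table, the locale codes assigned to each label and the render helper are all assumptions for illustration; the labels simply reuse the localized names from the XPath above.

```python
import xml.etree.ElementTree as ET

# Hypothetical display-label table, keyed by canonical element name.
# The locale codes attached to each label are an assumption.
DISPLAY_LABELS = {
    "subway": {
        "en": "subway",
        "de": "Fußgängerunterführung",
        "it": "sottopassaggio",
        "es": "subterráneo",
    },
}

# The document itself only ever uses the canonical element name.
DOC = "<lines><subway>Red Line</subway></lines>"


def render(xml_text, locale):
    """Render each subway with a localized label, falling back to the canonical name."""
    root = ET.fromstring(xml_text)
    label = DISPLAY_LABELS["subway"].get(locale, "subway")
    return [f"{label}: {el.text}" for el in root.iter("subway")]


print(render(DOC, "it"))
```

The query side never changes when a locale is added; only the label table does.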
