XML.org
XML Design: a series of an item or a collection of specific types of items?

Hi Folks,

I have data about Books. 

Some books are fiction and some are non-fiction.

Regardless of whether a book is fiction or non-fiction, I have title and author information.

If a book is fiction, I have data about the age-group for which it is intended: teen, young adult, adult, or all.

If a book is non-fiction, I have data about its field: math, chemistry, physics, or astronomy.

Below are two designs. When would one design be preferred over the other? What factors would push you toward adopting one design over the other?

Design #1 - Series of Book Elements

Here is an example to illustrate this design:

<Books>
    <Book>
        <Type>Non-Fiction</Type>
        <Field>Astronomy</Field>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Book>
    <Book>
        <Type>Fiction</Type>
        <Age-Group>All</Age-Group>
        <Title>The Alchemist</Title>
        <Author>Paulo Coelho</Author>
    </Book>
</Books>

There is a series of <Book> elements. The <Type> element identifies the kind of book. The data that is common to all books - Type, Title, and Author - is included in each Book element. The data that is unique to non-fiction books - Field - is only included in the non-fiction Book element.  The data that is unique to fiction books - Age-Group - is only included in the fiction Book element.  

A grammar for this design specifies that Books contains one or more Book elements:

Books --> Book+

The grammar rule for Book mandates the common elements - Type, Title, and Author - and makes the type-specific elements - Field and Age-Group - optional:

Book --> Type, Field?, Age-Group?, Title, Author

An unfortunate aspect of this design is that someone creating an XML instance document could accidentally create a non-fiction book that includes the Age-Group element. The grammar permits it, because Field and Age-Group are both optional regardless of the value of Type. We could use Schematron to catch this.
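To make the co-constraint concrete, here is a minimal sketch in Python. It is a stand-in for the kind of rule a Schematron schema would express, not Schematron itself. The RULES table, the function name, and the "Mislabeled" book are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Design #1 instance. The second Book is a deliberately invalid,
# hypothetical example: a non-fiction book carrying Age-Group.
DESIGN_1 = """
<Books>
    <Book>
        <Type>Non-Fiction</Type>
        <Field>Astronomy</Field>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Book>
    <Book>
        <Type>Non-Fiction</Type>
        <Age-Group>All</Age-Group>
        <Title>Mislabeled</Title>
        <Author>Anonymous</Author>
    </Book>
</Books>
"""

# Assumed rule table: the type-specific child each Type requires and forbids.
RULES = {
    "Non-Fiction": {"required": "Field", "forbidden": "Age-Group"},
    "Fiction": {"required": "Age-Group", "forbidden": "Field"},
}


def co_constraint_errors(xml_text):
    """Return one message per Book that violates its Type's rule."""
    errors = []
    for book in ET.fromstring(xml_text).findall("Book"):
        book_type = book.findtext("Type")
        title = book.findtext("Title")
        rule = RULES.get(book_type)
        if rule is None:
            errors.append(f"{title}: unknown Type {book_type!r}")
            continue
        # Compare against None: an element with no children is falsy.
        if book.find(rule["required"]) is None:
            errors.append(f"{title}: {book_type} book is missing {rule['required']}")
        if book.find(rule["forbidden"]) is not None:
            errors.append(f"{title}: {book_type} book must not contain {rule['forbidden']}")
    return errors


for message in co_constraint_errors(DESIGN_1):
    print(message)
```

The point is only that the check depends on the value of Type, which is exactly what the grammar rule for Book cannot express.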

The beauty of this design is that a query for Books will return all the Books. If, in the future, there are also books of type, say, History and Philosophy, the query will still work unchanged. Thus, this design is extensible, at least from a query perspective.
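As a rough illustration of that query, here is a sketch using Python's standard ElementTree module. The History book and the function name are hypothetical; they are there only to show that a new Type value needs no change to the query.

```python
import xml.etree.ElementTree as ET

# Design #1 instance with a hypothetical History book added.
DESIGN_1 = """
<Books>
    <Book>
        <Type>Non-Fiction</Type>
        <Field>Astronomy</Field>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Book>
    <Book>
        <Type>Fiction</Type>
        <Age-Group>All</Age-Group>
        <Title>The Alchemist</Title>
        <Author>Paulo Coelho</Author>
    </Book>
    <Book>
        <Type>History</Type>
        <Title>A Hypothetical History</Title>
        <Author>Jane Doe</Author>
    </Book>
</Books>
"""


def all_books(xml_text):
    """Return (Type, Title) for every Book child of the root."""
    root = ET.fromstring(xml_text)
    # A single path, "Book", selects every book whatever its Type is.
    return [(b.findtext("Type"), b.findtext("Title")) for b in root.findall("Book")]


for book_type, title in all_books(DESIGN_1):
    print(book_type, "-", title)
```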

(Personally, I like the simple, repetitive nature of this design. And I am not concerned about the need for adding an additional layer of Schematron validation because I typically supplement grammar-based validation with Schematron co-constraint validation.)

Design #2 - Collection of Fiction and Non-Fiction Elements

Here is an example to illustrate this design:

<Books>
    <Non-Fiction>
        <Field>Astronomy</Field>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Non-Fiction>
    <Fiction>
        <Age-Group>All</Age-Group>
        <Title>The Alchemist</Title>
        <Author>Paulo Coelho</Author>
    </Fiction>
</Books>

The content of the Books element is a repeatable choice of either a Non-Fiction element or a Fiction element. 

A grammar for this design expresses that choice directly:

Books --> (Non-Fiction | Fiction)+

The grammar rule for Non-Fiction mandates the common elements - Title and Author - and mandates its type-specific element - Field:

Non-Fiction --> Field, Title, Author

The grammar rule for Fiction mandates the common elements - Title and Author - and mandates its type-specific element - Age-Group:

Fiction --> Age-Group, Title, Author

An unfortunate aspect of this design is that querying for all Books might not be easy, especially if there are, say, Magazine and CD elements mixed in with the Fiction and Non-Fiction elements. The query would have to call out each type of book: "Give me all Non-Fiction and Fiction elements." If, in the future, there are also, say, History and Philosophy elements, then the query will have to be modified. Thus, this design is not extensible, at least from a query perspective.
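Here is a sketch of what that looks like in practice, again with Python's ElementTree. The Magazine and History elements, their content, and the BOOK_ELEMENTS tuple are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Design #2 instance with a hypothetical Magazine (not a book) and a
# hypothetical History element (a new kind of book) mixed in.
DESIGN_2 = """
<Books>
    <Non-Fiction>
        <Field>Astronomy</Field>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Non-Fiction>
    <Fiction>
        <Age-Group>All</Age-Group>
        <Title>The Alchemist</Title>
        <Author>Paulo Coelho</Author>
    </Fiction>
    <Magazine>
        <Title>A Hypothetical Magazine</Title>
    </Magazine>
    <History>
        <Title>A Hypothetical History</Title>
        <Author>Jane Doe</Author>
    </History>
</Books>
"""

# The query must name every element that counts as a book.
BOOK_ELEMENTS = ("Non-Fiction", "Fiction")


def all_books(xml_text, book_elements=BOOK_ELEMENTS):
    """Return (element name, Title) for children whose name is in book_elements."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.findtext("Title"))
            for child in root if child.tag in book_elements]


# History is silently missed until the list of names is extended.
print(all_books(DESIGN_2))
print(all_books(DESIGN_2, BOOK_ELEMENTS + ("History",)))
```

Nothing in the instance document says that History is a book and Magazine is not, so that knowledge has to live in the query.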

The beauty of this design is that there is no concern for someone accidentally creating a non-fiction book that includes the Age-Group element.
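To see why the grammar alone is enough here, consider this minimal sketch. It encodes the three Design #2 rules as regular expressions over child element names. It is a toy checker written for this post, not a real schema validator; the CONTENT_MODELS table and the conforms function are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Each rule is a regular expression over the space-terminated names of an
# element's children, mirroring the three grammar rules for Design #2.
CONTENT_MODELS = {
    "Books": r"(?:(?:Non-Fiction|Fiction) )+",
    "Non-Fiction": r"Field Title Author ",
    "Fiction": r"Age-Group Title Author ",
}


def conforms(element):
    """Check an element, and recursively its children, against the rules."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # leaf elements such as Title hold text only; not checked
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(conforms(child) for child in element)


GOOD = """
<Books>
    <Non-Fiction>
        <Field>Astronomy</Field>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Non-Fiction>
</Books>
"""

# Hypothetical mistake: a non-fiction book that also carries Age-Group.
BAD = """
<Books>
    <Non-Fiction>
        <Field>Astronomy</Field>
        <Age-Group>All</Age-Group>
        <Title>Cosmos</Title>
        <Author>Carl Sagan</Author>
    </Non-Fiction>
</Books>
"""

print(conforms(ET.fromstring(GOOD)))
print(conforms(ET.fromstring(BAD)))
```

The rule for Non-Fiction has no slot for Age-Group, so the mistake is rejected without any co-constraint layer.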

(Personally, I don't like this design. The non-repetitiveness is bothersome to me. And it requires additional grammar rules, which makes things more complicated.)
--------------------------------

Okay, I'd like to hear your thoughts. From your experience, how do the two designs impact querying? Application processing? Schematron development? Etc.? Which design would you choose? Why?

/Roger




Copyright 1993-2007 XML.org. This site is hosted by OASIS