OASIS Mailing List Archives


ID/IDREF is evil

Hi Folks,

In this message I will attempt to persuade you:

1. Do not use the ID/IDREF capability.

2. Use a layering approach: 

	(a) Layer 1: express your XML as a context-free grammar.

	(b) Layer 2: express context-sensitive rules using Schematron.

3. The ID/IDREF capability is a context-sensitive rule.

Now for my argument:

First, let me persuade you that by using ID/IDREF you have introduced context-sensitive rules into your XML. Consider this XML, which does not use ID/IDREF:

<Book>
      <Title>Principles of Programming</Title>
      <Author>M. A. Jackson</Author>
</Book>

To show the rule-like nature of XML, let's express it as grammar rules:

Book 	--> Title Author
Title 	--> string
Author 	--> string

That's a context-free grammar. 
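
As a rough illustration, the three rules above can be written down as a table and checked against a parsed document. This is only a sketch in Python using the standard xml.etree.ElementTree module; the names RULES and conforms are made up for this example, and a <Book> wrapper element is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of the three rules: each element name maps to the
# exact sequence of child elements it may be replaced by. An empty list
# means the element holds only a string.
RULES = {
    "Book": ["Title", "Author"],
    "Title": [],
    "Author": [],
}

def conforms(element):
    """Return True if element and all of its descendants follow RULES."""
    expected = RULES.get(element.tag)
    if expected is None:
        return False  # no rule exists for this element name
    children = list(element)
    if [child.tag for child in children] != expected:
        return False
    return all(conforms(child) for child in children)

book = ET.fromstring(
    "<Book>"
    "<Title>Principles of Programming</Title>"
    "<Author>M. A. Jackson</Author>"
    "</Book>"
)
print(conforms(book))  # prints True
```

Note that each rule is checked by looking only at an element and its own children. Nothing elsewhere in the document can change the answer, which is what makes the rules context-free.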

Now let's add an ID/IDREF:

<Book seller="Amazon">
      <Title>Principles of Programming</Title>
      <Author>M. A. Jackson</Author>
</Book>

Assume that @seller is of type IDREF. I don't show the corresponding ID attribute.

Let's express that XML using grammar rules. The rule for the Book element depends on the existence of a corresponding ID attribute; if there is none, the Book rule is invalid. So we may express Book's rule like so:

Book Amazon --> Title Author

Read that as:
	In the context of an Amazon symbol 
	the Book element may be replaced 
	by Title and Author.

In other words, our grammar tells us that this is a valid string 

	Principles of Programming M. A. Jackson

only if the symbol "Amazon" exists.

See the context-sensitivity? Book is context-sensitive due to the ID/IDREF.

Any time you use ID/IDREF in your XML document you have introduced a context-sensitive rule into your XML document.

"So what?" you ask.

Well, here's so what:
	All known parsing algorithms for context-sensitive
	grammars are either very inefficient or very complex.

	Reasoning about context-sensitive grammars is difficult.

	Proofs about context-sensitive grammars are difficult.

	Take a cue from compiler developers: they handle
	context-sensitive processing in a separate pass.

So don't use ID/IDREF. 

Of course, that doesn't mean your data will never have intra-data dependencies. What it means is that you should modularize your grammar rules: express your context-free rules in your XML document and express your context-sensitive rules (intra-data dependencies) in Schematron. That's a nice, clean separation of concerns. That's a modular data design.
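
The two layers can be sketched in code. The example below is not Schematron; it is a Python stand-in using only the standard library, meant to show the shape of the idea. It assumes a hypothetical <Books> root and a hypothetical <Seller name="..."> element as the thing a Book refers to, and it treats @seller as a plain attribute rather than an IDREF. The second Book and the function name dangling_sellers are likewise invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the second Book names a seller that is not declared.
DOC = """
<Books>
  <Seller name="Amazon"/>
  <Book seller="Amazon">
    <Title>Principles of Programming</Title>
    <Author>M. A. Jackson</Author>
  </Book>
  <Book seller="Acme">
    <Title>Another Title</Title>
    <Author>A. N. Other</Author>
  </Book>
</Books>
"""

def dangling_sellers(root):
    """Layer 2: return seller names used by a Book but never declared."""
    declared = {seller.get("name") for seller in root.iter("Seller")}
    return [
        book.get("seller")
        for book in root.iter("Book")
        if book.get("seller") not in declared
    ]

# Layer 1: parse the document; no cross-references are examined here.
root = ET.fromstring(DOC.strip())

# Layer 2: a separate pass checks the intra-data dependency.
print(dangling_sellers(root))  # prints ['Acme']
```

The parser never needs to know about sellers, and the reference check never needs to know about element order. In a real design the second pass would be a Schematron rule rather than hand-written code.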

Let's recap:

1. ID/IDREF introduces context-sensitive rules into your XML grammar wherever there is an ID attribute and wherever there is an IDREF attribute.

2. Don't use ID/IDREF.

3. Modularize your rules: express context-free rules in XML and express context-sensitive rules in Schematron.




Copyright 1993-2007 XML.org. This site is hosted by OASIS