Hi Folks,
In this message I will attempt to persuade you of the following:
1. Do not use the ID/IDREF capability.
2. Use a layering approach:
(a) Layer 1: express your XML as a context-free grammar.
(b) Layer 2: express context-sensitive rules using Schematron.
3. The ID/IDREF capability is a context-sensitive rule.
Now for my argument:
First, let me persuade you that by using ID/IDREF you have introduced context-sensitive rules into your XML. Consider this XML, which does not use ID/IDREF:
<Book>
<Title>Principles of Programming</Title>
<Author>M. A. Jackson</Author>
</Book>
To show the rule-like nature of that XML, let's express it as grammar rules:
Book --> Title Author
Title --> string
Author --> string
That's a context-free grammar.
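For readers who think in schema terms, the same three productions can be written as DTD element declarations. This is only a sketch: it assumes Title and Author hold plain text, which is what "string" stands for in the rules above.

```xml
<!DOCTYPE Book [
  <!-- Book is replaced by Title followed by Author -->
  <!ELEMENT Book (Title, Author)>
  <!-- Title and Author are replaced by character data -->
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
]>
<Book>
  <Title>Principles of Programming</Title>
  <Author>M. A. Jackson</Author>
</Book>
```

Each declaration rewrites one symbol without looking at anything else in the document, which is what makes the grammar context-free.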
Now let's add an ID/IDREF:
<Book seller="Amazon">
<Title>Principles of Programming</Title>
<Author>M. A. Jackson</Author>
</Book>
Assume that @seller is of type IDREF. I don't show the corresponding ID attribute.
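For concreteness, here is one way the complete document might look with the ID side included. The Catalog and Seller elements and the name attribute are made up for illustration; the example above does not say where the ID lives.

```xml
<!DOCTYPE Catalog [
  <!ELEMENT Catalog (Seller, Book)>
  <!ELEMENT Seller EMPTY>
  <!-- Hypothetical ID side: each Seller is identified by its name -->
  <!ATTLIST Seller name ID #REQUIRED>
  <!ELEMENT Book (Title, Author)>
  <!-- IDREF side: seller must match some ID value in the document -->
  <!ATTLIST Book seller IDREF #REQUIRED>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
]>
<Catalog>
  <Seller name="Amazon"/>
  <Book seller="Amazon">
    <Title>Principles of Programming</Title>
    <Author>M. A. Jackson</Author>
  </Book>
</Catalog>
```

Remove the Seller element and a validating parser rejects the Book, even though nothing inside Book has changed.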
Let's express that XML using grammar rules. The rule for the Book element depends on the existence of a corresponding ID attribute; if there is none, the Book rule is invalid. So we may express Book's rule like so:
Book Amazon --> Title Author
Read that as:
In the context of an Amazon symbol
the Book element may be replaced
by Title and Author.
In other words, our grammar tells us that this is a valid string
Principles of Programming M. A. Jackson
only if the symbol "Amazon" exists.
See the context-sensitivity? Book is context-sensitive due to the ID/IDREF.
Any time you use ID/IDREF in your XML document you have introduced a context-sensitive rule into your XML document.
"So what?" you ask.
Well, here's so what:
All known parsing algorithms for context-sensitive
grammars are either very inefficient or very complex.
Reasoning about context-sensitive grammars is difficult.
Proofs about context-sensitive grammars are difficult.
Take a cue from compiler developers: they separate
context-sensitive processing into a separate pass.
So don't use ID/IDREF.
Of course, that doesn't mean you will never have data that has intra-data dependencies. What it means is that you should modularize your grammar rules: express your context-free rules in your XML document and express your context-sensitive rules (intra-data dependencies) in Schematron. That's a nice, clean separation-of-concerns. That's a modular data design.
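As a minimal sketch of what Layer 2 could look like, suppose @seller is declared as an ordinary string rather than an IDREF, and the sellers are listed in hypothetical Seller elements carrying a name attribute (neither is part of the example above). A Schematron rule can then state the dependency explicitly:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:title>Context-sensitive rules for Book (illustrative)</sch:title>
  <sch:pattern>
    <!-- Fires for every Book that names a seller -->
    <sch:rule context="Book[@seller]">
      <!-- Assumed data layout: sellers appear as <Seller name="..."/> -->
      <sch:assert test="@seller = //Seller/@name">
        The seller "<sch:value-of select="@seller"/>" must match
        the name of a Seller element in this document.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The grammar layer stays context-free (Book still contains just Title and Author), and the cross-reference check runs as its own pass, much like the separate pass in a compiler.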
Let's recap:
1. ID/IDREF introduces context-sensitive rules into your XML grammar wherever there is an ID attribute and wherever there is an IDREF attribute.
2. Don't use ID/IDREF.
3. Modularize your rules: express context-free rules in XML and express context-sensitive rules in Schematron.
Comments?
/Roger
_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development.
List archive: http://lists.xml.org/archives/xml-dev/