Re: [xml-dev] ID/IDREF is evil


From: Michael Sokolov <msokolov@safaribooksonline.com>
To: "Costello, Roger L." <costello@mitre.org>, "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date: Mon, 03 Feb 2014 17:23:00 -0500

I thought ID/IDREF was about expressing the integrity of links (an IDREF must have a matching ID), not about context-sensitivity. Then again, I never used it: my links always pointed to different documents, so it seemed useless (rather than evil). I could be wrong.

-Mike

On 02/03/2014 05:05 PM, Costello, Roger L. wrote:

Hi Folks,

In this message I will attempt to persuade you:

1. Do not use the ID/IDREF capability.

2. Use a layering approach:

(a) Layer 1: express your XML as a context-free grammar.

(b) Layer 2: express context-sensitive rules using Schematron.

3. The ID/IDREF capability is a context-sensitive rule.

Now for my argument:

First, let me persuade you that by using ID/IDREF you have introduced context-sensitive rules into your XML. Consider this XML, which does not use ID/IDREF:

<Book>
<Title>Principles of Programming</Title>
<Author>M. A. Jackson</Author>
</Book>

To show the rule-like nature of XML, let's express it as a set of productions:

Book --> Title Author
Title --> string
Author --> string

That's a context-free grammar.
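
As a rough sketch (not part of the original message), the three productions above can be checked against a parsed document using Python's standard xml.etree.ElementTree. The GRAMMAR table and the conforms() helper are names made up for illustration. The point is that each element is checked by looking only at its own children, never at anything elsewhere in the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of the productions above:
# element name -> ordered list of child element names ([] means text only).
GRAMMAR = {
    "Book": ["Title", "Author"],
    "Title": [],
    "Author": [],
}

def conforms(element):
    """Return True if element and all of its descendants match GRAMMAR."""
    expected = GRAMMAR.get(element.tag)
    if expected is None:
        return False  # no production for this element name
    if [child.tag for child in element] != expected:
        return False  # children do not match the production's right-hand side
    return all(conforms(child) for child in element)

doc = ET.fromstring(
    "<Book><Title>Principles of Programming</Title>"
    "<Author>M. A. Jackson</Author></Book>"
)
print(conforms(doc))  # True
```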

Now let's add an ID/IDREF:

<Book seller="Amazon">
<Title>Principles of Programming</Title>
<Author>M. A. Jackson</Author>
</Book>

Assume that @seller is of type IDREF. I don't show the corresponding ID attribute.

Let's express that XML using grammar rules. The rule for the Book element depends on the existence of a corresponding ID attribute; if there is none, the Book rule is invalid. So we may express Book's rule like so:

Book Amazon --> Title Author

Read that as:

In the context of an Amazon symbol
the Book element may be replaced
by Title and Author.

In other words, our grammar tells us that this is a valid string

Principles of Programming M. A. Jackson

only if the symbol "Amazon" exists.

See the context-sensitivity? Book is context-sensitive due to the ID/IDREF.

Any time you use ID/IDREF in your XML document you have introduced a context-sensitive rule into your XML document.

"So what?" you ask.

Well, here's so what:

All known parsing algorithms for context-sensitive
grammars are either very inefficient or very complex.

Reasoning about context-sensitive grammars is difficult.

Proofs about context-sensitive grammars are difficult.

Take a cue from compiler developers: they move
context-sensitive processing into a separate pass.

So don't use ID/IDREF.

Of course, that doesn't mean you will never have data that has intra-data dependencies. What it means is that you should modularize your grammar rules: express your context-free rules in your XML document and express your context-sensitive rules (intra-data dependencies) in Schematron. That's a nice, clean separation-of-concerns. That's a modular data design.
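
A minimal sketch of the layering idea follows, written in plain Python as a stand-in for Schematron rather than as Schematron itself. The Catalog and Seller elements, the name attribute, and the dangling_sellers() helper are all assumptions made for illustration, since the message does not show the element that carries the matching ID. The first pass only parses the document; the cross-reference check runs as a separate second pass over the whole tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <Catalog>, <Seller> and @name are assumed for
# illustration; the message does not show the target of @seller.
SAMPLE = """
<Catalog>
  <Seller name="Amazon"/>
  <Book seller="Amazon">
    <Title>Principles of Programming</Title>
    <Author>M. A. Jackson</Author>
  </Book>
  <Book seller="NoSuchSeller">
    <Title>Principles of Programming</Title>
    <Author>M. A. Jackson</Author>
  </Book>
</Catalog>
""".strip()

def dangling_sellers(root):
    """Layer 2: return every @seller value with no matching Seller/@name."""
    known = {seller.get("name") for seller in root.iter("Seller")}
    return [
        book.get("seller")
        for book in root.iter("Book")
        if book.get("seller") is not None and book.get("seller") not in known
    ]

root = ET.fromstring(SAMPLE)   # layer 1: parse the structure only
print(dangling_sellers(root))  # layer 2: prints ['NoSuchSeller']
```

Note that the second pass needs the full set of Seller names before it can judge any single Book, which is exactly the dependency that the first pass avoids.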

Let's recap:

1. ID/IDREF introduces context-sensitive rules into your XML grammar wherever there is an ID attribute and wherever there is an IDREF attribute.

2. Don't use ID/IDREF.

3. Modularize your rules: express context-free rules in XML and express context-sensitive rules in Schematron.

Comments?

/Roger
