OASIS Mailing List Archives

   Re: [xml-dev] What are the characteristics of a good type system for XML


From: "Jeni Tennison" <jeni@jenitennison.com>

> Finally, I think that there should be limits on the scope of a type
> definition. XML types like ID and ENTITY have too wide a scope, in my
> opinion, in that they specify constraints across entire documents as
> well as on particular lexical representations.  

I wonder if the reverse is not true: that IDs don't have enough scope.

I am increasingly thinking that most of the criticisms of IDs, namely that
they are poor keys, beg the question of whether IDs are really
keys at all.  (Or, rather, that thinking of them in terms of keys diverts
attention from their primary role as link targets.)

In the general usage of 'key', as in [1] "Candidate key:  <database> 
  One of several possible attributes or combinations of attributes which 
  can be used to uniquely identify a body of information (a "record")."
we could say an ID is a key.

But if we look at relational usage, Codd's 12 rules[2] for databases include:

  "Rule 2: Guaranteed Access Rule 
  Each and every datum (atomic value) in a relational database is 
  guaranteed to be logically accessible by resorting to a table name, 
  primary key value, and column name. "

Because IDs can be #IMPLIED, they are not anywhere near the 
same animal as the primary key in (relational) database terms.
So what animal are they?  Perhaps secondary keys? 

  "A secondary index, put simply, is a way to efficiently access records 
  in a database (the primary) by means of some piece of information other 
  than the usual (primary) key...Secondary indices can be (and often are) 
  created manually by the application" [3]

So IDs are not really primary keys, because not every atom of information 
can be found using them + some static information.  And IDs 
are a special case of secondary keys, because they are unique 
within a document.

(Indeed, there is a further problem with thinking of IDs in terms of
database keys: that is in thinking that an XML document is indeed
a database, in the sense of being a collection of facts.)

IDs are one end of the non-tree structure in an XML document,
a target for links.  So perhaps they don't have enough scope:
perhaps documents should be given "keyscopes" which allow
elements with IDs from one document to be pasted into another.
Here is a mechanism, rather like the dreaded namespaces (no
flames please), to demonstrate what I mean:

<x>
    <y keyscope="some uri">
        <z id="a1" />
        <zref idref="a1"/>
    </y>
    <y keyscope="some other uri"> 
        <z id="a1" />
        <zref idref="a1"/>
        <zref idref="aa:a1"  keyscope:aa="some uri"/>
    </y>
</x>
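
To make the lookup rule concrete, here is a minimal Python sketch of how software might resolve idrefs under such a scheme. Everything in it is an assumption for illustration: the function name resolve_idrefs is hypothetical, the xmlns:keyscope declaration (and its urn:example:keyscope value) is added only so that a namespace-aware parser accepts the keyscope:aa attribute, and a prefixed idref is looked up only against declarations on the referencing element itself, as in the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, declared only so "keyscope:aa" parses.
KS_NS = "urn:example:keyscope"

DOC = """
<x xmlns:keyscope="urn:example:keyscope">
    <y keyscope="some uri">
        <z id="a1"/>
        <zref idref="a1"/>
    </y>
    <y keyscope="some other uri">
        <z id="a1"/>
        <zref idref="a1"/>
        <zref idref="aa:a1" keyscope:aa="some uri"/>
    </y>
</x>
"""


def resolve_idrefs(root):
    """Return (idref, scope URI, target element or None) per idref, in document order."""
    targets = {}  # (scope URI, id) -> element carrying that id
    refs = []     # (referencing element, scope it sits in)

    def walk(elem, scope):
        # A keyscope attribute replaces the inherited scope for this subtree.
        scope = elem.get("keyscope", scope)
        if "id" in elem.attrib:
            key = (scope, elem.get("id"))
            if key in targets:
                raise ValueError(f"duplicate id {key[1]!r} in keyscope {scope!r}")
            targets[key] = elem
        if "idref" in elem.attrib:
            refs.append((elem, scope))
        for child in elem:
            walk(child, scope)

    walk(root, None)

    resolved = []
    for elem, scope in refs:
        idref = elem.get("idref")
        prefix, sep, local = idref.partition(":")
        if sep:
            # Prefixed idref: the prefix names a scope declared on this element.
            scope = elem.get(f"{{{KS_NS}}}{prefix}")
        else:
            # Unprefixed idref: look in the enclosing scope.
            local = idref
        resolved.append((idref, scope, targets.get((scope, local))))
    return resolved


results = resolve_idrefs(ET.fromstring(DOC))
```

With this reading, the two unprefixed idref="a1" references land on different z elements, while idref="aa:a1" reaches back into the first y. A duplicate ID is an error only within a single keyscope.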

In other words, instead of (or as well as) the typing focus on
what elements can be pointed to by another element (i.e. the
kinds of concerns that keys raise), perhaps we need to consider
the modularity/document composition concern of how to
allow cut and paste between documents without having to reallocate 
IDs: that is, how to make IDs universally unique. 

What might a suitable keyscope URI be? Well, initially it can be
the original document that the data was cut and pasted from.
This brings us to a kind of transclusion: we keep track of the
source of the fragment. 

<x>
    <y keyscope="http://www.eg.com/somedocument.xml">
        <z id="a1" />
        <zref idref="a1"/>
    </y>
    <y>
        <z id="a1" />
        <zref idref="a1"/>
        <zref idref="aa:a1"  keyscope:aa="http://www.eg.com/somedocument.xml"/>
    </y>
</x>

where some smart software could figure out that there was some equivalence between
<zref idref="aa:a1"  keyscope:aa="http://www.eg.com/somedocument.xml"/>
and
<zref xlink:href="http://www.eg.com/somedocument.xml#a1"/>
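
As a rough sketch of what that "smart software" might do, the following Python function turns a scoped idref into the equivalent URI reference. The function name, its parameters, and the choice to render an unscoped idref as a bare fragment are all assumptions made for illustration, not part of any specification.

```python
def scoped_idref_to_href(idref, scopes, default_scope=None):
    """Map a scoped idref such as "aa:a1" to "<scope URI>#a1".

    scopes maps prefixes to keyscope URIs (hypothetical representation);
    default_scope is the keyscope enclosing the reference, if there is one.
    """
    prefix, sep, local = idref.partition(":")
    if sep:
        base = scopes[prefix]          # prefixed: use the declared scope
    else:
        base, local = default_scope, idref
    if base is None:
        return "#" + local             # no scope at all: same-document link
    return f"{base.strip()}#{local}"   # tolerate stray whitespace in the URI


href = scoped_idref_to_href("aa:a1", {"aa": "http://www.eg.com/somedocument.xml"})
```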

How would you link to it? You could just link to the original document:
http://www.eg.com/somedocument.xml#a1
Or some XPointer scheme could be made:
http://www.eg.com/newdocument#id(http://www.eg.com/somedocument.xml#a1)
or perhaps one using some keyspace declarator:
http://www.eg.com/newdocument#keyspace(http://www.eg.com/somedocument.xml, aa)id(aa:a1)

Cheers
Rick Jelliffe

[1] http://burks.brighton.ac.uk/burks/foldoc/66/92.htm
[2] http://newton.uor.edu/FacultyFolder/CKettemborough/Codd12R.html
[3] http://www.sleepycat.com/docs/ref/am/second.html




 


Copyright 2001 XML.org. This site is hosted by OASIS