   Re: "Multiple" Namespaces? (but NOT for HTML)

  • From: Paul Prescod <paul@prescod.net>
  • To: xml-dev@ic.ac.uk
  • Date: Wed, 27 Oct 1999 19:58:39 -0500

It seems that most of the "usual suspects" did not get around to
discussing your problem, which is a pity because I think that it is one
of the most fundamental in the XML world. When I win the lottery, I will
spend a year or two studying it. Like other problems you've brought up,
it is at (er, probably past) the boundary of what we know how to do
scalably with modern technology. Sorry.

I claim that your problem is, in fact, intimately related not only to
the multiple HTML namespaces problem but also to the representation of
XLinks.

Let me first suggest that the solution to your problem is probably not
to put various element type names in one tag. I could be wrong on this
point so I'll trust you to set me straight if that's the case.

>         <DC:Creator GILS:Originator TEI:docAuthor>Tillich</DC:Creator
>                 GILS:Originator TEI:docAuthor>

Now you've said explicitly that your goal is to avoid duplicating your
data across multiple documents. But is duplicating the semantic "author"
in every tag any better? I'm guessing that DC:Creator is *always* going
to be a synonym for TEI:docAuthor, which means that saying so explicitly
in the document is redundant. It causes all of the usual problems of
database redundancy:

 * It increases the size of your database: it will quadruple (at least)
   your indexes.
 * It increases the possibility for error: authors or data generators
   could "forget" to insert a TEI:docAuthor alongside a DC:Creator.
 * It reduces optimization opportunities because the database won't
   cache "synonyms" properly.

Old-fashioned SGML smelliness aside, architectural forms were designed
to solve exactly this problem. Proponents claim that one of their great
virtues is that they allow you to do the mapping in EITHER the document
(duplicating data) OR the DTD (centralizing it). I'm not really happy
that they allow the "inline" mode, but the "centralized" mode is just
what you need. If you can convince me that you really need multiple
element type names *in each and every tag* then you will be the first
to do so.

As far as your "standards based" requirement: you can't beat an
"International Standard". 

Architectural forms are expressed as attributes but they are supposed to
be INTERPRETED by an architectural processor (like nsgmls and jade) as 
if they were element type names (generic identifiers).  The syntax is,
IMHO, a hack to avoid violating XML's (and SGML's) rules. Note that
XLink borrows heavily from the hack.
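
To make the "attributes interpreted as element type names" idea concrete,
here is a rough Python sketch of the inline mode. It is only an
illustration, not a real architectural processor: the attribute names DC
and GILS, the sample document, and the helper architectural_view are all
made up for this example, and the architecture declarations that a real
engine such as nsgmls would read are ignored entirely.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each element names its architectural form
# in an attribute whose name stands for the architecture (DC, GILS).
DOC = """<teiHeader>
  <docAuthor DC="Creator" GILS="Originator">Tillich</docAuthor>
  <docTitle DC="Title">An example title</docTitle>
</teiHeader>"""


def architectural_view(root, arch):
    """Return (form_name, text) pairs, in document order, for every
    element that carries the attribute `arch`. The attribute value is
    treated as if it were the element type name in that architecture."""
    view = []
    for elem in root.iter():
        form = elem.get(arch)
        if form is not None:
            view.append((form, (elem.text or "").strip()))
    return view


root = ET.fromstring(DOC)
dc_view = architectural_view(root, "DC")
gils_view = architectural_view(root, "GILS")
print(dc_view)    # the document as a DC application would see it
print(gils_view)  # the same document as a GILS application would see it
```

The element is written once as docAuthor, and each architecture gets its
own view of it without a second or third tag name in the markup.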

I claim then, that what you need is a database that understands either
architectural forms or some similar technology. It would index in terms
of synonyms and recognize that asking for one synonym is as easy as
asking for another. As far as I know, architectural form indexing and
caching have never been implemented in a large-scale (multi-gigabyte) XML
database system, but I could be wrong.
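
A toy Python sketch of what "indexing in terms of synonyms" might look
like in the centralized mode. Everything here is an assumption made for
illustration: the SYNONYMS table, the canonical name "author", and the
SynonymIndex class are invented, and an in-memory dictionary says nothing
about whether the approach scales to multi-gigabyte collections.

```python
from collections import defaultdict

# Hypothetical centralized mapping, kept outside the documents:
# element type name -> canonical concept.
SYNONYMS = {
    "DC:Creator": "author",
    "GILS:Originator": "author",
    "TEI:docAuthor": "author",
}


class SynonymIndex:
    """Index values under one canonical name per synonym group, so a
    query for any synonym reads the same posting list."""

    def __init__(self, synonyms):
        self.synonyms = synonyms
        self.postings = defaultdict(list)

    def canonical(self, name):
        # Names with no declared synonym are their own canonical form.
        return self.synonyms.get(name, name)

    def add(self, doc_id, element_name, value):
        self.postings[self.canonical(element_name)].append((doc_id, value))

    def query(self, element_name):
        return list(self.postings.get(self.canonical(element_name), []))


index = SynonymIndex(SYNONYMS)
index.add("doc1", "TEI:docAuthor", "Tillich")  # tagged once, TEI only
index.add("doc2", "DC:Creator", "Tillich")     # tagged once, DC only

# Neither document mentions GILS, yet the GILS query finds both.
print(index.query("GILS:Originator"))
```

Each value is stored once, so the index does not grow with the number of
synonyms, and nobody can "forget" to add the second or third name.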

There is hope, however. "Out of line" architectural forms are about to
be reinvented as "archetypes." Once they are reinvented in a syntax that
is OO-friendly and W3C-approved, it will become obvious that people will
need to do XPath-like queries based not only on element types, but also
on archetypes. Finally, search engine vendors are likely to "get it."
Whether they will be able to develop scalable algorithms to do it in the
general case is another question...

 Paul Prescod









 
