   RE: Success factors for the Web and Semantic Web

  • From: Miles Sabin <MSabin@interx.com>
  • To: xml-dev@lists.xml.org
  • Date: Thu, 21 Dec 2000 16:14:22 +0000

Francis Norton wrote,
> The problem with meta tags is that they are categorised self-
> citations. If we agree that Google is good now, could it not be 
> even better if sites categorised their external citations? 

Not necessarily, no.

Sure, annotating links seems like it might solve the problem of 
distinguishing between,

  <a href="http://foo.example.com/rubbish.html">This is junk</a>

and

  <a href="http://bar.example.com/cool.html">Worth a look</a>

And maybe it does, up to a point, but as Mike pointed out we 
still have the problem of authority and trust to deal with. If I 
link to w3.org but categorize the link as a link to "fisheries, 
haddock" what's a poor search engine to do? Sure, we can try and
aggregate over many link categorizations and hope that nonsense, 
mistakes and damned lies will get drowned out. But how do you 
aggregate categorizations exactly? 
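
To make the aggregation question concrete, here is a minimal sketch of
the most naive answer: a majority vote over the labels attached to each
link target. The function name aggregate_labels, the sample labels and
the exact-match comparison are all assumptions made for illustration,
not anything a real search engine is known to do.

```python
from collections import Counter, defaultdict

def aggregate_labels(annotations):
    """Majority vote over (target_url, label) pairs.

    Returns {target_url: (most_common_label, share_of_votes)}.
    Labels are compared as normalised strings only, which is
    exactly the weakness this sketch is meant to expose.
    """
    votes = defaultdict(Counter)
    for url, label in annotations:
        votes[url][label.strip().lower()] += 1

    result = {}
    for url, counts in votes.items():
        label, n = counts.most_common(1)[0]
        result[url] = (label, n / sum(counts.values()))
    return result

# Hypothetical annotations from four independent authors.
annotations = [
    ("http://www.w3.org/", "web standards"),
    ("http://www.w3.org/", "Web Standards"),
    ("http://www.w3.org/", "fisheries, haddock"),
    ("http://www.w3.org/", "standards bodies"),
]
summary = aggregate_labels(annotations)
```

The haddock label does get outvoted here, but "web standards" and
"standards bodies" are counted as unrelated categories. Voting only
works once everyone shares a vocabulary, which is the hard part.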

Ignoring metadata and working with the raw link topology is
driven by the assumptions that referrers (i.e. human authors) will
more often than not make relevant links; that if several things
are all linked to from the same place they're quite likely to be
related in some way; and that links will cluster naturally around 
relatively distinct topics of interest rather than degenerating 
into mush. Notice that we can say all of this without once having
to worry about what any of the stuff actually _means_.
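
The second assumption, that things linked to from the same place are
probably related, can be sketched as a simple co-citation count. This
is only an illustration under stated assumptions: the function name
cocitation_counts and the page names are hypothetical, and it is not a
description of how Google or ResearchIndex actually work.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(links):
    """Count how often two targets are linked to from the same page.

    links maps each referring page to the targets it links to.
    Returns a Counter keyed by (target_a, target_b) with the pair
    in sorted order, so each unordered pair has a single key.
    """
    pairs = Counter()
    for targets in links.values():
        # set() ignores repeated links to one target on the same page.
        for a, b in combinations(sorted(set(targets)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical link graph: three referring pages and their outbound links.
links = {
    "page1": ["cool.html", "xml.html", "rubbish.html"],
    "page2": ["cool.html", "xml.html"],
    "page3": ["xml.html", "cool.html"],
}
counts = cocitation_counts(links)
```

Here cool.html and xml.html are co-cited by all three pages, while
rubbish.html shares a referrer with each of them only once. Nothing in
the computation looks at link text or page content.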

We also get something else, not for free, but at least tractably.
To pick up the theme of another thread, link topologies are
essentially global, whereas link metadata is typically local. Where
the former is the result of the activities of numerous, mutually
oblivious authors with overlapping areas of knowledge, interest 
and expertise, the latter is typically the product of individuals 
or small groups with particular, partial, interests. Making 
metadata global in any useful way requires massive coordinated 
intellectual and political effort (Simon's already raised some 
doubts about whether or not we should consider that an 
unqualified good). Global link topology just needs a warehouse 
full of servers and a ludicrous amount of bandwidth.

The assumptions behind this approach seem pretty plausible a 
priori, and both Google and my long-time favourite domain-
specific search engine, ResearchIndex (aka CiteSeer)[1] seem to 
back them up. Then again, I've always used bibliographies as my 
primary research tool, so maybe I'm biased.



[1] No need for a URI, you can find it via Google ;-)

Miles Sabin                               InterX
Internet Systems Architect                5/6 Glenthorne Mews
+44 (0)20 8817 4030                       London, W6 0LJ, England
msabin@interx.com                         http://www.interx.com/

