From: Miles Sabin <MSabin@interx.com>
To: xml-dev@lists.xml.org
Date: Thu, 21 Dec 2000 16:14:22 +0000
Francis Norton wrote,
> The problem with meta tags is that they are categorised self-
> citations. If we agree that Google is good now, could it not be
> even better if sites categorised their external citations?
Not necessarily, no.
Sure, annotating links seems like it might solve the problem of
distinguishing between,
<a href="http://foo.example.com/rubbish.html">This is junk</a>
and,
<a href="http://bar.example.com/cool.html">Worth a look</a>
And maybe it does, up to a point, but as Mike pointed out, we
still have the problem of authority and trust to deal with. If I
link to w3.org but categorize the link as a link to "fisheries,
haddock", what's a poor search engine to do? Sure, we can try and
aggregate over many link categorizations and hope that nonsense,
mistakes and damned lies will get drowned out. But how do you
aggregate categorizations exactly?
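The naive answer would be a straight majority vote over whatever
category strings authors happen to use. A sketch in Python (the
data and the voting rule are invented purely for illustration):

from collections import Counter

# One author's category label per inbound link to the same
# target page; the junk entry stands in for nonsense, mistakes
# and damned lies.
categorisations = [
    "web standards", "web standards", "xml",
    "fisheries, haddock",
    "web standards",
]

# Majority vote: hope that honest annotators outnumber the rest.
votes = Counter(categorisations)
category, count = votes.most_common(1)[0]
print(category, f"({count} of {len(categorisations)} annotators)")

Even this toy glosses over the hard parts: synonymous category
strings, vocabularies that don't line up, and annotators who
can't be told apart from spammers.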
Ignoring metadata and working with the raw link topology is
driven by the assumptions that referrers (i.e. human authors) will
more often than not make relevant links; that if several things
are all linked to from the same place they're quite likely to be
related in some way; and that links will cluster naturally around
relatively distinct topics of interest rather than degenerating
into mush. Notice that we can say all of this without once having
to worry about what any of the stuff actually _means_.
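As a sketch of what ranking from raw topology means in practice,
here's a toy PageRank-style iteration in Python (my own
illustration of the general technique, not a description of what
Google actually runs):

# Authority emerges purely from who links to whom; no page ever
# declares what it is "about".
links = {
    "a.example.com": ["b.example.com", "c.example.com"],
    "b.example.com": ["c.example.com"],
    "c.example.com": ["a.example.com"],
    "d.example.com": ["c.example.com"],
}

pages = set(links) | {t for ts in links.values() for t in ts}
score = {p: 1.0 / len(pages) for p in pages}
damping = 0.85

for _ in range(50):
    # Each page shares its current score equally among the pages
    # it links to; the damping term models a bored surfer jumping
    # to a random page.
    new = {p: (1.0 - damping) / len(pages) for p in pages}
    for page, targets in links.items():
        for target in targets:
            new[target] += damping * score[page] / len(targets)
    score = new

for page, s in sorted(score.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {s:.3f}")

Here c.example.com comes out on top simply because everything
points at it; the assumption that referrers mostly make relevant
links is doing all the work.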
We also get something else, not for free, but at least tractably.
To pick up the theme of another thread, link topologies are
essentially global, whereas link metadata is typically local. Where
the former is the result of the activities of numerous, mutually
oblivious authors with overlapping areas of knowledge, interest
and expertise, the latter is typically the product of individuals
or small groups with particular, partial interests. Making
metadata global in any useful way requires massive coordinated
intellectual and political effort (Simon's already raised some
doubts about whether or not we should consider that an
unqualified good). Global link topology just needs a warehouse
full of servers and a ludicrous amount of bandwidth.
The assumptions behind this approach seem pretty plausible a
priori, and both Google and my long-time favourite domain-
specific search engine, ResearchIndex (aka CiteSeer)[1], seem to
back them up. Then again, I've always used bibliographies as my
primary research tool, so maybe I'm biased.
Cheers,
Miles
[1] No need for a URI, you can find it via Google ;-)
--
Miles Sabin                       InterX
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@interx.com                 http://www.interx.com/