Umm... the social pressure to provide reasonably
good metadata is as strong as the incentive to provide
good content. Doctorow mentions the problem of the
"Plam Pilot": failing to provide accurate data drives
down bids on the item, so bad data ends up providing
a good price for the buyer even if it is a bad deal
for the seller.
Bad ontologies and bad data work the same way.
Anyone trying to get the bots to shop their
services will be doing their best to be accurate
where accuracy is useful, and shady where the
bots can be fooled. No change.
From: Thomas B. Passin [mailto:email@example.com]
> If you had been following the thread, you would have seen the issue
> already addressed. The assumption that "semantic web" means "metadata
> produced by a publisher about his own pages" is invalid, and is the
> straw man upon which much of the "metacrap" arguments lie.
That's right. It is very likely that of all the things that will be
done to improve search and retrieval, use of self-meta data will be
among the least. Not that it is useless, but
1) it *may* be lies (or poorly chosen or in error, doesn't have to be
2) Most material on the web won't have such meta data anyway, not for a
long time if ever.
One way in which self-meta data could become more useful would be by
what I call "social analysis". PageRank is one form of social analysis,
and there are many other possibilities. I think it is very possible
that eventually there could be enough information out there that a given
author's or site's claims (internal meta data) could be assessed by
analysis of the "cloud" out there. Of course, we still need the
algorithms and processors, but give it time.
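
To make the PageRank remark concrete, here is a minimal sketch of the idea, assuming a tiny hand-built link graph. The function name, the damping value of 0.85, and the iteration count are illustrative choices, not anything Google has published as its production method. It only shows how standing can be derived from what other pages say (link to) rather than from what a page claims about itself.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rough power-iteration PageRank.

    links: dict mapping each page to the list of pages it links to
           (hypothetical input format chosen for this sketch).
    Returns a dict mapping each page to a score; scores sum to about 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with equal rank everywhere.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Pages "a" and "b" both point at "c"; "c" points back at "a".
    graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 3))
```

In this toy graph nobody links to "b", so it ends up with only the baseline share, whatever "b" might say about itself in its own meta data.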
> Furthermore, it's not an issue of Google vs. "semantic web". The two
> are completely orthogonal. Google/MSN could index triples if they
> chose. Google/MSN could expose their derived metadata (page rank, etc.)
> using an open triples format if they chose. In fact, the two could be
> very complementary.
Right again. It would be especially good if we could eventually get an
agreed-on vocabulary and format for returning and annotating search
results. You have to include the "annotating" bit to get the most