On Jun 8, 2004, at 6:37 PM, Joshua Allen wrote:
>>
>> http://www.well.com/~doctorow/metacrap.htm
>
> The fact that someone wrote a flame piece without the slightest
> understanding of the actual issues does not surprise me. This happens
> often on the Internet. However, the fact that otherwise intelligent
> people cite the screed without being able to defend it, and without
> using their own brains to understand the actual issues, *does* surprise
> me.
>
OK, I'll defend it -- Tell me which of these you disagree with :-)
◦ 2.1 People lie
◦ 2.2 People are lazy
◦ 2.3 People are stupid
◦ 2.4 Mission: Impossible -- know thyself
◦ 2.5 Schemas aren't neutral
◦ 2.6 Metrics influence results
◦ 2.7 There's more than one way to describe something
But seriously, Doctorow concludes:
" Metadata can be quite useful, if taken with a sufficiently large
pinch of salt. The meta-utopia will never come into being, but metadata
is often a good means of making rough assumptions about the information
that floats through the Internet.
Certain kinds of implicit metadata is awfully useful, in fact. Google
exploits metadata about the structure of the World Wide Web: by
examining the number of links pointing at a page (and the number of
links pointing at each linker), Google can derive statistics about the
number of Web-authors who believe that that page is important enough to
link to, and hence make extremely reliable guesses about how reputable
the information on that page is.
This sort of observational metadata is far more reliable than the
stuff that human beings create for the purposes of having their
documents found."
What is there to disagree with here? I suppose you could say that
Doctorow underestimated the ability of Google-bombers and search engine
optimizer consultants to game Google, but that would tend to make one
more depressed about the prospects for reliable metadata, not less
depressed. Conversely, I guess that in TimBL's version of meta-utopia, as
opposed to the META tag straw man that Doctorow demolishes,
human-created metadata is to be considered reliable only if trusted
sources assert it's reliable. I personally find this the least
plausible part of the semantic web vision -- I won't even begin to
believe it until it has survived the onslaughts of the meta-spammers
and the semantic-bombers who will go after the semantic web the way
they've gone after data in meta tags and the links that Google
harvests.
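For concreteness, the link counting in the Doctorow quote (count the links pointing at a page, weighted by the links pointing at each linker) can be sketched as a toy iteration. This is only a simplified PageRank-style illustration, not Google's actual algorithm; the function name `link_rank`, the damping value of 0.85, and the tiny example web are all assumptions made up for the sketch.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Toy PageRank-style score for a dict {page: [pages it links to]}.

    Hypothetical helper for illustration only; not Google's real algorithm.
    """
    # Collect every page that appears as a linker or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Made-up four-page web: a, b and d all link to c; c links to d.
web = {"a": ["c"], "b": ["c"], "c": ["d"], "d": ["c"]}
scores = link_rank(web)

# A crude "Google bomb": twenty made-up spam pages that all link to a.
spammed = dict(web)
for i in range(20):
    spammed["spam%d" % i] = ["a"]
spammed_scores = link_rank(spammed)
```

In this made-up web, page c collects the most inbound weight and so scores highest, while adding the spam pages raises a's score without a's content changing at all, which is the gaming problem in miniature.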
Dare said something the other day about having second thoughts about
Doctorow's argument because RSS feeds are an existence proof that
useful metadata is practical. I'm not sure which of the straw men that
demolishes -- I'd agree that people are less likely to lie or act lazy
and stupid when they know that people (like the boss, or colleagues, or
potential employers) are watching. And anyway, RSS *is* mostly
observational metadata extracted from an article or post, or at least
generated from the same inputs used to generate the content it
syndicates.
My only quarrel with "meta-utopia will never come into being" is a
variation of this -- in organizations that value metadata and put in
business processes to record, manage, and exploit it, there will be all
sorts of pressures to make it clean, consistent, usable, etc. even if
the people entering it are lying, lazy, stupid, and not self-aware when
they are on their own time :-) It CAN come into being, but it is hard,
and even in well-managed organizations the best metadata will probably
be "observational" metadata from application data dictionaries,
existing database schemas, standard operating procedures, etc.
On the other hand, Doctorow's "screed" does call into question the
WinFS vision, or am I missing something here? To what extent does
WinFS not presuppose honest, energetic, intelligent, and self-aware
humans to create the metadata it will manage and query?