xml-dev - RE: [xml-dev] WSIO vs. Semantic Web


To: "Alaric Snell" <alaric@alaric-snell.com>,"Dare Obasanjo" <oludareo2@microsoft.com>,"Mark Seaborne" <MSeaborne@origoservices.com>,<xml-dev@lists.xml.org>
Subject: RE: [xml-dev] WSIO vs. Semantic Web
From: "Joshua Allen" <joshuaa@microsoft.com>
Date: Wed, 13 Feb 2002 11:10:00 -0800
Thread-index: AcG0s2r/JRvVW3aZRjiZUJTclYQVLwADWBnQ
Thread-topic: [xml-dev] WSIO vs. Semantic Web

> On Wednesday 13 February 2002 16:56, Dare Obasanjo wrote:
> > This semantic web that you describe reminds me of all the hype about
> > how XML would make search engines smarter because it allowed people to
> > add metadata to words in documents.
> 
> Yep!
>
> > However, no one explained how the author
> > of the document would know to tag all data in the document in a
> > manner that would satisfy all search engines. For instance, a search for
> > "Dare Obasanjo" could be looking for me in many contexts, it could be

Yeah, I think it goes even beyond this.  Like nobody bothered to figure
out *why* the author would spend all of that time categorizing
everything in various taxonomies.  And even more importantly, how
someone would add metadata about some web page that they don't own --
because to be truly in the spirit of the web, anyone should be able to
publish metadata that anyone else can access.  And then, how about all
of the purveyors of badness who stuff their pages with deceptive meta
tags to fool the search engines?  I think meta tags in web pages were
not a very practical start for a semantic web.

This is where Google has a head start on most everyone.  The interesting
metadata is *not* the metadata that the resource author publishes about
his own resource; it is the metadata that is published independently of
the resource.  The number of inbound links is one piece of metadata about
a page.  Google randomly assigns search results to be "click-through" so
that Google can sample which pages are most likely to be clicked by
users -- this too is independent metadata about a page.  Now people
running the Google toolbar can have their machine automatically upload
statistics about which pages were visited.  This is metadata about those
pages (for example, 7 users found this URL interesting enough that they
browsed the page).

In the first case, Google is mining for metadata, and in the other two,
Google is using observation to capture metadata.  But it would not be a
terribly large step at this point to allow more direct user
participation in the metadata publishing.
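
To make the distinction concrete, here is a rough Python sketch of metadata
that is gathered independently of the page author: inbound link counts
(mined from a crawl) and visit counts (captured by observation).  The URLs,
the sample data, and the function names are all made up for illustration;
this is not a description of how Google actually computes anything.

```python
from collections import Counter, defaultdict

# Hypothetical crawl result: (source_page, target_page) pairs.
links = [
    ("a.example/home", "c.example/doc"),
    ("b.example/blog", "c.example/doc"),
    ("a.example/home", "b.example/blog"),
]

# Hypothetical observations: URLs reported as visited by client-side tools.
visits = ["c.example/doc", "c.example/doc", "b.example/blog"]


def inbound_link_counts(links):
    """Mined metadata: number of distinct pages linking to each target."""
    sources = defaultdict(set)
    for source, target in links:
        if source != target:  # a page linking to itself says nothing new
            sources[target].add(source)
    return {target: len(found) for target, found in sources.items()}


def visit_counts(visits):
    """Observed metadata: how often each URL was reported as visited."""
    return Counter(visits)


def page_metadata(links, visits):
    """Combine both signals into one record per page.

    None of this comes from the page's own author, so stuffing a page
    with deceptive meta tags does not change it.
    """
    inbound = inbound_link_counts(links)
    visited = visit_counts(visits)
    pages = set(inbound) | set(visited)
    return {
        page: {
            "inbound_links": inbound.get(page, 0),
            "visits": visited.get(page, 0),
        }
        for page in pages
    }


metadata = page_metadata(links, visits)
```

A more direct form of user participation could be added as one more
signal alongside these two, for example a count of explicit user ratings
per URL, without changing the overall shape of the record.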
