xml-dev - RE: [xml-dev] Semantic Web permathread, iteration n+1 (was Re:[xml-dev]


To: "Joshua Allen" <joshuaa@microsoft.com>
Subject: RE: [xml-dev] Semantic Web permathread, iteration n+1 (was Re:[xml-dev] InfoWorld agrees with Elliote Rusty Harold)
From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Thu, 3 Jun 2004 14:26:22 -0400
Cc: <xml-dev@lists.xml.org>
In-reply-to: <0E36FD96D96FCA4AA8E8F2D199320E5202009DAF@RED-MSG-43.redmond.corp.microsoft.com>
References: <0E36FD96D96FCA4AA8E8F2D199320E5202009DAF@RED-MSG-43.redmond.corp.microsoft.com>

At 11:04 AM -0700 6/3/04, Joshua Allen wrote:

>By the same token, if I want to get a new field of metadata published in
>Google's index, it's even more difficult.  I don't even know who to pay.
>The engine indexes things like pagerank, related pages, keywords.  But
>if I want any other information in there, I have to get a job as a
>janitor at Google's offices and hack the source code.  Now, if I build
>my own engine for storing additional metadata like annotations, ratings,
>etc., I have to build adapters (and probably violate some EULA) to get
>that to integrate with Google or MSN Search.  These systems are all
>walled gardens.

The competition is not between two different metadata systems, one of 
which is open and one of which is closed. Google does not provide 
metadata. It provides answers to questions. The semantic web attempts 
to provide answers to questions by having authors annotate their 
content with metadata. Google skips the metadata entirely. This is 
another reason I think Google is fundamentally superior to the vision 
of the semantic web.

Google is based on what web sites present to people: the actual text 
content of the pages and the text content of links to those pages. 
This is the same *data* people see and use. It is *not* metadata. The 
semantic web assumes two separate layers of data and metadata, and 
there is no guarantee or even likelihood that they will be in sync. 
Sites can and will lie with their metadata. Google, by contrast, is 
relatively resistant to lying. Since it sees only what actual users 
see, its index is relatively accurate.

(Yes, there are a few sites that try to lie to Google by serving it 
different pages for the same URLs, but that's much harder than lying 
with keywords, and those few sites can be easily banned from Google.)
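
To make the data/metadata distinction above concrete, here is a minimal
sketch of an indexer that looks only at what a reader would see: the
rendered text of a page and the text of its links. It is an illustration
under stated assumptions, not a description of how Google actually works.
The class name, the index_page helper, the sample page, and the
example.org URL are all hypothetical.

```python
import re
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect reader-visible words and link text; never index <meta> keywords."""

    SKIPPED = {"script", "style"}  # content a reader never sees as text

    def __init__(self):
        super().__init__()
        self.visible_words = []   # words from rendered text (what users see)
        self.link_text = {}       # href -> words used in links to that href
        self.meta_keywords = []   # recorded only to show they are NOT indexed
        self._skip_depth = 0
        self._current_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            self._current_href = attrs["href"]
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.meta_keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "a":
            self._current_href = None

    def handle_data(self, data):
        if self._skip_depth:
            return
        words = re.findall(r"\w+", data.lower())
        self.visible_words.extend(words)
        if self._current_href is not None:
            self.link_text.setdefault(self._current_href, []).extend(words)


def index_page(html):
    """Parse one page and return the extractor holding what was collected."""
    extractor = VisibleTextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor


# Hypothetical page whose metadata does not match its visible content.
PAGE = """
<html><head>
<meta name="keywords" content="free, prizes, celebrity">
<title>XML notes</title>
</head><body>
<p>Notes on XML parsing.</p>
<a href="http://example.org/xml">XML news and resources</a>
</body></html>
"""

result = index_page(PAGE)
print(result.visible_words)
print(result.link_text)
print(result.meta_keywords)
```

In this sketch the misleading keywords are parsed but never reach
visible_words, so a query for "prizes" would not match the page, while
the link text is credited to the URL it points at.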

-- 

   Elliotte Rusty Harold
   elharo@metalab.unc.edu
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
