At 11:04 AM -0700 6/3/04, Joshua Allen wrote:
>By the same token, if I want to get a new field of metadata published in
>Google's index, it's even more difficult. I don't even know who to pay.
>The engine indexes things like pagerank, related pages, keywords. But
>if I want any other information in there, I have to get a job as a
>janitor at Google's offices and hack the source code. Now, if I build
>my own engine for storing additional metadata like annotations, ratings,
>etc., I have to build adapters (and probably violate some EULA) to get
>that to integrate with Google or MSN Search. These systems are all
The competition is not between two different metadata systems, one of
which is open and one of which is closed. Google does not provide
metadata. It provides answers to questions. The semantic web attempts
to provide answers to questions by having authors annotate their
content with metadata. Google skips the metadata entirely. This is
another reason I think Google is fundamentally superior to the vision
of the semantic web.
Google is based on what web sites present to people: the actual text
content of the pages and the text content of links to those pages.
This is the same *data* people see and use. It is *not* metadata. The
semantic web assumes two separate layers of data and metadata, and
there is no guarantee or even likelihood that they will be in sync.
Sites can and will lie with their metadata. Google, by contrast, is
relatively resistant to lying. Since it sees only what actual users
see, its index is relatively accurate.
(Yes, there are a few sites that try to lie to Google by serving it
different pages for the same URLs, but that's much harder than lying
with keywords, and those few sites can be easily banned from Google.)
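To make the data/metadata distinction concrete, here is a rough sketch in Python of an index built only from what a reader sees: the page text, plus the text of links pointing at a page. It is not a description of how Google is actually implemented. The names VisibleTextParser, build_index, and SAMPLE_PAGES, and the example.org URLs, are all invented for illustration.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects the words a reader sees, plus (href, link words) pairs.

    <meta> tags are never inspected, so declared keywords have no effect.
    """

    def __init__(self):
        super().__init__()
        self.words = []     # words from the page's visible text
        self.links = []     # (target URL, list of link-text words)
        self._href = None   # href of the <a> currently open, if any
        self._anchor = []   # words collected inside that <a>
        self._skip = 0      # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor = []

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "a" and self._href:
            self.links.append((self._href, self._anchor))
            self._href = None

    def handle_data(self, data):
        if self._skip:
            return
        tokens = re.findall(r"\w+", data.lower())
        self.words.extend(tokens)
        if self._href:
            self._anchor.extend(tokens)


def build_index(pages):
    """pages maps URL -> HTML source. Returns word -> set of URLs."""
    index = defaultdict(set)
    for url, html in pages.items():
        parser = VisibleTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index[word].add(url)
        # Link text is also credited to the page being linked to.
        for target, anchor_words in parser.links:
            for word in anchor_words:
                index[word].add(target)
    return index


# Hypothetical pages: the first one declares keywords that never
# appear in anything a reader would see.
SAMPLE_PAGES = {
    "http://a.example.org/": (
        '<html><head><meta name="keywords" content="free prizes">'
        "<title>XML notes</title></head><body>Notes on XML parsing. "
        '<a href="http://b.example.org/">SAX tutorial</a></body></html>'
    ),
    "http://b.example.org/": "<html><body>Event based parsing.</body></html>",
}

index = build_index(SAMPLE_PAGES)
```

In this sketch the words "free" and "prizes" never reach the index because they exist only in the meta element, while "sax" is indexed for both pages: once as visible text on the first page, and once as link text credited to the second.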
Elliotte Rusty Harold
Effective XML (Addison-Wesley, 2003)