   RE: [xml-dev] When Searching With Google

Google's current differentiation comes not from an ability to discern
meaning or from a user interface that is better than that of the
other search engines. Instead, it comes from the algorithms that figure
out the 'popularity' of a page based on how many other pages (and what
kind of pages) link to it.

By doing this, Google effectively incorporates the opinions of a large
set of people. The most popular pages percolate to the top of the result
list. Because the link one is looking for is right at the top of the
list (as opposed to being buried on page 17), Google's search results
are perceived as more relevant.
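
To make the link-popularity idea concrete, here is a minimal sketch of a
link-analysis iteration in the style of the published PageRank idea. It is
an illustration only, not Google's actual algorithm: the function names,
the damping value of 0.85, and the toy link graph are all assumptions.
Each page passes a share of its own score to the pages it links to, so a
link from a well-linked page counts for more than a link from an obscure
one, which is the "what kind of pages" part.

```python
def link_popularity(links, damping=0.85, iterations=50):
    """Score pages by incoming links (illustrative sketch, not Google's code).

    links maps each page to the list of pages it links to.
    damping and iterations are assumed values chosen for the example.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {p: 1.0 / n for p in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * score[page] / n
                for p in pages:
                    new[p] += share
        score = new
    return score


def rank(links):
    """Return pages ordered from most to least popular."""
    scores = link_popularity(links)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy web: three pages link to "c", and "c" links to "a".
toy_links = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
top_page = rank(toy_links)[0]
```

In this toy graph the page with the most incoming links ends up first,
and "a" outranks "b" and "d" only because the popular page links to it.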

I say "current" because they also experiment with other approaches, for
example using taxonomies such as the Open Directory Project index.
In fact, with their recent acquisition of Applied Semantics, they seem
to be very much into knowledge representation and Semantic Web
approaches to search. One piece of evidence: search on Google for
"Semantic Web" and notice that one of the ads served on the right is
their own "Work at Google" advertisement.

Getting back to the original question, I think subscription search
engines that contract for the quality of their results would be more
viable within specific, specialized domains than in the general
search areas.

Regards,

Irene Polikoff
Executive Partner
TopQuadrant

Main office: 724-846-9300x212
Direct line:  914-777-0888
Cell:           914-329-8576
www.topquadrant.com

-----Original Message-----
From: Bullard, Claude L (Len) [mailto:clbullar@ingr.com] 
Sent: Monday, December 08, 2003 10:30 AM
To: 'michael.h.kay@ntlworld.com'; xml-dev@lists.xml.org
Subject: RE: [xml-dev] When Searching With Google


Right.  And that is why I am asking.  Should the GUI 
give clues to the filtering?  If yes, it gets harder 
to use.  If no, its reliability vis a vis a common 
mental model is lowered.

One should be sure what those Google numbers are 
saying.  One should know about the phrase trade. 
One should understand blogging keiretsu.  One should 
be able to set a search based on the credentials 
of the sources.  One should be able to pick the 
types of credentials, not let the bot do that.

Amplified acceptance of unverified assumptions 
is the very essence of robot wisdom.  I am 
wondering about the viability of subscription 
search engines that contract for the quality 
of their results.  

len


From: Michael Kay [mailto:michael.h.kay@ntlworld.com]

Most modern search engines give greater weight to a term the more
infrequent it is in the corpus. Most also weight terms according to
where and how often they appear in the source document, and some also
recognize when adjacent words in the query constitute a noun phrase.
What Google does is anyone's guess.
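
As a rough illustration of the first two points, the sketch below weights
each query term by how rare it is in the corpus (an inverse document
frequency) and by how often it appears in a document. This is a generic
textbook-style TF-IDF scheme, not any particular engine's formula; the
function names and the tiny corpus are assumptions, and it ignores term
position and noun-phrase detection entirely.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Give rarer terms a larger weight: log(N / number of docs containing the term)."""
    n = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n / count) for term, count in doc_freq.items()}


def score(query, docs):
    """Score each document as the sum of term frequency times term rarity."""
    idf = idf_weights(docs)
    results = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        results[doc_id] = sum(tf[term] * idf.get(term, 0.0) for term in query)
    return results


# Hypothetical corpus of pre-tokenized documents.
corpus = {
    "d1": ["xml", "schema", "xml"],
    "d2": ["xml", "search"],
    "d3": ["xml", "google", "search"],
}
weights = idf_weights(corpus)
scores = score(["xml", "google"], corpus)
best = max(scores, key=scores.get)
```

Here "xml" appears in every document, so it contributes nothing to the
score, while the rarer "google" decides which document comes out on top.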

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>





 

