Re: [xml-dev] Can Searchbots Find Web Pages That Aren't Linked To?


From: "Manfred Staudinger" <manfred.staudinger@gmail.com>
To: "Pete Cordell" <petexmldev@codalogic.com>
Date: Sun, 9 Mar 2008 15:37:37 +0100

>  From: "Costello, Roger L." <costello@mitre.org>
>
>  I was interested in knowing if searchbots can find web pages that
>  aren't linked to.
>
>  So, I conducted a simple experiment:

You asked a simple question and got a simple answer, which is correct
but doesn't really help. The discussion about sitemaps reveals a more
complex picture:
http://code.google.com/support/bin/answer.py?hl=en-uk&answer=40318

a) Rather than asking "if searchbots can find" a web page, I would ask
whether a searchbot has actually found and crawled the page. This can
be answered by looking at the server log files. Even if a searchbot
_has_ crawled a web page, it may include the page in its index (sooner
or later) _or_ it may not.
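
As a rough illustration of the log check in (a), here is a minimal
Python sketch. It assumes the server writes the common "combined" access
log format; the function name crawler_hits, the path /hidden.xhtml and
the list of user-agent tokens are only illustrative. User-agent strings
can be faked, so a match is a hint that a crawler fetched the page, not
proof.

```python
import re

# Assumed layout (combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative user-agent tokens; adjust to the crawlers you care about.
BOT_TOKENS = ("Googlebot", "msnbot", "Slurp")


def crawler_hits(lines, path):
    """Return (time, bot token, status) for each crawler request to `path`."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("path") != path:
            continue  # unparsable line, or a request for a different page
        agent = match.group("agent").lower()
        for token in BOT_TOKENS:
            if token.lower() in agent:
                hits.append((match.group("time"), token, match.group("status")))
                break
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python crawler_hits.py access.log /hidden.xhtml
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for hit in crawler_hits(log, sys.argv[2]):
            print(*hit)
```

An empty result only says that none of the listed crawlers requested
the page within that log file. A hit says the page was fetched, which
is still not the same as the page being in the index.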

b) You are offering two pages with duplicate content, one XHTML and one
XML. The best choice a search engine can make here is to index both but
show only the XHTML page to the user. An option to show the second page
as well would be perfect ("repeat the search with the omitted results
included.")

Hope this helps,

Manfred

References:
- Can Searchbots Find Web Pages That Aren't Linked To?
  - From: "Costello, Roger L." <costello@mitre.org>
- Re: [xml-dev] Can Searchbots Find Web Pages That Aren't Linked To?
  - From: "Pete Cordell" <petexmldev@codalogic.com>
