Analysing Googlebot Visits
Hi,
Continuing my quest to find out why Googlebot is not spidering or indexing some of the pages on our site, in spite of these pages being properly linked, I decided to study what Googlebot is actually doing on our site.
I built a small program that reads the log files and collects information on Googlebot visits. For each visit it reports:
1) Date and Time of visit
2) IP Address of Googlebot
3) Requested URL
4) Query string parameters
5) Referrer
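For anyone who wants to try the same thing, here is a minimal sketch of how such a log reader could pull out those five fields. It is not the program I actually used. It assumes the server writes the Apache/NCSA "combined" log format (other servers, such as IIS, need a different pattern), and the names `LOG_LINE` and `parse_googlebot_visit` are made up for this illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache/NCSA "combined" log format, for example:
# 1.2.3.4 - - [10/Oct/2005:13:55:36 +0000] "GET /page?x=1 HTTP/1.1" 200 2326 "-" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_googlebot_visit(line):
    """Return the five fields for a Googlebot hit, or None for any other line."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    agent = match.group("agent").lower()
    # Keep the search crawler only; skip the ads crawler (Mediapartners).
    if "googlebot" not in agent or "mediapartners" in agent:
        return None
    parts = urlsplit(match.group("target"))
    return {
        "time": match.group("time"),
        "ip": match.group("ip"),
        "url": parts.path,
        # keep_blank_values=True so that parameters sent empty are still listed
        "query": parse_qs(parts.query, keep_blank_values=True),
        "referrer": match.group("referrer"),
    }
```

Keeping blank values matters here, because empty parameters are exactly what turned out to be interesting in the logs.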
After analysing the last couple of months of log files with this program, here are my findings:
1) Googlebot tends to fetch the same pages every day.
2) Googlebot tries to submit forms with blank parameters.
3) Because of the blank parameters, it gets an error message instead of the actual contents of the page.
4) The error message then shows up in the query strings of its requests.
5) Googlebot does not send a referrer, making it impossible to know which link led it to spider a given page.
6) We get about 120 pages spidered by Googlebot every day, whereas mediapartners-googlebot (which serves ads on our site) spiders about 2000 pages every day. MSNBot spiders about 1600 pages every day.
Are these normal findings, or have I stumbled onto something unusual?
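The per-crawler daily figures in point 6 can be reproduced with something like the sketch below. Again this is only an illustration under assumptions: it expects the "combined" log format, identifies crawlers by a substring of the user agent (the substrings in `BOT_PATTERNS` are my guess at what to match), and counts distinct URLs per day rather than raw hits. `pages_per_day` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumption: crawlers are recognised by a substring of the user agent.
# Order matters: the ads crawler is checked before the plain "googlebot" match.
BOT_PATTERNS = {
    "mediapartners": "mediapartners-google",
    "googlebot": "googlebot",
    "msnbot": "msnbot",
}

# Assumption: Apache/NCSA "combined" log format.
LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"\S+ (?P<target>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def pages_per_day(lines):
    """Count distinct URLs requested per (day, crawler) pair."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        for bot, needle in BOT_PATTERNS.items():
            if needle in agent:
                seen[(match.group("day"), bot)].add(match.group("target"))
                break
    return {key: len(urls) for key, urls in seen.items()}
```

Typical use would be `pages_per_day(open("access.log"))`, where `access.log` stands in for whatever the log file is called. Counting distinct URLs also makes point 1 visible: if the same pages are fetched every day, the daily count stays flat while the set of URLs barely changes.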
Thanks
Nilesh Deshpande