Google Discussion Forum: for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

  #1 (permalink)  
Old 06-06-2008, 06:58 AM
WebProWorld New Member
 
Join Date: May 2008
Location: Essex, UK
Posts: 11
wedshop_master RepRank 0
Default Mysterious entries in robots.txt reported by Webmaster Tools

Over the last year I have enjoyed watching all of my 'web crawl errors' disappear from the Webmaster Tools 'overview screen'.

A few days ago I logged in to find that I had gone from having 1 page not found, and 1 page restricted by robots.txt,
to 4 pages not found and 90 pages restricted by robots.txt, and bizarrely none of the entries have ever been in my robots.txt file,
and none of the 'pages not found' are actually pages that have ever existed!

It doesn't seem as though any of this is damaging, but it did make me think that the site had been compromised somehow.
All of the pages restricted in robots.txt are mobile phone related (nothing to do with my bridal themed site of course).

Here is a screen grab of the first few entries:


  #2 (permalink)  
Old 06-06-2008, 10:26 AM
WebProWorld Veteran
 
Join Date: Jul 2004
Posts: 913
activeco RepRank 2
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

It seems you have implemented your custom 404 page incorrectly.
Check this page: How to Set Up a Custom 404 File Not Found Page on your Website (thesitewizard.com)
__________________
Impossible? You just underestimate the time.
  #3 (permalink)  
Old 06-06-2008, 10:38 AM
Moderator
WebProWorld Moderator
 
Join Date: Jun 2006
Location: United States
Posts: 2,648
wige RepRank 9
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

My guess is that you have been hacked. I have seen a few attacks very similar to what you are showing. Have your hosting company pull your FTP logs and check for any anomalous activity. Also, go through every single folder on your site and grab the .htaccess files. Look for mod_rewrite RewriteRule directives that have been added, as that is the most frequent method used in this type of attack: it allows the attacker to hide a single file somewhere you are not likely to check and then create internal redirects to that file.
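As a rough illustration of the .htaccess check described above, here is a minimal Python sketch that walks a directory tree and lists rewrite and redirect directives for manual review. The function name `scan_htaccess` and the list of directives it looks for are assumptions made for this example, not an exhaustive catalogue of what an attacker might add.

```python
import os
import re

# Directives worth a manual look. This list is an illustrative assumption.
SUSPECT = re.compile(
    r"^\s*(RewriteRule|RewriteCond|Redirect|RedirectMatch|ErrorDocument)\b",
    re.IGNORECASE,
)


def scan_htaccess(root):
    """Return (path, line_number, text) for each matching directive under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".htaccess" not in filenames:
            continue
        path = os.path.join(dirpath, ".htaccess")
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if SUSPECT.match(line):
                    hits.append((path, number, line.strip()))
    return hits
```

Calling `scan_htaccess("/path/to/site")` (a placeholder path) returns every matching line with its file and line number. Anything you do not recognise as your own is worth a closer look, though a match on its own does not prove a compromise.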

Also, check the last-modified date of the robots.txt file, and search for all files on your server that were last modified within 6 hours of that time, as those files may have also been altered. You need to do this from a command line (over an SSH connection), not through FTP, which may change the date. If you don't have SSH access to the server, have your hosting company compile the listing.
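The time-window search can be sketched in Python as follows, assuming you can run a script on the server itself. The function name `modified_near` and its parameters are hypothetical names chosen for this example; the 6-hour default simply mirrors the suggestion above.

```python
import os


def modified_near(root, reference, window_hours=6):
    """List files under root whose mtime is within window_hours of reference's mtime."""
    ref_mtime = os.stat(reference).st_mtime
    window = window_hours * 3600
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # unreadable or vanished file; skip it
            if abs(mtime - ref_mtime) <= window:
                matches.append((path, mtime))
    # Oldest first, so the sequence of changes is easier to read.
    return sorted(matches, key=lambda item: item[1])
```

If the reference file sits under `root`, it appears in its own results. Modification times can be reset by anyone with write access, so an empty result is not proof that nothing else was touched.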

I did notice that these entries do not show up if I look at your robots.txt file. Have you uploaded a clean file, or are these entries not actually there? If the entries are not there, the attacker may be using a mod_rewrite rule to serve Google with a different robots.txt file. The only way to confirm this would be to sign in to your Webmaster Tools account and have Google analyze the robots.txt file.
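A quick first check, before going to Webmaster Tools, is to request robots.txt twice with different User-Agent headers and compare the responses. The sketch below is only an approximation: the user-agent strings, function names and any URL you pass in are assumptions for illustration. A rewrite rule could also key on the crawler's IP address rather than its user agent, so identical responses here do not rule out cloaking.

```python
import urllib.request

# Illustrative user-agent strings; adjust as needed.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0"


def fetch(url, user_agent, timeout=10):
    """Download url while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def differing_lines(first, second):
    """Return the non-blank lines that appear in only one of the two bodies."""
    lines_a = {line.strip() for line in first.splitlines() if line.strip()}
    lines_b = {line.strip() for line in second.splitlines() if line.strip()}
    return sorted(lines_a ^ lines_b)


def compare_robots(url):
    """Fetch url as a browser and as Googlebot, then report lines that differ."""
    return differing_lines(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA))
```

An empty list from `compare_robots` means both requests received the same rules. Any line in the result was served to only one of the two user agents.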
__________________
The best way to learn anything, is to question everything.
  #4 (permalink)  
Old 06-11-2008, 07:46 AM
WebProWorld New Member
 
Join Date: May 2008
Location: Essex, UK
Posts: 11
wedshop_master RepRank 0
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

Quote:
Originally Posted by activeco View Post
It seems you have implemented your custom 404 page incorrectly.
Check this page: How to Set Up a Custom 404 File Not Found Page on your Website (thesitewizard.com)
Thanks activeco.

Non-existent pages are returning 'HTTP Status Code: HTTP/1.1 404'.

So the 404 code appears to be doing the job.
  #5 (permalink)  
Old 06-11-2008, 08:09 AM
WebProWorld New Member
 
Join Date: May 2008
Location: Essex, UK
Posts: 11
wedshop_master RepRank 0
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

Quote:
Originally Posted by wige View Post
My guess is that you have been hacked. I have seen a few attacks very similar to what you are showing. Have your hosting company pull your FTP logs and check for any anomalous activity. Also, go through every single folder on your site and grab the .htaccess files. Look for mod_rewrite RewriteRule directives that have been added, as that is the most frequent method used in this type of attack: it allows the attacker to hide a single file somewhere you are not likely to check and then create internal redirects to that file.

Also, check the last-modified date of the robots.txt file, and search for all files on your server that were last modified within 6 hours of that time, as those files may have also been altered. You need to do this from a command line (over an SSH connection), not through FTP, which may change the date. If you don't have SSH access to the server, have your hosting company compile the listing.

I did notice that these entries do not show up if I look at your robots.txt file. Have you uploaded a clean file, or are these entries not actually there? If the entries are not there, the attacker may be using a mod_rewrite rule to serve Google with a different robots.txt file. The only way to confirm this would be to sign in to your Webmaster Tools account and have Google analyze the robots.txt file.

Thanks wige, this has been very useful advice because although I didn't find anything suspicious,
I wasn't aware of the role the .htaccess file plays, and discovered it had a double entry for my 404 CGI script.

My robots.txt file is clean, and as far as I know always has been, but Google Webmaster Tools is now reporting 236 restricted URLs, again all related to mobile phone ringtones.
All of the mystery URLs are within a folder called 'ringtones' in my 'images' folder which is restricted in my robots.txt.
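To see why a rule on the 'images' folder also covers everything in a 'ringtones' subfolder, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the example.com URLs and the helper name `is_blocked` are hypothetical stand-ins, since the real file is not reproduced in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the given rules forbid agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Any path beneath /images/ is covered by the single Disallow line,
# whether or not the page actually exists on the server.
print(is_blocked(ROBOTS_TXT, "http://www.example.com/images/ringtones/page.html"))
print(is_blocked(ROBOTS_TXT, "http://www.example.com/index.html"))
```

Because Disallow rules match on the start of the path, a crawler that finds a link to any made-up URL under /images/ will report it as restricted without ever requesting it.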

Last edited by wedshop_master; 06-11-2008 at 08:13 AM.
  #6 (permalink)  
Old 06-11-2008, 08:14 AM
WebProWorld Veteran
 
Join Date: Jul 2004
Posts: 913
activeco RepRank 2
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

Quote:
Originally Posted by wedshop_master View Post
Thanks activeco.

Non-existent pages are returning 'HTTP Status Code: HTTP/1.1 404'.

So the 404 code appears to be doing the job.
You were right; my thinking was wrong.
However, there are many links to the illegal pages: "theweddingshop.co.uk/images/ringtones/" - Google Search
My thought was that the wrong 404 response was being returned for them, but the robots.txt restriction didn't make much sense in that scenario.

Some sort of hacking is possible as no page returned was cached: site:theweddingshop.co.uk/images/ringtones/ - Google Search, but again, it could be due to the robots.txt exclusion (?).

P.S. I just found out that you indeed have /images/ excluded in robots.txt, so Google was technically right. The chance is good that competition is trying to hurt your site by bad linking.
__________________
Impossible? You just underestimate the time.

Last edited by activeco; 06-11-2008 at 09:13 AM.
  #7 (permalink)  
Old 06-11-2008, 10:11 AM
WebProWorld Pro
 
Join Date: Dec 2007
Location: Brussels, Belgium
Posts: 164
Jean-Luc RepRank 2
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

As activeco wrote, your robots.txt disallows access to these URLs. Google found these URLs on other websites where they had been posted by a spammer.

It is likely that the spammer did that, because he thought he was able to control these pages on your website. It looks like the spammer is wrong and that his efforts here are pointless. Maybe he is using hacking tools that are buggy. Maybe he is stupid and he was not aware of your present robots.txt file.

You should double-check that each and every file on your server is clean. This situation is not going to hurt you in any way as long as the spammer has no control over your server.

Jean-Luc
__________________
Checking redirects made easy | | Professional AWStats Services

Last edited by Jean-Luc; 06-11-2008 at 10:13 AM.
  #8 (permalink)  
Old 06-11-2008, 12:09 PM
WebProWorld New Member
 
Join Date: May 2008
Location: Essex, UK
Posts: 11
wedshop_master RepRank 0
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

Quote:
Originally Posted by Jean-Luc View Post
As activeco wrote, your robots.txt disallows access to these URLs. Google found these URLs on other websites where they had been posted by a spammer.

It is likely that the spammer did that, because he thought he was able to control these pages on your website. It looks like the spammer is wrong and that his efforts here are pointless. Maybe he is using hacking tools that are buggy. Maybe he is stupid and he was not aware of your present robots.txt file.

You should double-check that each and every file on your server is clean. This situation is not going to hurt you in any way as long as the spammer has no control over your server.

Jean-Luc
Thanks Jean-Luc

There are loads of horrid spammy pages out there with dead links to my site now (I've used Webmaster Tools to remove the ringtone directory and any of its subdirectories and files).
I really hope (and would assume) that this won't damage my ranking because obviously anyone can easily do this?
  #9 (permalink)  
Old 06-11-2008, 01:09 PM
WebProWorld New Member
 
Join Date: May 2008
Location: Essex, UK
Posts: 11
wedshop_master RepRank 0
Default Re: Mysterious entries in robots.txt reported by Webmaster Tools

Quote:
Originally Posted by activeco View Post
The chance is good that competition is trying to hurt your site by bad linking.
Thanks activeco, I'm beginning to wonder if this is the case because we've just started to get decent top 5 rankings for our important searches. Maybe someone has hired a 'web hit man'?