Google is finding Old Pages in my GWT



watto
11-22-2010, 10:28 PM
I have noticed over the past few days Google is showing in my GWT quite a lot of my old pages which haven't even existed in over a year.

http://www .business-trader.com.au/australian_business_classifieds_1000_3.html

What is the best way for me to deal with this? I have a couple of ideas but would love some other opinions.

Thanks in advance!

Dinghus
11-23-2010, 12:33 PM
This is a tricky problem, and it seems to be happening more and more with Google. I just had a person buy a product (software) that I haven't sold in over a year and didn't even have around anymore. I had dumped it and purged it, and then I got an order for it. Amazingly, the link to pay through PayPal still worked. This caused a bit of consternation on the customer's side. I worked with him to find out where he had found the link to the page that no longer existed. It was on Google. It was cached. It should not have been cached, since I make sure product pages have nocache enabled.
Not sure how to get rid of old pages like this when Google hangs onto them for some reason. Of course, what cracked me up is that when it was live it never ranked on the 1st page, but the cached page was #3. Go figger.

ron angel
11-23-2010, 01:14 PM
>I have noticed over the past few days Google is showing in my GWT quite a lot of my old pages which haven't even existed in over a year.
>
>http://www .business-trader.com.au/australian_business_classifieds_1000_3.html
>
>What is the best way for me to deal with this? I have a couple of ideas but would love some other opinions.
>
>Thanks in advance!

I too have seen this happen. Although I do not sell anything, the page-not-found entries in my log file show people clicking on links to pages I removed over a year ago. They should concentrate on current stuff first before going to old, out-of-date cached files.

e-dvertising
11-23-2010, 03:21 PM
GWT is the abbreviation for "Google Webmaster Tools", a toolset located at https://www.google.com/webmasters/tools/ where - if the site is authenticated to your Google account - you can find a lot of stats and info on what Google is crawling on your site (or sites) and how.

My suggestion for getting rid of old content is the following:
1. Check your stats (server-side or Google Analytics) for "old and outdated" data and make sure that it isn't reachable online (sometimes a CMS - content management system - just removes the links to the section/products, but the content is still live in the old places and so reachable through external links, bookmarks, etc.).
2. Check the section "Crawl errors" (subsection of "Diagnostics") in your GWT for old and outdated data ... if it has been removed from your site, it should show "404 (Not found)" in the "Detail" info.
3. To force the removal of old content, go to "Site configuration" -> "Crawler access" -> "Remove URL" and make "Removal requests" for those URLs to get rid of cached data as quickly as possible.

So much for my suggestions,

greetings from Austria.

computergenius
11-23-2010, 03:33 PM
Could you not use the old pages by doing 301 forwards to somewhere similar but more useful?

williamc
11-23-2010, 03:38 PM
>Could you not use the old pages by doing 301 forwards to somewhere similar but more useful?

I would agree. If Google is going to make mistakes, there is no reason that you can not still benefit from those mistakes.

e-dvertising
11-23-2010, 03:40 PM
>Could you not use the old pages by doing 301 forwards to somewhere similar but more useful?

Yes, that's right. This is - or should be - "basic" work for a change of URLs, and it can be done with .htaccess commands, for example. But besides "good URLs never change" ... if you have products or content that is outdated, you might not have a 301 destination for it, and then you should get rid of these URLs.
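
For URLs with no sensible 301 destination, one possible way to "get rid of" them on an Apache server is to answer with "410 Gone" instead of a plain 404. Nobody in the thread used this; it is only a sketch, assuming mod_rewrite is enabled and the .htaccess file sits in the site root. The file name is made up for illustration.

```apache
RewriteEngine On

# Answer "410 Gone" for a retired URL that has no useful replacement.
# discontinued_product.html is a hypothetical name, not a real page.
# The "-" means "no substitution"; the G flag sets the 410 status.
RewriteRule ^discontinued_product\.html$ - [G]
```

As with any other specific rule, it would need to sit above any broader catch-all redirect in the same file.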

watto
11-23-2010, 04:11 PM
Thanks for the feedback guys. The pages in question were database-driven pages, and GWT says there are over 1084 of them. As you can see, these URLs are .html, and my site is now completely PHP.

Individually doing a 301 on these 404 pages would be very time consuming. These pages are not indexed by Google.

williamc
11-23-2010, 04:14 PM
>Individually doing a 301 on these 404 pages would be very time consuming. These pages are not indexed by Google.

Yeh, if they are not showing as indexed, I would not worry about it much, but just for the record, if the new site uses .php extensions exclusively, doing a 301 on all .html would be extremely easy and just a couple lines of rewrite code.
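
A rough sketch of what "a couple lines of rewrite code" could look like, assuming mod_rewrite is enabled, the .htaccess file sits in the document root, and every old page.html has a same-named page.php. That naming assumption is for illustration only and may not hold for this site.

```apache
RewriteEngine On

# Only redirect when the .php counterpart actually exists on disk,
# so unknown .html URLs still fall through to a normal 404.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)\.html$ /$1.php [R=301,L]
```

The RewriteCond is optional. Without it, every .html request is redirected to a .php name whether or not that file exists.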

watto
11-23-2010, 04:26 PM
The problem is I already have a bunch of other individual old html urls which I have 301 redirected to their appropriate php url. If I did a 301 for all html urls to redirect to php, wouldn't this override my existing 301's?

williamc
11-23-2010, 04:30 PM
Mod_Rewrite is a top down thing. As long as your old rewrites were above the new catchall, they would still function normally.
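
To illustrate the top-down ordering, here is a minimal sketch with made-up file names. It assumes each rule carries the L ("last") flag, so processing stops at the first rule that matches.

```apache
RewriteEngine On

# Specific redirects first (hypothetical file names, for illustration only)
RewriteRule ^old_state_page\.html$ /new_state_page.php [R=301,L]
RewriteRule ^old_category_page\.html$ /new_category_page.php [R=301,L]

# Catch-all last: only reached when none of the rules above matched.
# /some_general_page.php is a placeholder target.
RewriteRule \.html$ /some_general_page.php [R=301,L]
```

If the catch-all were moved to the top, it would match first and the two specific redirects would never be used.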

watto
11-23-2010, 04:44 PM
Ok great. Thanks for the tips. I'll definitely look further into this, but like you said, as long as they are not indexed, then I shouldn't sweat it.

computergenius
11-23-2010, 05:00 PM
>Individually doing a 301 on these 404 pages would be very time consuming.

Get your 404 page to send you an email each time a page is accessed, then you just need to add one line to your .htaccess (or even your 404 - "if the requested URL was xxx then 301 to yyy")

>wouldn't this over ride my existing 301's?

Additively, if you do it correctly. If your first 301 sends you to a 404, the second 301 will send you to a useful page. If you don't have a useful similar page, you might as well just 301 to your home page (although I don't know if that is a good idea from the search engine side)

Your existing 301s should continue to function.
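
A minimal sketch of that setup, assuming Apache with mod_rewrite. notfound.php is a hypothetical script that would do the emailing (not shown), and xxx/yyy stand in for real page names as in the post above.

```apache
# Send every missing URL to a custom handler; notfound.php is a
# hypothetical script that would email the requested URL to the owner.
ErrorDocument 404 /notfound.php

RewriteEngine On

# One line per reported URL: "if the requested URL was xxx then 301 to yyy"
RewriteRule ^xxx\.html$ /yyy.php [R=301,L]
```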

Clint1
11-24-2010, 10:41 AM
>I have noticed over the past few days Google is showing in my GWT quite a lot of my old pages which haven't even existed in over a year.
>
>http://www .business-trader.com.au/australian_business_classifieds_1000_3.html
>
>What is the best way for me to deal with this? I have a couple of ideas but would love some other opinions.
>
>Thanks in advance!

For me they will go back as long as TEN years+. Sometimes they are even URLs that never existed. For URLs that did once exist, the Gbot is probably finding them from old webpages on the net that once linked to you. I always do 301s from those types of pages to some of my valid webpages.

watto
11-24-2010, 04:30 PM
So now GWT is telling me I have over 2000 of these non-indexed, old urls. If I wanted to do a 301 on all of these old urls and point them to www .business-trader.com.au/australian_business_classifieds.php, how would I do this? Can someone show me the code I could try? Thanks in advance.

williamc
11-24-2010, 05:08 PM
watto: just to be sure. There are NO real .html files on the server at this time?

If not then this should work fine:

You will want to place this AFTER any existing rules, so it does not interfere with your existing rules redirecting .html extensions elsewhere.

RewriteRule ^(.*)\.html$ http://www.business-trader.com.au/australian_business_classifieds.php [R=301,L]

watto
11-24-2010, 05:12 PM
That is correct. Not one html file exists on the server.

williamc
11-24-2010, 05:15 PM
Edited above post to include the code. :)

watto
11-24-2010, 05:20 PM
Thanks buddy! I'll plug it in and see if it cleans it up. I can't thank you enough for your help.

One question: there are old html urls I did some off-page link building for (a long time ago), and these urls were 301 redirected to the NEW php url, i.e. state pages and category pages.

If I add your code to htaccess, will it override these old redirects?

williamc
11-24-2010, 05:22 PM
No worries, glad I could help. :)

watto
11-24-2010, 05:25 PM
If it helps, most of the old HTML urls in my GWT look like this
http://www .business-trader.com.au/australian_business_classifieds_1856_16.html

williamc
11-24-2010, 05:27 PM
>One question: there are old html urls I did some off-page link building for (a long time ago), and these urls were 301 redirected to the NEW php url, i.e. state pages and category pages.
>
>If I add your code to htaccess, will it override these old redirects?

Won't affect them at all as long as, like I said in post above, you put my redirect AFTER all others.

>If it helps, most of the old HTML urls in my GWT look like this
>http://www .business-trader.com.au/australian_business_classifieds_1856_16.html

The one I gave you will catch all .html urls, even ones that show as being in directories.
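
If catching every .html URL turns out to be too broad, a narrower variant could match only the numbered listing pages shown above. This is just a sketch: it assumes the old URLs all follow the australian_business_classifieds_NUMBER_NUMBER.html shape at the top level of the site, and that the .htaccess file sits in the document root.

```apache
RewriteEngine On

# Match only URLs like australian_business_classifieds_1856_16.html
# and send them to the main classifieds page with a permanent redirect.
RewriteRule ^australian_business_classifieds_[0-9]+_[0-9]+\.html$ http://www.business-trader.com.au/australian_business_classifieds.php [R=301,L]
```

Any other .html request would then be left alone and keep returning whatever it returns today.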

watto
11-24-2010, 05:34 PM
>Won't affect them at all as long as, like I said in post above, you put my redirect AFTER all others.

Yes you are correct. I added your code and tested old redirects. Perfect! I was also able to clean out my .htaccess file with this code.

Thanks again williamc! I'll update once I start to see things settle down in my GWT.