Quote:
Originally Posted by mjtaylor
What would you recommend on editing robots.txt? I have been told to disallow archives ... do you agree? What else?
There are several folders and pages that should be disallowed. I just created a robots.txt for our members here that can be implemented right away on WordPress without problems. It will help you get rid of most, if not all, of the pages found in Google's supplemental results, which cause a harmful leak of PR.
User-agent: Googlebot
Disallow: /*?*
Disallow: /*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.tar$
Disallow: /*.tgz$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /category/*/*
Disallow: /*/comments
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/*/feed/$
Disallow: /*/*/*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/*/feed/rss/$
Disallow: /*/*/*/feed/rss/$
Disallow: /*/trackback
Disallow: /*/trackback/$
Disallow: /*/*/trackback/$
Disallow: /*/*/*/trackback/$
Allow: /wp-content/uploads

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-register.php
Disallow: /wp-login.php
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /feed/
Disallow: /trackback/
Disallow: /rss/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /date/
Disallow: /comments/
Disallow: /search
Disallow: /stats/
Disallow: /contact/
Disallow: /about/
Disallow: /archives/
Disallow: /dh_
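
One thing worth checking in the file above is which group a given crawler actually reads. As far as I know, Google documents that a crawler follows only the most specific group matching its name and skips the rest. The Python sketch below illustrates that idea; `pick_group` is a made-up helper, and the prefix-matching rule in it is a simplifying assumption, not Google's real code.

```python
def pick_group(group_names, crawler):
    """Return the name of the group a crawler would follow.

    Assumed rule: the longest group name that the crawler's name starts
    with (case-insensitive) wins; '*' is used only when nothing else matches.
    """
    crawler = crawler.lower()
    matches = [g for g in group_names
               if g != "*" and crawler.startswith(g.lower())]
    if matches:
        return max(matches, key=len)
    return "*" if "*" in group_names else None


groups = ["Googlebot", "*"]  # the two User-agent lines in the file above
print(pick_group(groups, "Googlebot"))  # Googlebot
print(pick_group(groups, "bingbot"))    # *
```

If that reading is right, rules in the second group such as Disallow: /wp-admin would not apply to Googlebot unless they are repeated in the Googlebot group.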
There are probably more folders or files that might need to be disallowed, for example the archives you mentioned.
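
Before adding more rules, it may help to check which URLs a pattern really matches. Here is a rough Python sketch for that. The helper names (`rule_to_regex`, `longest_match`, `is_blocked`) are my own, and the logic assumes Google-style matching: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching rule wins, and Allow wins a tie. Other crawlers may behave differently.

```python
import re


def rule_to_regex(rule):
    """Turn a robots.txt path pattern into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def longest_match(path, rules):
    """Length of the longest rule matching path, or -1 if none match."""
    return max((len(r) for r in rules
                if r and rule_to_regex(r).match(path)), default=-1)


def is_blocked(path, disallow, allow=()):
    """True if the best Disallow match is longer than the best Allow match."""
    return longest_match(path, disallow) > longest_match(path, allow)


print(is_blocked("/?p=12", ["/*?"]))            # True
print(is_blocked("/feed/", ["/feed/$"]))        # True
print(is_blocked("/feed/atom/", ["/feed/$"]))   # False
print(is_blocked("/wp-content/uploads/a.jpg",
                 ["/wp-"], ["/wp-content/uploads"]))  # False
```

The third line shows why the feed rules above are repeated at several depths: with the trailing `$`, /feed/$ only matches the exact URL /feed/ and nothing below it.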
I hope this is useful for you and others here.