meta tag question



cjshu
12-04-2003, 02:59 PM
Is this the right way to use a robots meta tag for Google on a page with affiliate links?

<META NAME="robots" content="index, nofollow">

carbonize
12-05-2003, 02:50 AM
Looks right to me. You could always just use a robots.txt file to stop the robots from even visiting that page, though.

paulhiles
12-07-2003, 04:22 PM
Google's own robot is imaginatively named Googlebot. Googlebot will usually look for a robots.txt file when it visits your site, so take carbonize's suggestion and store your robots.txt as a separate file in the root of your server. If you are specifically targeting Google, then you need to state that in the User-agent line.
e.g.
User-agent: * (for all bots)
OR
User-agent: Googlebot
See here (http://www.robotstxt.org/wc/faq.html#robotstxt) for help on writing a robots.txt file

If you're on a shared server, or you don't have administrator rights, then the meta tag you started with is the best solution (replacing robots with Googlebot in the NAME attribute if you only want to target Google).

<META NAME="robots" content="index, nofollow">

Further information on Googlebot can be found here (http://www.google.com/bot.html).

Hope that helps

Paul
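For anyone who wants to double-check which directives a page's meta tags actually carry, here is a minimal Python sketch using only the standard library. The names RobotsMetaParser and robots_directives are made up for illustration, and it only reads the tags; it says nothing about how Googlebot or any other crawler will act on them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        # Maps the lower-cased meta name to the set of directives found for it.
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names, so <META NAME=...> works too.
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        if name not in ("robots", "googlebot"):
            return
        tokens = {
            token.strip().lower()
            for token in attr_map.get("content", "").split(",")
            if token.strip()
        }
        self.directives.setdefault(name, set()).update(tokens)


def robots_directives(html):
    """Return {meta name: set of directives} for the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# The tag from the original question.
page = '<html><head><META NAME="robots" content="index, nofollow"></head></html>'
print(robots_directives(page))
```

Swapping NAME="robots" for NAME="Googlebot" in the sample page should move the same directives under the "googlebot" key.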

successu
12-07-2003, 10:54 PM
Yes, naming the bots by name works - Thumbs up!

This one is to keep the page from being cached:

<META NAME="robots" content="index, noarchive">

You'd want to do this because it stops the search engine from serving up cached copies of your old pages with broken links.

Cheers!

lrobertson
12-09-2003, 11:13 AM
I have added the following text as my robots.txt file

User-agent: *
Disallow:


I was wondering if this will cause me any problems? Should I be specifying Google or will this text allow Google to spider the page along with the other robots?

paulhiles
12-09-2003, 12:02 PM
Hi lrobertson,

User-agent: *
Disallow:

This tells ALL robots (that refer to your robots.txt file) that there is nowhere on your server that is 'disallowed'. In other words, the robots may 'spider' or index all of your site's content.
Should you ONLY wish to target Google, you would use the following:

User-agent: Googlebot
Disallow:

Additionally, if, say, you have an images folder that you don't want the webcrawler robot to spider, you would use the following:

User-agent: webcrawler
Disallow: /images

Does that help any?

Paul
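The rule sets above can be sanity-checked before uploading with Python's standard urllib.robotparser module. This is only a sketch: parse_rules is a made-up helper, the paths are invented examples, and the results show how this one parser reads the rules, which may not match every crawler exactly.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# An empty Disallow line means nothing is off limits.
allow_all = parse_rules("User-agent: *\nDisallow:\n")

# Only the robot named webcrawler is kept out of /images.
block_images = parse_rules("User-agent: webcrawler\nDisallow: /images\n")

print(allow_all.can_fetch("Googlebot", "/any/page.html"))        # True
print(block_images.can_fetch("webcrawler", "/images/logo.gif"))  # False
print(block_images.can_fetch("Googlebot", "/images/logo.gif"))   # True
# Rules are matched as path prefixes, so /images also covers /images.html.
print(block_images.can_fetch("webcrawler", "/images.html"))      # False
```

If only the folder itself should be blocked, writing the rule as /images/ (with the trailing slash) avoids also matching a page such as /images.html.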

lrobertson
12-10-2003, 04:04 AM
Thanks for that feedback; it was exactly what I needed to know. I will adjust the file accordingly.