WebProWorld
Graphics & Design Discussion Forum
#1
03-08-2004, 10:01 PM
WebProWorld Member
Join Date: Dec 2003
Location: US
Posts: 25
candlese
Robots.txt

Hi, I know what robots.txt is, and I have found many different answers about how to stop it from showing up as an error. What is the best way to address it? And is it necessary to do so?
Thanks!

#2
03-09-2004, 11:22 AM
WebProWorld Veteran
Join Date: Jan 2004
Location: Des Moines, IA
Posts: 406
eightfifteen

The first question I have is, what error?

#3
03-09-2004, 01:20 PM
WebProWorld Member
Join Date: Dec 2003
Location: US
Posts: 25
candlese
Robots.txt

The error shows up in my error logs after a robot traverses the site. I know I can stop robots by adding a text file containing User-agent: * Disallow: / but I am wondering whether the error is something I should rectify or be concerned about.
I am also aware you can use meta tags to disallow a robot, but this is not foolproof.
With so many robots on the scene, it is difficult to list those you want and those you don't, unless you simply say no to all. What is the best way to deal with robots.txt? Thanks for any help.
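
A minimal sketch of the two options mentioned above (say no to everyone, or name the robots you want and refuse the rest), checked with Python's standard urllib.robotparser module. The robot names ExampleBot and OtherBot and the host www.example.com are placeholders invented for illustration, and real spiders are free to ignore robots.txt entirely.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Option 1: refuse every robot.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Option 2: admit one named robot (hypothetical name) and refuse the rest.
# An empty Disallow line means "nothing is off limits" for that robot.
ALLOW_ONE = """\
User-agent: ExampleBot
Disallow:

User-agent: *
Disallow: /
"""

PAGE = "http://www.example.com/index.html"  # placeholder URL

print(can_fetch(BLOCK_ALL, "ExampleBot", PAGE))
print(can_fetch(ALLOW_ONE, "ExampleBot", PAGE))
print(can_fetch(ALLOW_ONE, "OtherBot", PAGE))
```

Either way the file itself still has to exist, so a robot asking for it gets an answer instead of an error.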

#4
03-09-2004, 03:07 PM
WebProWorld Veteran
Join Date: Jan 2004
Location: Des Moines, IA
Posts: 406
eightfifteen

I think you might be confusing robots.txt and spiders. The robots.txt file tells spiders which parts of your site they may index and which parts to stay out of.

Here's what I have for my robots.txt:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/

What that says to the spider is: index everything, but bypass the cgi-bin and images directories.
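
One way to confirm how a well-behaved spider would read those rules is to feed them to Python's standard urllib.robotparser module. This is only a sketch: the robot name AnyBot, the host www.example.com, and the sample paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The same three lines as the robots.txt shown above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str) -> bool:
    """True if a robot honoring RULES may fetch `path` (placeholder host)."""
    return parser.can_fetch("AnyBot", "http://www.example.com" + path)


# Sample paths, invented for illustration.
for path in ("/", "/index.html", "/cgi-bin/form.pl", "/images/logo.gif"):
    print(path, "allowed" if allowed(path) else "blocked")
```

Rules match by path prefix, so everything under /cgi-bin/ and /images/ is off limits and the rest of the site stays open.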

#5
03-09-2004, 07:30 PM
WebProWorld Member
Join Date: Dec 2003
Location: US
Posts: 25
candlese
robots.txt

Hi, thanks. You're right, of course. What it should say is that after a spider traverses the site, I occasionally get an error in my log saying robots.txt is not available. I have used meta tags to disallow spiders, but that does not prevent the error. I also have a robots.txt file, and your example is the same as what I have, so thanks for that. I meant to say spider where it says robots, sorry. I guess I will just stay with what I have and figure it's not a problem. Thanks for your help!

#6
03-09-2004, 07:54 PM
WebProWorld 1,000+ Club
Join Date: Jul 2003
Location: UK
Posts: 2,089
paulhiles
Re: robots.txt

Quote:
Originally Posted by candlese
I also have a robots.txt file, and your example is the same as what I have. I meant to say spider where it says robots, sorry. I guess I will just stay with what I have and figure it's not a problem.
If you're using a robots.txt file, it should be placed in the root folder of your site. If a robot attempts to read the file at www.candlesetcetera.com/robots.txt, it will produce a 404 error because the file is not present.

Paul
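
A rough sketch of that check in Python, using only the standard library. The function names robots_url and robots_status and the sample URL are made up for illustration. The first function works out where robots look for the file (always the root of the host, whatever page they start from), and the second asks the server for it and reports the HTTP status, so it needs network access.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site serving `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_status(page_url: str) -> int:
    """Request the site's robots.txt and return the HTTP status code.

    200 means the file is where robots expect it; 404 means it is missing
    from the root folder. Requires network access.
    """
    try:
        with urlopen(robots_url(page_url), timeout=10) as response:
            return response.status
    except HTTPError as err:
        return err.code


# Placeholder page URL, not a real address from this thread.
print(robots_url("http://www.example.com/shop/candles.html?id=3"))
```

A robots.txt file sitting in a subfolder is never requested, so it cannot stop the 404 entries.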

All times are GMT -4.