Robots.txt... WHY



Clicken
08-19-2004, 03:03 PM
As some of you know, I can ask some really dumb questions, so if this is one of them you could just ignore me.

I was at Danny Sullivan's site (I forgot the name of it) learning about search engines when I came to the robots protocol section.

I noticed that there was no mention that once you list your private file names in this file, anyone can view your robots.txt simply by adding /robots.txt to the end of the site URL in the browser address bar.

Why would you want to display the names of the files you are wanting to keep private?

I emailed Danny for an answer, but I guess he is too busy, or this is a dumb question and should just be ignored?

*******************

By the way, if you have not seen my site lately, please check it out...

Thanks!

flood6
08-19-2004, 04:33 PM
The idea behind the robots exclusion is to keep compliant robots out of certain files or directories.

One use is excluding duplicate content that just wastes your bandwidth every time a spider crawls it. I have the entire ODP on my site. I exclude it in robots.txt so the spiders don't waste bandwidth crawling several thousand pages just to find that they are duplicate content.

I'm sure there are plenty of other reasons.
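
Here is a rough sketch of how a compliant robot would treat a rule like that, using Python's standard urllib.robotparser. The /odp/ directory, the ExampleBot name and the example.com URLs are made up for illustration; they are not my actual paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every robot out of a duplicate-content directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /odp/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant robot asks before it fetches each URL.
odp_allowed = parser.can_fetch("ExampleBot", "http://example.com/odp/Arts/")
home_allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
print(odp_allowed, home_allowed)
```

Nothing forces a robot to run this check. The file is a request, and only well-behaved spiders honor it.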

You're right, people wanting to check out what you're trying to hide will go right to your robots.txt.
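
To show how little effort that takes, here is a minimal sketch that pulls the Disallow paths out of a robots.txt body. The sample file and the /private-reports/ directory are hypothetical, and the parsing is deliberately simplified (it ignores which User-agent each rule belongs to).

```python
from typing import List


def disallowed_paths(robots_txt: str) -> List[str]:
    """Return every non-empty Disallow path found in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file: the second entry advertises exactly what it tries to hide.
SAMPLE = """\
User-agent: *
Disallow: /odp/
Disallow: /private-reports/   # meant to be secret
"""

print(disallowed_paths(SAMPLE))
```

Anything listed there is readable by whoever requests the file, so it should only name things you don't mind people knowing about.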

If you want to keep a person out, use .htaccess or some other means of password protection.