Search Engine Optimization Forum SEO is much easier with help from peers and experts! The WebProWorld SEO forum is for the discussion and exploration of various search engine optimization topics. Any non (engine) specific SEO or SEM topics should go here.
I've been reading up on the use of the Crawl-delay syntax in the robots.txt file. I have a potential client who is employing this method. Below is a snippet of their robots.txt file:
User-agent: Slurp
Crawl-delay: 20

User-agent: msnbot
Crawl-delay: 20

User-agent: YahooSeeker
Crawl-delay: 20

I'm wondering the following:
1 - Has anyone ever used this and had success?
2 - Has it had any impact on spiders indexing your site?
3 - Isn't a crawl-delay of 20 a little excessive?
4 - Do you think I should recommend that the client move to a new hosting provider who can handle the spider traffic?

Any other information on the use of crawl-delay would be greatly appreciated. Thanks!
__________________
Let BizWonk handle your Custom Web Design, Search Engine Optimization and Social Media Marketing Needs.
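For anyone who wants to check how a parser reads the snippet above, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It only shows how that one parser interprets the file. Whether a given search engine's bot honors the value is a separate question. The `crawl_delays` helper and the extra `Googlebot` lookup are illustrative assumptions, not part of the client's setup.

```python
from urllib.robotparser import RobotFileParser

# The client's rules from the post above, one directive per line.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 20

User-agent: msnbot
Crawl-delay: 20

User-agent: YahooSeeker
Crawl-delay: 20
"""


def crawl_delays(robots_txt, agents):
    """Return the Crawl-delay (in seconds) this parser reports for each agent.

    Agents with no matching entry map to None.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.crawl_delay(agent) for agent in agents}


# Googlebot is not listed in the file, so no delay applies to it.
delays = crawl_delays(ROBOTS_TXT, ["Slurp", "msnbot", "YahooSeeker", "Googlebot"])
for agent, delay in delays.items():
    print(agent, delay)
```

The three listed bots get a delay of 20, and any bot that is not named gets none, so this file leaves unlisted crawlers untouched.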
Never used it, but that does seem like a lifetime as far as a delay goes, and as far as the server being able to handle it.
If they really need that delay, I hope they don't have human visitors! ;) Just kidding. Are you sure that is needed?
Personally, I don't think they need that big of a delay, BUT then again we are not hosting the website. Not sure if the hosting company can put some type of bandwidth limitation on the client's site. After all, it is a shopping site. BUT if they did put limitations on the amount of traffic at any given time, wouldn't they lose customers?
__________________
Let BizWonk handle your Custom Web Design, Search Engine Optimization and Social Media Marketing Needs.
Hi PunkyLZ,
Before switching hosts, I would ask the client why they are doing this, as I would on seeing any robots.txt file like it. It goes against everything I have read on other forums. If it is working and we can get some idea why, then this info can be passed on to others. I just say this because every site wants the spiders to be there in the first place. As said above, is this a hosting problem? Ask the questions.
__________________
Keimos - Always learning something new each day www.keimos.co.uk , www.keimos.net , www.selfpacedit.co.uk
I do not see this as causing any problems. All you are doing is telling the visiting spider to wait 20 seconds between requests for the links it finds on your site. I can see this as a good thing for some of the more aggressive spiders. There must have been just cause for some of the search engines to support it in their bots; maybe on very busy sites it can make a difference in bandwidth usage during the spider's visit.
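To put a rough number on "wait 20 seconds between requests", the arithmetic can be sketched as below. It assumes a single bot that makes one request at a time and waits the full delay every time, which real crawlers may not do. The 50,000-page site size is a made-up figure for illustration only.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def max_fetches_per_day(crawl_delay_seconds):
    """Upper bound on requests per day for one bot that waits the full delay."""
    return SECONDS_PER_DAY // crawl_delay_seconds


def days_to_crawl(page_count, crawl_delay_seconds):
    """Whole days needed to request every page once at that upper bound."""
    return math.ceil(page_count / max_fetches_per_day(crawl_delay_seconds))


HYPOTHETICAL_PAGE_COUNT = 50_000  # assumption, not the client's real page count

for delay in (1, 5, 20):
    print(delay, max_fetches_per_day(delay), days_to_crawl(HYPOTHETICAL_PAGE_COUNT, delay))
```

Under these assumptions a 20-second delay caps one bot at 4,320 requests a day, so a large shopping catalogue could take well over a week to be requested once.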
Hi PunkyLZ
Yahoo says: You can add a "Crawl-delay: xx" instruction, where "xx" is the minimum delay in seconds between successive crawler accesses. If the crawler rate is a problem for your server, you can set the delay up to 60 or 300 or whatever value is comfortable for your server. Setting a crawl-delay of 20 seconds for Yahoo! Slurp would look something like:

User-agent: Slurp
Crawl-delay: 20

Since most of the major search engines are using this instruction, why not wildcard the user-agent and let the smaller search engines play catch-up? That will save you a lot of work. You can validate your robots.txt file here: http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
__________________
GSO http://www.GlobalSpecialOperations.com/
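The difference between a wildcard entry and a per-bot entry can be sketched with Python's standard-library `urllib.robotparser`. This only shows how that parser resolves the rules. The `delay_for` helper and the two sample files are assumptions for illustration, and a real crawler may resolve them differently or ignore Crawl-delay altogether.

```python
from urllib.robotparser import RobotFileParser

# Wildcard form: one entry that applies to every bot reading the file.
WILDCARD_RULES = """\
User-agent: *
Crawl-delay: 20
"""

# Per-bot form: only the named bot is asked to slow down.
PER_BOT_RULES = """\
User-agent: Slurp
Crawl-delay: 20
"""


def delay_for(robots_txt, agent):
    """Return the Crawl-delay this parser reports for one agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


for name, rules in (("wildcard", WILDCARD_RULES), ("per-bot", PER_BOT_RULES)):
    for agent in ("Slurp", "Googlebot"):
        print(name, agent, delay_for(rules, agent))
```

With the wildcard file, a compliant bot that was already behaving gets the same 20-second delay as the greedy one. The per-bot file slows only the bot it names.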
Some people on certain servers have found Yahoo's Slurp in particular to be quite greedy about bandwidth, and this can cause problems with other bots and human visitors -- for those sites, the crawl-delay instruction is a good idea, although I'd probably be more inclined to use 5 or 10 as a max.
I would not recommend using the wildcard as suggested above. First, Google hits pages about one per second or so and thus doesn't create the problem that Slurp does. Why slow down bots that are already behaving? Second, I don't believe it's true that all spiders even recognize the limiting instruction.
__________________
Psychology Mental Health & Self-Help Forum Online Counseling & Therapy | Mental Health Directory
I agree that some people whose web sites are listed on Google and Yahoo and have those search engine robots coming in everyday and using up their bandwidth because they update everyday and have 190K page views per month and 80 Gigs per month of bandwidth provided by their web hosting service should be concerned about the rate that those bots request pages after spending all that time to get listed and would really like to risk that the robots will take the path of least resistance and leave their site for an easier one.
__________________
GSO http://www.GlobalSpecialOperations.com/
There's no need to try to discourage the bots from spidering your site. You just slow them down a bit.
That's exactly what the crawl-delay instruction does: slows them down to once every 2 or 3 or 5 seconds. You still get the pages indexed that way, which I assume is a desirable thing, no?
__________________
Psychology Mental Health & Self-Help Forum Online Counseling & Therapy | Mental Health Directory
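What "slowing down" means on the bot's side can be sketched as a loop that pauses between requests. This is not how any particular search engine's crawler is implemented. The `crawl_politely` function, the `fetch` callable and the sample paths are hypothetical names used only to show the idea.

```python
import time


def crawl_politely(urls, fetch, crawl_delay, sleep=time.sleep):
    """Fetch every URL in order, pausing crawl_delay seconds between requests.

    `fetch` is any callable that takes a URL and returns a result.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    results = []
    for position, url in enumerate(urls):
        if position > 0:
            sleep(crawl_delay)  # wait before every request except the first
        results.append(fetch(url))
    return results


# Demo with a stand-in fetcher and a recorder in place of real sleeping.
recorded_pauses = []
fetched = crawl_politely(
    ["/a", "/b", "/c"],
    fetch=lambda url: "fetched " + url,
    crawl_delay=5,
    sleep=recorded_pauses.append,
)
print(fetched)
print(recorded_pauses)
```

Every page is still requested. The delay changes only how long the whole visit takes.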
I agree that the name of the game is to do what is necessary to get the robots to crawl your site, especially new pages for listing. I just don't understand why you would want to try to control an already controlled robot that has a default delay written into the program, based on many criteria including server load at the time the bot begins loading pages. If you check your logs and statistics reports, you can find any rogue robots and then deal with them on an individual basis, but to list them all is, in my opinion, not a good practice.
__________________
GSO http://www.GlobalSpecialOperations.com/
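Checking the logs for heavy bots might look something like the sketch below. It assumes the server writes the common "combined" access log format, with the user-agent as the last quoted field. The sample lines, IP addresses and bot names are invented for illustration.

```python
import re
from collections import Counter

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$')

# Invented sample lines; a real check would read the server's access log.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2009:00:00:01 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [01/Jan/2009:00:00:02 +0000] "GET /b.html HTTP/1.1" 200 734 "-" "ExampleBot/1.0"',
    '192.0.2.77 - - [01/Jan/2009:00:00:09 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]


def requests_per_agent(lines):
    """Count requests per user-agent string, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match:
            counts[match.group("agent")] += 1
    return counts


agent_counts = requests_per_agent(SAMPLE_LOG)
for agent, count in agent_counts.most_common():
    print(count, agent)
```

Sorting by request count puts the busiest agents first, which is one way to decide which bots, if any, deserve their own Crawl-delay entry.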
Quote:
For other smaller bots that misbehave, they can just be banned if you like. But it seems to me to be SE suicide to ban one of the big three.
__________________
Psychology Mental Health & Self-Help Forum Online Counseling & Therapy | Mental Health Directory
WebProWorld is an iEntry, Inc. ® site - © 2009 All Rights Reserved Privacy Policy and Legal iEntry, Inc. 2549 Richmond Rd. Lexington KY, 40509