[quote="edhan"]I am already doing that. Thanks everybody.[/quote]
I've never really looked into blocking bad bots, as it's not much of a problem on the sites I've worked on, but I'm sure you can also block by IP range. So you could do primary checks by user agent, and any bots that get through could be checked by IP address. Still not 100% foolproof, though: this won't keep them all away, because a really mean bot might shed its skin and fake the User-Agent anyway.
I had to put an IP blocking scheme in my blog's comment folder to block some spambots. It's a PITA, but you have to have several layers to catch them.
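For what it's worth, layering the two checks might look roughly like this in .htaccess. This is only a sketch, assuming Apache 2.2-style access directives (the same `deny from` syntax used elsewhere in this thread). `BadBotExample` is a made-up user-agent name, and 192.0.2.0/24 is a documentation-only address range standing in for whatever range actually shows up in your logs.

```apache
# First layer: flag requests by user-agent (hypothetical name).
SetEnvIfNoCase User-Agent "BadBotExample" bad_bot=1

# Allow everyone by default, then apply the Deny rules below.
Order Allow,Deny
Allow from all
Deny from env=bad_bot

# Second layer: block a whole IP range (placeholder range, replace with your own).
Deny from 192.0.2.0/24
```

On Apache 2.4 without mod_access_compat, the equivalent would have to be written with `Require` directives instead.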
I use "Spam Karma 2" for my blog, and I am very happy with it. :)
Hi there,
Just to be on the safe side: Andilinks said that adding this piece of code to .htaccess will help keep 'bad bots' away
RewriteEngine on
SetEnvIf User-Agent ^FunWebProducts bad_bot=1
deny from env=bad_bot
by substituting FunWebProducts with the user-agent name? Do I understand correctly?
So, if I want to avoid '8484 Boston Project v 1.0 1836' then I would write
RewriteEngine on
SetEnvIf User-Agent ^8484 Boston Project v 1.0 1836 bad_bot=1
deny from env=bad_bot
??
thanks
The "^" symbol indicates "begins with," so "^f" would block all user-agents beginning with "f," "^fun" would block all user-agents that begin with "fun," and so on, getting increasingly selective with each additional character. I'm not sure what a space character would do here; maybe someone who is more familiar with Apache's SetEnvIf syntax can jump in with the answer to that.
Like I said above, the server is very unforgiving of errors, so it pays to be careful. I have successfully used this code without any spaces in the user-agent name.
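On the space question: as far as I know, SetEnvIf splits its arguments on whitespace, so a pattern containing spaces has to be wrapped in double quotes. SetEnvIf also comes from mod_setenvif rather than mod_rewrite, so the `RewriteEngine on` line should not be needed for it. A sketch of what the rule for that bot could look like, with the dot escaped so it matches a literal "." (this assumes the user-agent string really starts with "8484", which is worth checking against the raw logs):

```apache
# Quote the pattern because it contains spaces; \. matches a literal dot.
SetEnvIf User-Agent "^8484 Boston Project v 1\.0 1836" bad_bot=1
Deny from env=bad_bot
```

Leaving the "^" off makes the pattern match anywhere in the user-agent string, which may be the safer choice when the name is buried in the middle of a longer string.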
Adding lines to the .htaccess file does add processing time to every page served, so it would not be wise to block a wholesale list of bots as suggested in the original post. It is better to watch for specific bots that misbehave and block them individually. I have successfully run with an .htaccess file as large as 24K, but in my experience a file that size slows down delivery, and too many lines can destabilize or even crash the server, so it pays to be very careful with it. I currently try to keep my .htaccess file under 8K.
Again, maybe a sysadmin-type guy could add a more authoritative opinion here; I'm just a site owner. :)
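One way to keep the file short while still blocking bots individually is to fold several names into a single pattern with alternation, so there is one directive instead of one per bot. A rough sketch only: `BadBotOne` and `BadBotTwo` are invented names, not real user agents, and I have not measured how much this actually saves.

```apache
# One rule covering several misbehaving bots (hypothetical names).
SetEnvIfNoCase User-Agent "^(BadBotOne|BadBotTwo)" bad_bot=1
Deny from env=bad_bot
```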
...the Rockies may tumble, Gibraltar may crumble... G & I Gershwin, 1937
thank you...
Use
SetEnvIfNoCase
to also catch lowercase variants such as ^funwebproducts.
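In other words, something like the following, which is the earlier rule with only the directive name changed. SetEnvIfNoCase does the same matching but ignores case, so `FunWebProducts`, `funwebproducts` and `FUNWEBPRODUCTS` would all be treated alike:

```apache
# Case-insensitive version of the original rule.
SetEnvIfNoCase User-Agent ^FunWebProducts bad_bot=1
deny from env=bad_bot
```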