If you are on Apache, and .htaccess modules are activated, keep bad bots out of your site, adding the following rules:
Code:
### Deny Fake Bots ###
BrowserMatch "^Java/?[1-9_\.]*" bad_bot
BrowserMatch "^MJ12bot/?[1-9_\.]*" bad_bot
SetEnvIfNoCase User-Agent "8484 Boston Project v 1.0" bad_bot
SetEnvIfNoCase User-Agent "charlotte/" bad_bot
SetEnvIfNoCase User-Agent "curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5" bad_bot
SetEnvifNoCase User-Agent "ISC Systems iRc Search 2.1" bad_bot
SetEnvIfNoCase User-Agent "^Jakarta\ Commons-HttpClient/" bad_bot
SetEnvIfNoCase User-Agent "larbin/" bad-bot
SetEnvIfNoCase User-Agent "libwww-perl/" bad_bot
SetEnvIfNoCase User-Agent "^libcurl-agent/" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft\ URL\ Control.*$" bad_bot
SetEnvIfNoCase User-Agent "MJ12bot/v1.0.8" bad_bot
SetEnvIfNoCase User-Agent "^Missigua" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4\.0\ .*Win\ 9x\ 4\.90.*$" bad_bot
SetEnvIfNoCase User-Agent "Nutch" bad_bot
SetEnvIfNoCase User-Agent "phpversion" bad_bot
SetEnvIfNoCase User-Agent "TencentTraveler" bad_bot
SetEnvIfNoCase User-Agent "^Web Downloader" bad_bot
<FilesMatch "(.*)">
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</FilesMatch>
and
Code:
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ADSARobot|ah-ha|almaden|aktuelles|Anarchie|amzn_assoc|Arachmo|ASPSeek|ASSORT|ATHENS|Atomz|attach|attache|autoemailspider|BackWeb|Bandit|BatchFTP|bdfetch|BecomeBot|big.brother|BlackWidow|bmclient|Boston\ Project|bot/1.0|BravoBrian\ SpiderEngine\ MarcoPolo|Bot\ mailto:craftbot@yahoo.com|Buddy|Bullseye|bumblebee|capture|CherryPicker|ChinaClaw|CICC|clipping|Clushbot|Collector|Copier|Crescent|Crescent\ Internet\ ToolPak|Custo|cyberalert|Deweb|diagem|Digger|Digimarc|DIIbot|DISCo|DISCo\ Pump|DISCoFinder|Download\ Demon|Download\ Wonder|Downloader|Drip|DSurf15a|DTS.Agent|EasyDL|eCatch|ecollector|efp@gmx\.net|Email\ Extractor|EirGrabber|email|EmailCollector|EmailSiphon|EmailWolf|Express\ WebPictures|ExtractorPro|EyeNetIE|FavOrg|fastlwspider|Favorites\ Sweeper|Fetch|FEZhead|FileHound|FlashGet\ WebWasher|FlickBot|fluffy|FrontPage|GalaxyBot|Generic|Getleft|GetRight|GetSmart|GetWeb!|GetWebPage|gigabaz|Girafabot|Go\!Zilla|Go!Zilla|Go-Ahead-Got-It|GornKer|gotit|Grabber|GrabNet|Grafula|Green\ Research|grub-client|Harvest|hhjhj@yahoo|hloader|HMView|HomePageSearch|http\ generic|HTTrack|httpdown|httrack|ia_archiver|IBM_Planetwide|Image\ Stripper|Image\ Sucker|imagefetch|IncyWincy|Indy*Library|Indy\ Library|informant|Ingelin|InterGET|Internet\ Ninja|InternetLinkagent|Internet\ Ninja|InternetSeer\.com|Iria|Irvine|JBH*agent|JetCar|JOC|JOC\ Web\ Spider|JustView|kalooga|KWebGet|Lachesis|larbin|Leacher|LeechFTP|LexiBot|lftp|libwww|likse|Link|Link*Sleuth|LINKS\ ARoMATIZED|LinkWalker|LWP|lwp-trivial|Mag-Net|Magnet|Mac\ Finder|Mag-Net|Mass\ Downloader|MCspider|MJ12bot/v1\.0\.8|Memo|Microsoft.URL|MIDown\ tool|Mirror|Missigua\ Locator|Mister\ PiX|MMMtoCrawl\/UrlDispatcherLLL|^Mozilla$|Mozilla.*Indy|Mozilla.*NEWT|Mozilla*MSIECrawler|MS\ FrontPage*|MSFrontPage|MSIECrawler|MSProxy|MSR-ISRCCrawler|multithreaddb|my-heritrix-crawler|nationaldirectory|Navroad|NearSite|NetAnts|NetCarta|NetMechanic|netprospector|NetResearchServer|NetSpider|Net\ Vampire|NetZIP|NetZip\ Downloader|NetZippy|NEWT|NICErsPRO|Ninja|NPBot|NicheBot|noxtrumbot|Octopus|Offline\ Explorer|Offline\ Navigator|OpaL|Openfind|OpenTextSiteCrawler|OrangeBot|PageGrabber|Papa\ Foto|PackRat|pavuk|pcBrowser|PersonaPilot|Ping|PingALink|Pingdom|Pockey|POE-Component-Client-HTTP|Powermarks|Proxy|psbot|PSurf|psycheclone|puf|Pump|PushSite|QRVA|RealDownload|Reaper|Recorder|ReGet|replacer|RepoMonkey|Robozilla|Rover|RPT-HTTPClient|Rsync|Scooter|SearchExpress|searchhippo|searchterms\.it|Second\ Street\ Research|Seeker|Shai|Siphon|sitecheck|sitecheck.internetseer.com|SiteSnagger|SlySearch|SmartDownload|snagger|Snake|SpaceBison|Spegla|SpiderBot|sproose|SqWorm|Stripper|Sucker|SuperBot|SuperHTTP|Surfbot|SurfWalker|Szukacz|tAkeOut|tarspider|Teleport\ Pro|Templeton|TrueRobot|TV33_Mercator|UIowaCrawler|UtilMind|URLSpiderPro|URL_Spider_Pro|Vacuum|vagabondo|vayala|visibilitygap|VoidEYE|vspider|Web\ Downloader|w3mir|Web\ Data\ Extractor|Web\ Image\ Collector|Web\ Sucker|Wweb|WebAuto|WebBandit|web\.by\.mail|Webclipping|webcollage|webcollector|WebCopier|webcraft@bea|webdevil|webdownloader|Webdup|WebEMailExtrac|WebFetch|WebGo\ IS|WebHook|Webinator|WebLeacher|WEBMASTERS|WebMiner|WebMirror|webmole|WebReaper|WebSauger|Website|Website\ eXtractor|Website\ Quester|WebSnake|Webster|WebStripper|websucker|webvac|webwalk|webweasel|WebWhacker|WebZIP|Wget|Whacker|whizbang|WhosTalking|Widow|WISEbot|WWWOFFLE|x-Tractor|^Xaldon\ WebSpider|WUMPUS|Xenu|XGET|Yeti|zermelo|Zeus.*Webster|Zeus [NC]
RewriteRule ^.* - [F,L]
Try both together. If you get a server error, test each one separately to see which works.
I did not add the bot your mentioned here, since I did not investigate it yet.
In addition, do yourself a favor and support us at Distributed Spam Harvester Tracking Network | Project Honey Pot (Free - No membeship fees).
I can only tell that we have 98% less spambots attacks, and we catch some if not all of the left 2% with the help of the honeypot.
You will be amazed.
Good luck,
John
P.S. I am writing an article which I will publish soon on my site.