We recently prevented an attack with one of the rules we use in our .htaccess file, in addition to our firewall settings and just thought of sharing. We have absolutely no bad bots on our site since we use all this:
Code:
##################################################
########## Created by John. S. Britsios ##########
########## SEO Workers Search Engine Optimization Consulting Company ##########
##################################################
##### Security settings #####
## LIMIT UPLOAD FILE SIZE TO PROTECT AGAINST DOS ATTACK by limiting file size to 0-2147483647 bytes, (2GB)###
LimitRequestBody 10240000
### Prevent .htaccess, .htpasswd and other files from being viewed by web clients ###
<FilesMatch "\.(htaccess|htpasswd|ini|phps|fla|psd|log|sh)$">
Order Allow,Deny
Deny from all
</FilesMatch>
RewriteEngine On
Options +FollowSymLinks
ServerSignature Off
RewriteCond %{REQUEST_URI} ^/(,|;|:|<|>|”>|”<|/|\\\.\.\\).{0,9999}.* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(;|<|>|’|”|\)|%0A|%0D|%22|%27|%3C|%3E|%00).*(/\*|union|select|insert|cast|set|declare|drop|update|md5|benchmark).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(<|>|’|%0A|%0D|%27|%3C|%3E|%00).* [NC]
RewriteRule ^(.*)$ http://www.kissmyassyousonofabitch.com
## DENY REQUEST BASED ON REQUEST METHOD ###
# Check here before using HTTP/1.1: Method Definitions #
#RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK|OPTIONS|HEAD)$ [NC]
#RewriteRule ^.*$ - [F]
### Deny Fake Bots ###
BrowserMatch "^Java/?[1-9_\.]*" bad_bot
BrowserMatch "^MJ12bot/?[1-9_\.]*" bad_bot
SetEnvIfNoCase User-Agent "^8484 Boston Project/?[1-9_\.]*" bad_bot
SetEnvIfNoCase User-Agent "charlotte/" bad_bot
SetEnvIfNoCase User-Agent "curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5" bad_bot
SetEnvifNoCase User-Agent "^Heritrix/" bad_bot
SetEnvIfNoCase User-Agent "ia_archiver" bad_bot
SetEnvIfNoCase User-Agent "larbin/" bad-bot
SetEnvIfNoCase User-Agent "libwww-perl"" bad_bot
SetEnvIfNoCase User-Agent "^libcurl-agent/" bad_bot
SetEnvifNoCase User-Agent "IRC-Bbot" bad_bot
SetEnvifNoCase User-Agent "ISC Systems iRc Search 2.1" bad_bot
SetEnvIfNoCase User-Agent "^Jakarta\ Commons-HttpClient/" bad_bot
SetEnvIfNoCase User-Agent "^Java/" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft\ URL\ Control.*$" bad_bot
SetEnvIfNoCase User-Agent "^MJ12bot/" bad_bot
SetEnvIfNoCase User-Agent "MJ12bot/v1.0.8" bad_bot
SetEnvIfNoCase User-Agent "^Missigua Locator" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4\.0\ .*Win\ 9x\ 4\.90.*$" bad_bot
SetEnvIfNoCase User-Agent "Nutch" bad_bot
SetEnvIfNoCase User-Agent "^PEAR HTTP_Request class" bad_bot
SetEnvIfNoCase User-Agent "phpversion" bad_bot
SetEnvIfNoCase User-Agent "^psycheclone" bad_bot
SetEnvIfNoCase User-Agent "^TencentTraveler" bad_bot
SetEnvIfNoCase User-Agent "^Web Downloader" bad_bot
SetEnvIfNoCase User-Agent "^Wells Search II" bad_bot
SetEnvIfNoCase User-Agent "^WEP Search 00" bad_bot
<FilesMatch "(.*)">
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</FilesMatch>
RewriteEngine on
RewriteBase /
# Known Bad Bots
RewriteCond %{HTTP_USER_AGENT} ADSARobot|ah-ha|almaden|aktuelles|Anarchie|amzn_assoc|Arachmo|ASPSeek|ASSORT|ATHENS|Atomz|attach|attache|autoemailspider|BackWeb|Bandit|BatchFTP|bdfetch|Bbot|BecomeBot|big.brother|Bitacle|BlackWidow|bmclient|boitho.com-dc|Boston\ Project|bot/1.0|BravoBrian\ SpiderEngine\ MarcoPolo|Bot\ mailto:craftbot@yahoo.com|Buddy|Bullseye|bumblebee|capture|CherryPicker|ChinaClaw|CICC|clipping|Clushbot|Collector|Copier|Crescent|Crescent\ Internet\ ToolPak|Custo|cyberalert|Deweb|diagem|Digger|Digimarc|DIIbot|DISCo|DISCo\ Pump|DISCoFinder|Download\ Demon|Download\ Wonder|Downloader|Drip|DSurf15a|DTS.Agent|EasyDL|eCatch|ecollector|efp@gmx\.net|Email\ Extractor|EirGrabber|email|EmailCollector|EmailSiphon|EmailWolf|Express\ WebPictures|ExtractorPro|EyeNetIE|FavOrg|fastlwspider|Favorites\ Sweeper|Fetch|FEZhead|FileHound|FlashGet\ WebWasher|FlickBot|fluffy|FrontPage|GalaxyBot|Generic|Getleft|GetRight|GetSmart|GetWeb!|GetWebPage|gigabaz|Girafabot|Go\!Zilla|Go!Zilla|Go-Ahead-Got-It|GornKer|gotit|Grabber|GrabNet|Grafula|Green\ Research|grub-client|Harvest|heritrix|hhjhj@yahoo|hloader|HMView|HomePageSearch|http\ generic|HTTrack|httpdown|httrack|ia_archiver|IBM_Planetwide|Image\ Stripper|Image\ Sucker|imagefetch|IncyWincy|Indy*Library|Indy\ Library|informant|Ingelin|InterGET|Internet\ Ninja|InternetLinkagent|Internet\ Ninja|InternetSeer\.com|Iria|Irvine|JBH*agent|JetCar|JOC|JOC\ Web\ Spider|JustView|kalooga|KWebGet|Lachesis|larbin|Leacher|LeechFTP|LexiBot|lftp|likse|Link|Link*Sleuth|LINKS\ ARoMATIZED|LinkWalker|LWP|lwp-trivial|Mag-Net|Magnet|Mac\ Finder|Mag-Net|Mass\ Downloader|MCspider|MJ12bot/v1\.0\.8|Memo|Microsoft.URL|MIDown\ tool|Mirror|Missigua\ Locator|Mister\ PiX|MMMtoCrawl\/UrlDispatcherLLL|monit|^Mozilla$|Mozilla.*Indy|Mozilla.*NEWT|Mozilla*MSIECrawler|MS\ FrontPage*|MSFrontPage|MSIECrawler|MSProxy|MSR-ISRCCrawler|multithreaddb|my-heritrix-crawler|nationaldirectory|Navroad|NearSite|NetAnts|NetCarta|NetMechanic|netprospector|NetResearchServer|NetSpider|Net\ Vampire|NetZIP|NetZip\ Downloader|NetZippy|NEWT|NICErsPRO|Ninja|NPBot|NicheBot|noxtrumbot|Octopus|Offline\ Explorer|Offline\ Navigator|OmniExplorer|OpaL|Openfind|OpenTextSiteCrawler|OrangeBot|PageGrabber|Papa\ Foto|PackRat|pavuk|pcBrowser|PersonaPilot|Ping|PingALink|Pingdom|Pockey|POE-Component-Client-HTTP|Powermarks|Proxy|psbot|PSurf|psycheclone|puf|Pump|PushSite|QRVA|RealDownload|Reaper|Recorder|ReGet|replacer|RepoMonkey|Robozilla|Rover|RPT-HTTPClient|Rsync|Scooter|SearchExpress|searchhippo|searchterms\.it|Second\ Street\ Research|Seeker|Shai|Siphon|sitecheck|sitecheck.internetseer.com|SiteSnagger|SlySearch|SmartDownload|snagger|Snake|SpaceBison|Spegla|SpiderBot|sproose|SqWorm|Stripper|Sucker|SuperBot|SuperHTTP|Surfbot|SurfWalker|Szukacz|tAkeOut|tarspider|Teleport\ Pro|Templeton|TencentTraveler|TrueRobot|TV33_Mercator|UIowaCrawler|UtilMind|URLSpiderPro|URL_Spider_Pro|Vacuum|vagabondo|vayala|visibilitygap|VoidEYE|vspider|Web\ Downloader|w3mir|Web\ Data\ Extractor|Web\ Image\ Collector|Web\ Sucker|Wweb|WebAuto|WebBandit|web\.by\.mail|Webclipping|webcollage|webcollector|WebCopier|webcraft@bea|webdevil|webdownloader|Webdup|WebEMailExtrac|WebFetch|WebGo\ IS|WebHook|Webinator|WebLeacher|WEBMASTERS|WebMiner|WebMirror|webmole|WebReaper|WebSauger|Website|Website\ eXtractor|Website\ Quester|WebSnake|Webster|WebStripper|websucker|webvac|webwalk|webweasel|WebWhacker|WebZIP|Wget|Whacker|whizbang|WhosTalking|Widow|WinHTTP|WISEbot|WWWOFFLE|x-Tractor|^Xaldon\ WebSpider|WUMPUS|Xenu|XGET|Yeti|zermelo|Zeus.*Webster|Zeus [NC]
RewriteRule ^.* - [F,L]
# Bots starting with Web
RewriteCond %{HTTP_USER_AGENT} ^web(zip|emaile|enhancer|fetch|go.?is|auto|bandit|clip|copier|master|reaper|sauger|site.?quester|whack) [NC,OR]
# Anywhere in UA -- Greedy REGEX
RewriteCond %{HTTP_USER_AGENT} ^.*(craftbot|download|extract|stripper|sucker|ninja|clshttp|webspider|leacher|collector|grabber|webpictures).*$ [NC]
RewriteRule ^.* - [F,L]