05-12-2008, 05:28 PM
wige
Moderator
WebProWorld Moderator
 
Join Date: Jun 2006
Location: United States
Posts: 2,629
Re: Deluge of the bots

Yeah, that is what I am looking to do, but I need to grab the most suspect IP addresses from the database, and I am unsure what tolerances I should use.

Right now, the query

SELECT IP, COUNT(DISTINCT useragent) AS Agents, COUNT(DISTINCT productID) AS Products FROM pageviews GROUP BY IP ORDER BY Agents DESC;

gives me the following table:

IP               Agents  Products
117.102.128.221  9       1
77.127.155.176   9       1
125.204.58.200   8       1
190.50.125.136   8       1
216.198.139.38   8       1
220.95.108.239   8       1
66.63.219.212    8       1
87.205.215.237   8       1
89.3.18.208      8       1
79.177.161.75    7       1
85.180.253.111   7       1

The first number is the count of distinct user agents logged for that IP, and the second is the count of distinct products it viewed. Note that it is not unusual for a legitimate user to view the same product several times, because the page gets re-viewed through AJAX. Also, a user who upgrades their browser, or who has several computers on a home network, could have more than one user agent logged per IP, and I would not want an automated process to block legitimate traffic as a result.

Using nine different user agents to view a single product is obviously highly suspect, but I am not sure where I should draw the line, or whether there are other aspects I should consider before deciding which IPs to block.
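
One way to experiment with where to draw that line is to add a HAVING clause to the query above and try different cutoffs. The sketch below is only an illustration: the cutoff values (MIN_AGENTS, MAX_PRODUCTS), the helper names, the in-memory SQLite database, and the sample rows are all assumptions made up for the example, not values taken from the real pageviews table or a recommendation for what to block.

```python
import sqlite3

# Hypothetical cutoffs, chosen only for illustration; tune against real traffic.
MIN_AGENTS = 5     # flag IPs that presented at least this many distinct user agents
MAX_PRODUCTS = 1   # ...while viewing no more than this many distinct products

# Same aggregation as the query in the post, with a HAVING filter added.
SUSPECT_QUERY = """
SELECT IP,
       COUNT(DISTINCT useragent) AS Agents,
       COUNT(DISTINCT productID) AS Products
FROM pageviews
GROUP BY IP
HAVING COUNT(DISTINCT useragent) >= ?
   AND COUNT(DISTINCT productID) <= ?
ORDER BY Agents DESC, IP;
"""


def suspect_ips(conn, min_agents=MIN_AGENTS, max_products=MAX_PRODUCTS):
    """Return (IP, Agents, Products) rows that meet both cutoffs."""
    return conn.execute(SUSPECT_QUERY, (min_agents, max_products)).fetchall()


def demo_connection():
    """Build an in-memory pageviews table with made-up rows (not real log data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pageviews (IP TEXT, useragent TEXT, productID INTEGER)"
    )
    # Nine different user agents hitting one product from a single IP.
    rows = [("192.0.2.1", f"agent-{i}", 42) for i in range(9)]
    # Two machines on a home network, each viewing a different product.
    rows += [("192.0.2.2", "agent-a", 1), ("192.0.2.2", "agent-b", 2)]
    # One user whose page was re-requested several times.
    rows += [("192.0.2.3", "agent-a", 7)] * 4
    conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for row in suspect_ips(demo_connection()):
        print(row)
```

Running the query with a few different cutoffs and reviewing the flagged IPs by hand before blocking anything should show whether a given setting catches the home-network and browser-upgrade cases described above.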
__________________
The best way to learn anything, is to question everything.

Last edited by wige; 05-12-2008 at 05:32 PM.