Re: Deluge of the bots
Yeah, that is what I am looking to do, but I need to grab the most suspect IP addresses from the database, and I am unsure what tolerances I should use.
Right now, the query

SELECT IP, COUNT(DISTINCT useragent) AS Agents, COUNT(DISTINCT productID) AS Products
FROM pageviews
GROUP BY IP
ORDER BY Agents DESC;

gives me the following table:
IP               Agents  Products
117.102.128.221  9       1
77.127.155.176   9       1
125.204.58.200   8       1
190.50.125.136   8       1
216.198.139.38   8       1
220.95.108.239   8       1
66.63.219.212    8       1
87.205.215.237   8       1
89.3.18.208      8       1
79.177.161.75    7       1
85.180.253.111   7       1
The first number is the number of unique user agents seen from that IP, and the second is the number of distinct products viewed from it. Note: it is not unusual for legitimate users to view the same product several times, since the page gets re-viewed through AJAX. Also, if a user upgrades their browser, or has multiple computers on a home network, more than one user agent could be logged for a single IP, and I would not want an automated process to block legitimate traffic because of that.
Obviously, using nine different user agents to view a single product is highly suspect, but I am not sure where I should draw the line, or whether there are other aspects I should be considering before deciding which IPs to block.
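To make the question concrete, here is a rough sketch of the kind of cutoff I mean. The table and column names come from my query above; everything else is an assumption for illustration only: the in-memory SQLite database, the function name suspect_ips, the made-up sample rows, and above all the two threshold values, which are placeholders and exactly the numbers I do not know how to pick.

```python
import sqlite3

# Placeholder cutoffs (assumptions, not settled values): flag an IP when it
# shows at least MIN_AGENTS distinct user agents while touching at most
# MAX_PRODUCTS distinct products.
MIN_AGENTS = 5
MAX_PRODUCTS = 1

# Same aggregation as the original query, with a HAVING clause to keep only
# the rows past the cutoffs. IP is added to ORDER BY so ties sort stably.
SUSPECT_QUERY = """
SELECT IP,
       COUNT(DISTINCT useragent) AS Agents,
       COUNT(DISTINCT productID) AS Products
FROM pageviews
GROUP BY IP
HAVING COUNT(DISTINCT useragent) >= ?
   AND COUNT(DISTINCT productID) <= ?
ORDER BY Agents DESC, IP
"""


def suspect_ips(conn, min_agents=MIN_AGENTS, max_products=MAX_PRODUCTS):
    """Return (IP, Agents, Products) rows that exceed the given cutoffs."""
    return conn.execute(SUSPECT_QUERY, (min_agents, max_products)).fetchall()


if __name__ == "__main__":
    # Throwaway in-memory table with invented rows, just to exercise the query.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pageviews (IP TEXT, useragent TEXT, productID INTEGER)"
    )
    rows = [("10.0.0.1", f"agent-{i}", 42) for i in range(9)]
    rows += [
        ("10.0.0.2", "Firefox", 42),
        ("10.0.0.2", "IE", 42),
        ("10.0.0.2", "Firefox", 43),
    ]
    conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?)", rows)
    for ip, agents, products in suspect_ips(conn):
        print(ip, agents, products)
```

With cutoffs like these, a home network with two browsers (the second sample IP) is left alone, while the nine-agent, one-product pattern is flagged. Whether 5 is too strict or too loose is the part I am asking about.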
__________________
The best way to learn anything, is to question everything.
Last edited by wige; 05-12-2008 at 05:32 PM.