Thread: redirect MSIE 6 with htaccess

  1. #1
    Senior Member dgswilson's Avatar
    Join Date
    Jul 2009
    Location
    Texas
    Posts
    286

    redirect MSIE 6 with htaccess

    I'm experimenting with redirecting, as opposed to denying, for certain variables in user agent strings. I chose MSIE 6 because it's used in many (bad) bot strings.

    I've been watching the logs and the code below works, but I'm not sure the [L] flag ends rule processing after the initial redirect.

    I'm going to look for a tool that can request pages from a site using different user agents (browsers etc.). So far I haven't found one, so I will keep looking.

    My questions: Does anyone see anything technically wrong with the directives below, and is there a "user agent" tool available so I can test the directives myself?

    Thanks, Doug Wilson

    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} MSIE\ 6\.0
    RewriteRule .* xxxx://domain dot com/logz/msie6.html [L,R=301]

  2. #2
    If you're trying to spot bad bots, you might use a more general RE

    RewriteCond %{HTTP_USER_AGENT} MSIE\s*6 [NC]

    with the NC to ignore case.

    I've found the perl package LWP::UserAgent useful in simulating browsers.
    You can change the agent name text string to whatever you want


    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request;
    use HTTP::Status;    # exports RC_OK by default
    # ...

    # Fetch $URL while presenting an MSIE 6 User-Agent string.
    # Returns (page, RC_OK) on success or (undef, status line) on failure.
    sub getPage
    {
        my ($URL) = @_;

        my $UA = LWP::UserAgent->new;
        $UA->timeout(30); # times out after 30 seconds
        $UA->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt; .NET CLR 1.1.4322; .NET CLR 2.0.50727)');

        # Request the page
        my $Request = HTTP::Request->new('GET', $URL);
        my $Response = $UA->request($Request);

        if ($Response->is_success)
        {
            # Try to decode the page; fall back to the raw content
            my $Page = $Response->decoded_content();
            $Page = $Response->content unless defined $Page;
            # Return the page
            return ($Page, RC_OK);
        }
        else
        {
            # Return null contents and status
            return (undef, $Response->status_line);
        }
    }


    If you don't use Perl then I'd guess that the same kind of facility is available in other languages. Maybe PHP has this sort of thing?
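
    For anyone not using Perl, here is a rough Python sketch of the same idea using only the standard library. The function names build_request and get_page are made up for illustration, and the User-Agent string is simply copied from the Perl example. Note that urlopen follows redirects, so the final URL it reports should show where a 301 rule sent the request.

```python
import urllib.request

# User-Agent string copied from the Perl example in this post.
MSIE6_AGENT = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt; "
               ".NET CLR 1.1.4322; .NET CLR 2.0.50727)")


def build_request(url, agent=MSIE6_AGENT):
    """Build a GET request that presents the given User-Agent string."""
    return urllib.request.Request(url, headers={"User-Agent": agent})


def get_page(url, agent=MSIE6_AGENT, timeout=30):
    """Fetch url as `agent` and return (final_url, status, body_bytes).

    urlopen follows redirects, so final_url is where the request ended up.
    HTTP error statuses (403, 500, ...) raise urllib.error.HTTPError.
    """
    request = build_request(url, agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl(), response.status, response.read()
```

    Calling get_page with your own site's address (and any agent string you want to try) and comparing the returned final URL with the one you asked for is one way to see whether the redirect fired.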

    Hope this helps.
    There may be no such thing as a silly question but the world is littered with silly answers.
    Find some more at http://www.artsanddesigns.com/cgi-bin/makeBlog.pl

  3. #3
    WebProWorld MVP deepsand's Avatar
    Join Date
    May 2004
    Location
    State College, PA
    Posts
    16,481
    Quote Originally Posted by dgswilson View Post
    I'm going to look for a "test site with different user agents (browsers etc.) tool".
    Assuming that you're looking for a third-party site that will make requests to your site as various user agents, try Rex Swain's HTTP Viewer. You can specify any User-Agent string of your choosing.

  4. #4
    Senior Member NetProwler's Avatar
    Join Date
    Jan 2007
    Posts
    197
    Firefox has a User Agent Switcher extension available which will allow you to test any user agent of your choice. I am sorry to have stated the obvious. If you are looking for an external tool, as jammybiskit has said, the Perl LWP module will be a worthy choice to cook up a quick and dirty solution.

  5. #5
    Senior Member dgswilson's Avatar
    Join Date
    Jul 2009
    Location
    Texas
    Posts
    286
    Thanks people,

    If you're trying to spot bad bots, you might use a more general RE
    Not so much spotting bots and other critter activity as finding ways to avoid blocking IPs via htaccess. Could you explain this (\s*6) for me?

    I did find the "Firefox user agent switcher" so got to test a bit. I'll look at Rex Swain (sounds familiar).

    I'll tell you what I found after a few days of viewing logs:

    Some bots with MSIE 6.0 in user agent string still got in. Some switched to MSIE 7 immediately. No real visitors got through so I toned it down and left deny for MSIE 4,3,2 etc.

    I've also been using and watching results for this:

    Deny from env=bad_bot

    SetEnvIfNoCase User-Agent ^-?$ bad_bot

    BrowserMatchNoCase Wget bad_bot
    BrowserMatchNoCase Curl bad_bot

    This seems to be working well and affects 0 real visitors. I'm also using some proxy and request directives which work for some bots. I can't use Perl or PHP stuff yet. Just starting to understand htaccess. Since I'm asking questions, here are a couple more. Which of the two {HTTP_REFERER} lines below is more dependable (better)?

    Code:
    RewriteCond %{HTTP_REFERER} domain\.com [NC,OR]
    RewriteCond %{HTTP_REFERER} ^xxxx://xxx\.crap\.com [NC]
    RewriteRule ^(.*)$ - [F]
    Question #2: I have a 500 error problem which occurs on clipbucket install. Error occurs only with user agents and only on /file/videos...

    I know this should be simple. I've been looking, off and on, for months and can't fix it. If anyone wants to take a crack at it (paid) email me.

    Thanks much for the responses, Doug

  6. #6
    Senior Member dgswilson's Avatar
    Join Date
    Jul 2009
    Location
    Texas
    Posts
    286
    So far the MSIE code is working great. I did have to change the UA code a bit. It probably stops 80% of foragers.

    Fixed the problem with Clipbucket.

  7. #7
    I always learn something from the replies - I'd forgotten about Rex Swain and didn't know about the FF extension - thanks!

    Could you explain this (\s*6) for me?
    The \s* is just a way of matching 0 or more whitespace characters (spaces, tabs and so on) that a nasty bot might use, e.g. MSIE\s*6 would match MSIE6, MSIE 6, MSIE  6 (two spaces) and so on. The '6' is just a literal which will match 6 (but not anything else). To exactly match 6.0 you could use 6\.0 but this might stop the matches of 6 on its own. If so, you could use 6.* in place of 6.
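
    To make the \s* behaviour concrete, here is a small Python sketch that runs the same pattern against a few made-up user agent fragments. Python's re module is standing in for Apache's regex engine here, which is an assumption that should hold for a pattern this simple, and re.IGNORECASE plays the part of [NC].

```python
import re

# Same pattern as the suggested RewriteCond; IGNORECASE stands in for [NC].
pattern = re.compile(r"MSIE\s*6", re.IGNORECASE)

# Made-up fragments of user agent strings, for illustration only.
samples = ["MSIE6", "MSIE 6.0", "msie  6", "MSIE\t6", "MSIE 7.0", "MSIE 16"]

# search() looks for the pattern anywhere in the string, like an
# unanchored RewriteCond pattern.
matches = {ua: bool(pattern.search(ua)) for ua in samples}

for ua, hit in matches.items():
    print(repr(ua), "matches" if hit else "no match")
```

    The first four samples match and the last two do not, because the 6 has to come straight after the optional whitespace.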

    RewriteCond %{HTTP_REFERER} domain\.com [NC,OR]
    RewriteCond %{HTTP_REFERER} ^xxxx://xxx\.crap\.com [NC]
    RewriteRule ^(.*)$ - [F]
    I'm afraid I don't understand your question about which line is better. Both the RewriteCond lines you give may be tested due to the OR.

  8. #8
    Senior Member dgswilson's Avatar
    Join Date
    Jul 2009
    Location
    Texas
    Posts
    286
    jammybiskit - yeah, I forgot about rexswain too ...

    I changed things a little and it's catching 2 thru 6:
    MSIE\ ([23456])\. [NC]
    so if I wanted to include any added spaces I'd do this?
    MSIE\s* ([23456])\. [NC]
    Not positive it catches all (2d) and so on
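
    A quick way to compare the variants is to run them in Python, again assuming its re module behaves like Apache's engine for patterns this simple. The sample strings are made up. The third pattern, with no space at all after \s*, is my suggested alternative rather than something from the thread. One caution, stated as my understanding of Apache's config parsing rather than something tested here: an unescaped space inside a RewriteCond pattern separates arguments, so the second form is best avoided in htaccess.

```python
import re

FLAGS = re.IGNORECASE  # stands in for [NC]

patterns = {
    "escaped space": re.compile(r"MSIE\ ([23456])\.", FLAGS),     # exactly one space
    "star plus space": re.compile(r"MSIE\s* ([23456])\.", FLAGS),  # still needs a space
    "star only": re.compile(r"MSIE\s*([2-6])\.", FLAGS),           # zero or more whitespace
}

# Made-up fragments of user agent strings, for illustration only.
samples = ["MSIE 6.0", "MSIE6.0", "MSIE  5.5", "MSIE 2.0d", "MSIE 7.0", "MSIE 10.0"]


def check(ua):
    """Return which of the three patterns find a match somewhere in ua."""
    return {name: bool(p.search(ua)) for name, p in patterns.items()}


results = {ua: check(ua) for ua in samples}
for ua, row in results.items():
    print(repr(ua), row)
```

    In this sketch "MSIE 2.0d" is caught by all three patterns, since only the "2." part has to match, while "MSIE6.0" is caught only by the last one.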

    Now I need to figure out how to turn these away.
    Mozilla/0.6 Beta (Windows)
    Mozilla/0.91 Beta (Windows)
    Need something unique in there. Maybe "Beta", "(Windows)" or "Beta (Windows)" ?
    I'll have to try it out in a specific folder and see what happens.

    On the referer question, I was asking about "starts with"
    ^xxxx://xxx\.crap\.tld [NC]

    as opposed to
    crap\.tld [NC]

    I was thinking
    crap\.tld [NC]

    might include
    mycrap.crap\.tld [NC]

    without having to add
    mycrap\.crap\.tld
    bobscrap\.crap\.tld
    janescrap\.crap\.tld
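
    The difference can be sketched in Python, assuming its re module behaves like Apache's engine for these patterns. I am also assuming that xxxx stands for http and xxx for www, and all the referer strings are made up. The third pattern is a hypothetical tighter alternative, not something from the thread.

```python
import re

FLAGS = re.IGNORECASE  # stands in for [NC]

# Assumption: "xxxx" in the thread means http and "xxx" means www.
anchored = re.compile(r"^http://www\.crap\.tld", FLAGS)
unanchored = re.compile(r"crap\.tld", FLAGS)
# Hypothetical tighter form: the host is crap.tld or any subdomain of it.
host_only = re.compile(r"^https?://([^/]+\.)?crap\.tld(/|$)", FLAGS)

# Made-up referers, for illustration only.
referers = [
    "http://www.crap.tld/page",
    "http://mycrap.crap.tld/",
    "http://notcrap.tld/",
    "http://example.com/?ref=crap.tld",
]


def check(referer):
    """Return (anchored, unanchored, host_only) match results as booleans."""
    return tuple(bool(p.search(referer)) for p in (anchored, unanchored, host_only))


results = {r: check(r) for r in referers}
for referer, row in results.items():
    print(referer, row)
```

    So the short crap\.tld form does cover every subdomain without listing them, but it also matches any referer that merely contains that text, such as notcrap.tld or a query string that mentions it.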
