302 redirects
chandrika
03-18-2008, 12:58 PM
Looking at my website stats, I am seeing something odd.
I have no 302 redirects set up on my website at all, and yet my stats show 31586 hits (77%) that are 302 temporary redirects.
I am feeling very uneasy about a few things I have spotted in the search engines recently, as someone is using a script that seems to be scraping my site.
This can be seen in the results here
php? www.findallsorts.com - Google Search (http://www.google.co.uk/search?q=php%3F+www.findallsorts.com&hl=en&start=10&sa=N)
Where a couple of sites have some script thing going on involving my website, the listed pages they have do not actually exist, just seem to be doing something from my site findallsorts.com.
Would this explain all the 302s in my stats, and does anyone have any advice on this?
Taking a look at the site, the first thing I noticed is that it does not do proper input validation, as the following exploit URL demonstrates:
2 Power (http://www.findallsorts.com/shopUK/search.php?q=2+Power+%3C/title%3E%3Cscript%3Ealert%28%27This%20is%20bad%21%27%29%3C/script%3E)
That will cause a JavaScript alert box to be displayed, which is a very bad thing: it shows that the search query is written into the page unescaped, so anyone can inject their own script into it.
Second, I noticed that any time a product is not found, either because the page no longer exists or because an incorrect product is entered, the visitor is redirected to a search page (that is the page exploited above). This redirection is done through a 302 temporary redirect.
chandrika
03-18-2008, 01:50 PM
Wow, that IS bad!
Is there any way I can fix that? What have I done wrong to allow such things?
Because products change so frequently and sometimes get spidered and then are gone, I did a 404.shtml page, and so that is the page that shows for 404.
Is there a better way to get my 404 page to show for pages that have gone? There are too many to do individual 301s.
I think I really need to do some work on the security of the site. It's a work in progress and I am still setting it all up. I probably shouldn't have let it be spidered until I had it finished, due to the frequent changes going on.
chandrika
03-18-2008, 01:52 PM
Is this what you refer to?
Input Validation (http://tapestry.apache.org/tapestry4/UsersGuide/validation.html)
Wow, that IS bad!
Is there any way I can fix that? What have I done wrong to allow such things?
The site seems to be based on a CMS that has not secured all of its fields properly. The field in question is the title generated by the search.php script, which needs to be edited so that HTML special characters in the search query are stripped or escaped before the title is written out.
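For what it's worth, the kind of fix described above might look roughly like this. It is only a sketch, written in Python rather than the PHP the site actually runs, and render_title and the "Search results for" wording are made-up names for illustration, not anything taken from the CMS.

```python
from html import escape


def render_title(query: str) -> str:
    """Build the <title> element for a search page from untrusted input.

    Hypothetical helper: the real CMS will have its own template code.
    """
    # escape() replaces &, < and > with HTML entities, and with the default
    # quote=True it also replaces double and single quotes.
    return "<title>Search results for " + escape(query) + "</title>"


# The query string from the exploit link, URL-decoded.
payload = "2 Power </title><script>alert('This is bad!')</script>"
print(render_title(payload))
```

With the query escaped, the browser shows the injected markup as literal text in the title instead of running it. The same escaping would be needed anywhere else the query is echoed back into the page.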
Because products change so frequently and sometimes get spidered and then are gone, I did a 404.shtml page, and so that is the page that shows for 404.
Is there a better way to get my 404 page to show for pages that have gone? There are too many to do individual 301s.
Again, I think this is a problem in the underlying CMS system. If I enter a random URL, I get the 404.shtml page that you created. However, if I go to a product page, then change the product name in the URL, I am sent to the search function provided as part of the CMS. This is where the 302 redirects are occurring. I think your best option would be contacting whoever built the CMS system to get these issues resolved. It should be a trivial matter for them to update the system to be more SEO friendly with the proper redirects, and fix the XSS hole.
chandrika
03-18-2008, 02:29 PM
Thanks, I will contact the guy who wrote the scripts, I really appreciate your having pointed this out to me.
jimmyebaker
03-19-2008, 11:47 PM
Hi Chandrika.
I'm looking over your site now and I see something that immediately stands out.
You have two drop downs on your page that contain <option> tags with links in them. These links are missing a trailing '/' character. This missing character will cause your webserver to automatically do a 301 redirect to the correct link.
This is an example of what you have:
http://www.findallsorts.com/shopUSA/c/accessories
but it should actually be
http://www.findallsorts.com/shopUSA/c/accessories/
Your server responds to the first form with a 301 redirect to the URL with the '/' on the end of it. This is something that is commonly overlooked and sometimes costly. I'd start by correcting this and see if your skewed data clears up.
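If the links in those drop downs are generated by a script, the slash could be added in one place. Here is a rough Python sketch of the idea. with_trailing_slash is a hypothetical helper, and it assumes that any path whose last segment has no dot in it (such as /shopUSA/c/accessories) is a directory-style URL, which may not hold for every link on the site.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Append '/' to directory-style URLs so the server need not redirect."""
    parts = urlsplit(url)
    path = parts.path
    last_segment = path.rsplit("/", 1)[-1]
    # Assumption: a dot in the last segment means a file such as search.php
    # or 404.shtml, which must be left alone.
    if path.endswith("/") or "." in last_segment:
        return url
    # Rebuild the URL with the new path, keeping any query string intact.
    return urlunsplit(parts._replace(path=path + "/"))


print(with_trailing_slash("http://www.findallsorts.com/shopUSA/c/accessories"))
```

URLs that already end in a slash, and file-style URLs, come back unchanged, so it should be safe to run every generated link through it.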
Here's a useful SEO tool that will show you the response header that the servers are sending back to their clients:
SEO Tools - HTTP Head Request Viewer - 301 Redirect Test (http://www.metamend.com/seo-tools/http-header-request-viewer.html)
Example using this tool:
http://findallsorts.com/shopUSA/c/accessories --> 301 Moved Permanently
http://findallsorts.com/shopUSA/c/accessories/ --> 200 OK
Connection: close
Date: Thu, 20 Mar 2008 04:40:36 GMT
Server: Apache/1.3.37 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.4.7 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7a
Content-Type: text/html;charset=utf-8
Client-Date: Thu, 20 Mar 2008 04:40:42 GMT
Client-Peer: 75.127.68.78:80
Client-Response-Num: 1
X-Powered-By: PHP/4.4.7
Good luck. ;)
chandrika
03-20-2008, 03:46 AM
That is very interesting, I did not realise that. From browsing I had always got where I wanted to go without the /, so it had not occurred to me that it was a necessary part of the address. Thanks for that!
I have fixed the problem with the JavaScript box. It was exactly as Wige said, and it was quite easy to put right the input validation that was missing in some places.
Apparently the pages I am seeing in search results, which belong to scraper sites and don't actually exist, are derived from search results, and as Google indexes sitemap files, that is where the content related to my site has come from.
Unfortunately it's not possible to see what other content is on the pages because it is only presented to Googlebot; everyone else gets permission denied. But they seem to be redirects that show Google my site content and then hop straight to a store's website via their affiliate link.
I have made a few server changes so that IPs that make too many connections get blocked, hopefully that will stop some of it.
Thanks again,
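For anyone wondering what blocking IPs that make too many connections amounts to, the idea can be sketched as a sliding-window counter. This is only an illustration in Python and not the actual server change made here, which is not described in the thread. ConnectionLimiter, max_hits and window_seconds are all made-up names.

```python
from collections import defaultdict, deque


class ConnectionLimiter:
    """Allow at most max_hits connections per IP within window_seconds."""

    def __init__(self, max_hits: int, window_seconds: float) -> None:
        self.max_hits = max_hits
        self.window = window_seconds
        # Timestamps of recent allowed connections, kept per IP address.
        self.hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        recent = self.hits[ip]
        # Forget connections that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False  # too many recent connections: block this one
        recent.append(now)
        return True


limiter = ConnectionLimiter(max_hits=3, window_seconds=10)
print([limiter.allow("203.0.113.5", t) for t in (0, 1, 2, 3)])
```

The time is passed in explicitly so the behaviour is easy to check by hand. Real limits would need to be generous enough not to block legitimate crawlers such as Googlebot.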
chandrika
03-21-2008, 07:30 AM
Second, I noticed that any time a product is not found, either because the page no longer exists or because an incorrect product is entered, the visitor is redirected to a search page (that is the page exploited above). This redirection is done through a 302 temporary redirect.
Would there be any benefit to using a 301 redirect for that function, rather than a 302?
PageRank typically does not flow through 302 redirects. As a result, if you gain editorial links to a page that goes away, using a 302 redirect would cause all that link juice to go away. A 301 would maintain the link juice, and allow some PageRank to flow to the products listed on the not-found search page.
As far as I can tell, it is not known if 301 redirects transfer the full weight of the link compared to a direct link. Personally, I think it is probable because these redirects serve so many legitimate purposes. However, I have seen documentation from Google indicating that there is no flow of PR through the 302 redirects.
chandrika
03-21-2008, 10:55 AM
Thanks Wige. As you say, Google make a point about 302s, so I think I will change them to 301s, which is also closer to what they really are anyway.
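As a rough illustration of that change, here is a Python sketch of a not-found handler that sends the visitor to the search page with a 301 instead of a 302. missing_product_redirect is a hypothetical name, and the /shopUK/search.php?q= target is assumed from the search link earlier in the thread. The real CMS is PHP and will do this its own way.

```python
from http import HTTPStatus
from urllib.parse import quote_plus


def missing_product_redirect(product_name: str) -> tuple[int, dict[str, str]]:
    """Return the status code and headers for a product that no longer exists."""
    # Assumed target: the site's search page, with the product name as the query.
    location = "/shopUK/search.php?q=" + quote_plus(product_name)
    # 301 Moved Permanently rather than 302 Found (the temporary redirect).
    return HTTPStatus.MOVED_PERMANENTLY.value, {"Location": location}


status, headers = missing_product_redirect("2 Power")
print(status, headers["Location"])
```

The only difference from the current behaviour is the status code; the Location header stays the same.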