Is somebody playing a Black Hat SEO trick on our site?



davidweb
03-14-2008, 10:00 AM
Hello all,

I have observed a very strange phenomenon on one of our sites.

We were checking which pages Google has crawled, using the following command:

site : ourwebsite . com

Everything looked fine, but we noticed one link:

ourwebsite . com/webpage.asp?ref=s0d.org <--- CULPRIT

My questions are as follows:

a) How could Google show this link when it is not present anywhere on our site?

b) If this link doesn't exist, where did it come from?

c) Is somebody trying to play some black hat SEO trick here?

d) Is there any way to avoid these issues in the future?


Please let me know how to resolve this issue :(

I am good at SEO, but a little weak in web development ;)

activeco
03-14-2008, 11:49 AM
d) Is there any way to avoid these issues in the future?

Be sure to return a 404 for similar requests.

incrediblehelp
03-14-2008, 02:00 PM
1. The operator you're using is not accurate; it just shows a sampling.

2. I can link to your website any way I want, and the link may get indexed. As activeco says, return a 404 and forget about such issues.

davidweb
03-14-2008, 03:30 PM
1. The operator you're using is not accurate; it just shows a sampling.

2. I can link to your website any way I want, and the link may get indexed. As activeco says, return a 404 and forget about such issues.

Thanks for the reply. However, we have observed this behaviour on several occasions.

Is there any way to stop these unknown referred links?

ourwebsite . com/webpage.asp?ref=s0d.org

Meanwhile, we now return a 404 for this particular webpage.asp URL, but I would like to know if there is any programming/hosting technique to prevent these unknown external parameters.

Tech Manager
03-16-2008, 12:15 PM
Thanks for the reply. However, we have observed this behaviour on several occasions.

Is there any way to stop these unknown referred links?

ourwebsite . com/webpage.asp?ref=s0d.org

Meanwhile, we now return a 404 for this particular webpage.asp URL, but I would like to know if there is any programming/hosting technique to prevent these unknown external parameters.


You could go to elaborate programming lengths to prevent external parameters, but why bother? If the link is just a referrer, it's not worth the effort. If it is a cross-site scripting attack, then it is an issue, but as long as you are properly validating your variables for allowed content and disallowing everything else, you should be fine.
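As a rough illustration of "validating for allowed content and disallowing everything else", here is a minimal Python sketch. The pattern and the function name is_valid_ref are assumptions made up for this example, not anything from the original poster's site: it assumes a legitimate "ref" value would be a short lowercase code.

```python
import re

# Hypothetical rule: a legitimate "ref" value is a short lowercase code.
# Anything else (dots, angle brackets, quotes, spaces) is rejected outright.
ALLOWED_REF = re.compile(r"[a-z0-9_-]{1,32}")

def is_valid_ref(value: str) -> bool:
    """Return True only if the whole value matches the allowed pattern."""
    return ALLOWED_REF.fullmatch(value) is not None
```

With this rule, a value such as "s0d.org" is rejected because of the dot, and so is anything containing markup. The point is to describe what is allowed rather than trying to list what is dangerous.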

activeco
03-16-2008, 01:40 PM
If the link is just a referrer it's not worth the effort.

The problem comes from the other side.
All such URIs actually come from Google's GWS server, which means you can feed Google (usually "google.com/url?g=...") with any URL that accepts query strings, and such a "valid" URL can very easily produce duplicate/canonical issues.
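To make the duplicate issue concrete, here is a small Python sketch (not tied to any particular server) showing that URLs differing only in an unknown parameter collapse to the same canonical address. KNOWN_PARAMS, canonical_url and the example.com host are assumptions invented for the illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: the only query parameter the page really uses.
KNOWN_PARAMS = {"id"}

def canonical_url(url: str) -> str:
    """Drop every query parameter the page does not know about."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in KNOWN_PARAMS
    ]
    # Rebuild the URL without the unknown parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

If the server answers 200 for both the original and the stripped form, a crawler sees two addresses with identical content, which is the duplicate problem described above.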



...as long as you are properly validating your variables for allowed content and disallowing everything else you should be fine.

True, but a 404 must be returned for all invalid requests.



Meanwhile, we now return a 404 for this particular webpage.asp URL, but I would like to know if there is any programming/hosting technique to prevent these unknown external parameters.

You should go to an IIS forum with that question (I assume you're on IIS).

For Apache 2.0+, something like this would serve the custom 404.php file (which itself sends the 404 header) for ALL query strings:

# Any request with a non-empty query string is routed to the custom 404 page
RewriteCond %{QUERY_STRING} ^.+
RewriteRule ^(.*) 404.php [L]

However, if you accept some GET/POST variables, you should accept only those.
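For anyone who prefers to do the "accept only those" check in application code, the decision might look like the following Python sketch. It is language-agnostic in spirit; ALLOWED_PARAMS and status_for_query are hypothetical names, and the parameter list is an assumption standing in for whatever the page really uses.

```python
from urllib.parse import parse_qsl

# Hypothetical allow-list: the only parameter names the page understands.
ALLOWED_PARAMS = {"id", "page"}

def status_for_query(query_string: str) -> int:
    """Return 200 if every parameter name is allow-listed, otherwise 404."""
    names = {
        name
        for name, _ in parse_qsl(query_string, keep_blank_values=True)
    }
    return 200 if names <= ALLOWED_PARAMS else 404
```

An empty query string passes, "id=5" passes, and "ref=s0d.org" gets a 404 because "ref" is not on the list. Unlike the blanket rewrite rule, this keeps legitimate dynamic pages working.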

Tech Manager
03-16-2008, 03:42 PM
I'll buy that.

NetProwler
03-17-2008, 02:03 AM
As long as your site returns a 404 error for any contrived URL, I don't see any reason to worry about this. If you plan to weed out the query string from your URLs, you may shoot yourself in the foot when you have dynamic pages without "rewriting".

wige
03-17-2008, 10:02 AM
If these URLs are definitely not legitimate, why not turn them to your benefit? Create an action page that has a small error text (the link no longer exists, etc.) and a strong call to action, and use a 301 redirect from the bad URL to the error page. This takes all the bad links, collects all the link juice being passed, and puts it into a single page on your site specifically designed to draw the passed traffic into your site. Additionally, the redirect should cause the erroneous URLs to be dropped from Google's index rather quickly. 404s may remain in the index for six months or more.
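A minimal sketch of that redirect idea, in Python for illustration only: requests carrying unknown parameters get a 301 to a single action page, everything else is served normally. ALLOWED_PARAMS, ERROR_PAGE, respond and the example.com host are all assumptions invented for this example.

```python
from urllib.parse import parse_qsl, urlsplit

ALLOWED_PARAMS = {"id"}               # hypothetical: parameters the site really uses
ERROR_PAGE = "/link-not-found.html"   # hypothetical action page with the call to action

def respond(url: str):
    """Return (status, location): 301 to the action page for unknown parameters."""
    query = urlsplit(url).query
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    if names - ALLOWED_PARAMS:
        # At least one parameter is not on the allow-list: redirect permanently.
        return 301, ERROR_PAGE
    return 200, None
```

Whether a 301 or a plain 404 is the better answer here is a judgment call that the thread itself debates; the sketch only shows the mechanics.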

activeco
03-17-2008, 10:45 AM
Additionally, the redirect should cause the erroneous URLs to be dropped from Google's index rather quickly. 404s may remain in the index for six months or more.
Wige, there is no sense in redirecting possibly thousands of non-existent pages. You can correct a few erroneous links that way, but it's not the proper way to fight bots.
Besides, no search engine indexes a URL that properly returns a 404 code. You are probably referring to a previously indexed page that becomes non-existent, in which case you're right.


As long as your site returns a 404 error to any contrived URL, I don't see any reason why you should worry about this.

A server does not return a 404 for a 'wrong' query string. If someone is concerned about it, (s)he may explicitly allow only accepted variables in a URL.
See this example (http://www.bestnetcraft.com/contact.html?url=wpw.com&user=activeco) using your site's URL.

I used the contact page in case something goes wrong.