jonesethan
08-17-2007, 02:06 AM
These are essentially the steps by which the Google proxy hack works:
1. All is well; your website is XYZ Consulting (http://www.xyz.com) and it is currently listed in the top 10 in Google for 'xyz'.
2. A hacker comes along and decides that your listing for 'xyz' needs to be removed (perhaps for competitive reasons or out of spite). So the hacker gets Google to spider your website through a proxy. The address that Google would be given to index might look like this:
www.proxysrus.au/proxy/www.xyzname.com/
3. When Google indexes this new URL it looks legitimate, and Google's filters will soon recognize that the content being indexed is exactly the same as that of XYZ Consulting (http://www.xyz.com). As a result, in the cases that have arisen so far, XYZ Consulting loses its ranking and the freshly indexed proxy URL has effectively eradicated the competition.
So how is this hack technically accomplished?
Well, I am with Dan Thies on this one: I have no interest at all in sharing the specifics, because the last thing I want to do is enable more evil in the world. In addition, since I have never seen such a thing done, I can only postulate how it would be accomplished. That said, I think it is reasonable to share the problems that would need to be surmounted to make such a thing work:
1. When the proxy URL is requested, the server would have to give the search engines no reason to suspect that a proxy was delivering the information. This includes URL syntax, URL length, server header information and latency.
2. The proxy URL would have to appear authoritative.
3. The proxy would need to be able to thwart the proxy hacking prevention measures that Dan has laid out in his informative article. At this time it appears the only proxies having some success are the ones that strip all browsing information, so that the 'hack proof' sites cannot tell whether the traffic is legitimate or not. If they can't tell, they will not know to block Google from spidering their site through the wrong URL. Even that attack has been rebuffed by Dan and his team, who serve noindex and nofollow tags on their clients' sites UNLESS a verified search engine is visiting.
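For anyone wondering what "noindex and nofollow UNLESS a verified search engine is visiting" could look like on the defending site, here is a rough Python sketch. It is not Dan's actual code, which I have not seen. It assumes the common reverse-then-forward DNS check for crawlers; the function names, the hostname suffixes and the injectable lookup parameters are all my own choices for illustration.

```python
import socket

# Assumption: hostname suffixes treated as belonging to Google's crawlers.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if the IP reverse-resolves to a trusted hostname
    AND that hostname resolves forward to the same IP again."""
    # The lookups are injectable (hypothetical parameters) so the logic
    # can be exercised without touching real DNS.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(TRUSTED_SUFFIXES):
            return False
        # Forward-confirm, because anyone can fake a reverse DNS record.
        return ip in forward_lookup(host)
    except OSError:
        # DNS failure: treat the visitor as unverified.
        return False


def robots_meta_tag(ip, **lookups):
    """Meta tag to put in the page head for this visitor's IP."""
    if is_verified_crawler(ip, **lookups):
        return ""  # real crawler: let the page be indexed normally
    return '<meta name="robots" content="noindex, nofollow">'
```

The idea is that a request relayed by a proxy arrives from the proxy's IP, fails the check, and gets the noindex copy, so whatever the search engine fetches through the proxy URL should tell it not to index that URL. A request coming straight from a verified crawler gets the normal page.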