The spider trap - tighter with recent Google updates?
Our web site uses .asp to generate thousands of dynamic pages. Because spiders could not read our dynamically generated pages, we've relied upon a third-party product called XQASP (http://xde.net/xq/tool.xqasp-deep-web/qx/index.htm). XQASP rewrites the query string into a URL that spiders can crawl. Google Answers (https://answers.google.com/answers/threadview?id=196839#comments) advocated the use of XQASP back in '03 as a reliable tool for enabling Google spiders to read and index dynamically generated pages.
Two weeks ago two things happened simultaneously: 1) a PR drop on our site and 2) in the diagnostic section of Google's webmaster tools, we now have over 2000 unreachable URLs. The reason given for the unreachable URLs is a 500 - internal server error.
Interestingly, none of the unreachable URLs listed by Google are to be found in our website's site map. Some, but not all, of the unreachable URLs show the rewritten query strings generated by XQASP.
My programmer feels that all of these unreachable URLs are backlinks. I tend to disagree. I've spoken to XDE, the makers of XQASP, who have given a couple of suggestions for rewriting our code, but nothing specific to Google updates.
My question is this: have recent Google algorithm changes perhaps made XQASP an obsolete product by tightening the spider traps?
Jeff
incrediblehelp
05-18-2007, 04:38 PM
Can you post some examples of unreachable URLs?
Certainly. The first three are URLs that resulted in 404 errors. Interestingly, when an order ID is listed, it's a -0-.
http://www.myfootshop.com/xq/ASP/Condition.Diabetic%20Foot%20Care/qx/associates/cgi-bin/view1.asp?order=0
http://www.myfootshop.com/xq/ASP/Method.Category/Value.Callus%20Care%20Products/qx/how_to_order.htm
http://www.myfootshop.com/xq/ASP/Method.Category/Value.Foot%20Odor/Sweaty%20Feet/qx/associates/cgi-bin/view1.asp?order=0
The next three are what Google calls unreachable URLs that resulted in a 500 internal server error.
http://www.myfootshop.com/searchresults_simple.asp?method=conditionsubcategory&value=Diabetes%20Care
http://www.myfootshop.com/detail.asp?condition=Ganglionic
http://www.myfootshop.com/searchresults.asp?method=conditionsubcategory&value=Biomechanics
Jeff
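For readers unfamiliar with path-style rewriting, here is a minimal Python sketch of the idea. The `/xq/ASP/Key.Value/.../qx/page` layout is only inferred from the URLs listed above, not taken from XQASP documentation, and the function names are hypothetical. It also shows why a value containing a literal slash, as in the "Foot Odor/Sweaty Feet" URL, splits into an extra path segment unless the slash is percent-encoded. Whether XQASP or IIS accepts an encoded slash is not something this thread confirms.

```python
from urllib.parse import quote, unquote

def to_path_style(page, params):
    """Build a path-style URL from (key, value) pairs.

    The /xq/ASP/.../qx/ layout is an assumption inferred from the thread.
    """
    # safe="" also encodes "/" so a value cannot add path segments.
    segments = [f"{key}.{quote(value, safe='')}" for key, value in params]
    return "/xq/ASP/" + "/".join(segments) + "/qx/" + page

def from_path_style(path):
    """Recover (page, params) from a path built by to_path_style."""
    prefix, _, page = path.partition("/qx/")
    segments = prefix.split("/")[3:]  # drop "", "xq", "ASP"
    params = [tuple(unquote(seg).split(".", 1)) for seg in segments]
    return page, params

pairs = [("Method", "Category"), ("Value", "Foot Odor/Sweaty Feet")]
encoded = to_path_style("searchresults.asp", pairs)
print(encoded)
print(from_path_style(encoded))

# With the slash left unencoded, the value is cut in two and the
# third segment has no key at all.
raw = "/xq/ASP/Method.Category/Value.Foot%20Odor/Sweaty%20Feet/qx/searchresults.asp"
print(from_path_style(raw))
```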
incrediblehelp
05-18-2007, 05:22 PM
The next three are what Google calls unreachable URLs that resulted in a 500 internal server error.
http://www.myfootshop.com/searchresults_simple.asp?method=conditionsubcategory&value=Diabetes%20Care
http://www.myfootshop.com/detail.asp?condition=Ganglionic
http://www.myfootshop.com/searchresults.asp?method=conditionsubcategory&value=Biomechanics
So do these URLs exist? Do you want them to exist?
Remember that other websites can link to your website using any number of "false" URLs, and the spider will try to reach them at your website. Dumb on the other website's part, but it doesn't mean anything is necessarily wrong with your website. What you have to make sure of is that your website is not producing these "false" URLs anywhere.
I see 500 errors here and there in Google Webmaster Console for some larger websites I monitor, yet the pages work fine when I check them. Could it be a momentary outage at the host?
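One way a site can produce such "false" URLs itself is through relative links on rewritten pages: a browser or spider resolves them against the rewritten path, not the site root. The Python sketch below only illustrates that mechanism. The base page name `index.htm` is an assumption, and nothing in this thread confirms that this is what is happening on the site.

```python
from urllib.parse import urljoin

# Hypothetical rewritten page; the "index.htm" page name is assumed.
base = ("http://www.myfootshop.com/xq/ASP/Method.Category/"
        "Value.Callus%20Care%20Products/qx/index.htm")

# A relative link is resolved against the rewritten "directory".
relative = urljoin(base, "how_to_order.htm")

# A root-relative link ignores the rewritten path entirely.
root_relative = urljoin(base, "/how_to_order.htm")

print(relative)
print(root_relative)
```

The first result has the same shape as the second 404 URL posted above, so checking the page templates for relative `href` values may be worthwhile.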
lpoulsen
05-18-2007, 05:49 PM
The URLs that you are listing look like it is not just Google reporting them broken - they really ARE broken.
I ran the Xenu link checker against your site, and it reports MANY broken links. There seem to be three classes of these:
- some are links with URLs that are not actually working. This would indicate bugs in the scripts that generate the pages.
- many are reported by the link checker as "no connection". Some of these recover when I run a second pass to "retry broken links". These may indicate capacity problems in the server: When the crawler fires off back-to-back requests, some of them may get lost.
- finally, you have a few dozen external links to URLs that seem not to be working anymore.
I heartily recommend running a link checker on your site every week.
For xenu, see http://www.snafu.de/~tilman/#xenu
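For anyone who wants to script a rough version of the same check, here is a minimal Python sketch. It is not how Xenu works internally. It just requests each URL, records the status code, and gives "no connection" results a second pass, as described above. The function names are hypothetical, and some servers answer HEAD requests differently from GET, so treat the results as a first pass only.

```python
import urllib.error
import urllib.request

def fetch_status(url, timeout=10):
    """Return the HTTP status code, or None if no connection was made."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 404, 500 and so on
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout

def check_links(urls, fetch=fetch_status, retries=1):
    """Check every URL, then retry the ones that got no connection."""
    results = {url: fetch(url) for url in urls}
    for _ in range(retries):
        for url in [u for u, status in results.items() if status is None]:
            results[url] = fetch(url)
    return results

def broken(results):
    """Keep only URLs that failed to connect or returned 4xx/5xx."""
    return {u: s for u, s in results.items() if s is None or s >= 400}
```

Calling `broken(check_links(list_of_urls))` returns a dictionary of the failing URLs and their status codes. The `fetch` argument can be swapped out, for example to add a delay between requests so the server is not hit back-to-back.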
lpoulsen
05-18-2007, 05:52 PM
Oops, this was the URL that I meant to post for the (free as in beer) XENU link checker:
http://home.snafu.de/tilman/xenulink.html
deepsand
05-18-2007, 05:55 PM
Certainly. The first three are URLs that resulted in 404 errors. Interestingly, when an order ID is listed, it's a -0-.
http://www.myfootshop.com/xq/ASP/Condition.Diabetic%20Foot%20Care/qx/associates/cgi-bin/view1.asp?order=0
http://www.myfootshop.com/xq/ASP/Method.Category/Value.Callus%20Care%20Products/qx/how_to_order.htm
http://www.myfootshop.com/xq/ASP/Method.Category/Value.Foot%20Odor/Sweaty%20Feet/qx/associates/cgi-bin/view1.asp?order=0
The next three are what Google calls unreachable URLs that resulted in a 500 internal server error.
http://www.myfootshop.com/searchresults_simple.asp?method=conditionsubcategory&value=Diabetes%20Care
http://www.myfootshop.com/detail.asp?condition=Ganglionic
http://www.myfootshop.com/searchresults.asp?method=conditionsubcategory&value=Biomechanics
Jeff
All 6 are indeed unreachable.
The 1st 3 yield a custom 404 page.
Of the 2nd 3, the 1st and 3rd yield a default 500.100 error, reading
HTTP 500.100 - Internal server error: ASP error.
Internet Information Services
--------------------------------------------------------------------------------
Technical Information (for support personnel)
Error Type:
Microsoft JET Database Engine (0x80040E10)
No value given for one or more required parameters.
/searchresults_simple.asp, line 74
Browser Type:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Page:
GET /searchresults_simple.asp
while the 2nd yields a custom page, no error code given, reading
Sorry...
This page is currently unavailable.
This may have occurred because the product or article is currently being edited. It's also possible that you followed a badly formed URL address from another website.
Return to our Home Page and use the search box in the upper right corner of the page. When searching by keyword, input only the first few letters of the word or phrase you are looking for to return more results.
Obviously, the problem is not with Google.
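A common way to keep requests like these from surfacing as 500 errors is to validate the query-string parameters before they reach the database and to answer with a 404 when they are missing or unrecognized. The sketch below is in Python rather than ASP, and it is an illustration only: the list of allowed `method` values is an assumption, and it does not claim to explain what line 74 of searchresults_simple.asp actually does.

```python
from urllib.parse import parse_qs

# Assumed list of search methods; the real site may accept others.
ALLOWED_METHODS = {"category", "conditionsubcategory"}

def validate_search_params(query_string):
    """Return (status, params): 200 with cleaned params, or 404 with None."""
    params = parse_qs(query_string)
    method = params.get("method", [""])[0].strip().lower()
    value = params.get("value", [""])[0].strip()
    if method not in ALLOWED_METHODS or not value:
        # Reject before any SQL is built, so the database never sees it.
        return 404, None
    return 200, {"method": method, "value": value}

print(validate_search_params("method=conditionsubcategory&value=Biomechanics"))
print(validate_search_params("method=conditionsubcatego ry&value=Diabetes%20Care"))
print(validate_search_params(""))
```

With a check like this, a malformed link from another site gets the custom 404 page instead of a raw database error.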
Hi and thanks for the replies. Sounds like I have my work cut out for me.
I took a brief look at XENU. I originally picked up the fact that there were broken links in my web site by using Google Tools. Does XENU offer any advantages above and beyond Google Tools? And does XENU (fingers crossed) report the web address of a back link that is inactive or inaccurate?
Jeff