WebProWorld
Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

#1  05-18-2007, 02:44 PM
Oman (WebProWorld Member)
The spider trap - tighter with recent Google updates?

Our web site uses .asp to generate thousands of dynamic pages. Because spiders are unable to read our dynamically generated pages, we've relied on a third-party product called XQASP. XQASP rewrites the query string into a URL that spiders can crawl. Back in '03, Google Answers advocated XQASP as a reliable tool for enabling Google's spiders to read and index dynamically generated pages.
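For readers unfamiliar with the idea, query-string rewriting can be sketched roughly as follows. This is only an illustration in Python: the `/name/value` path scheme and the function names `to_static_path` and `to_dynamic_url` are assumptions made up for the example, not XQASP's actual URL format or API.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode, urlsplit


def to_static_path(url: str) -> str:
    """Turn '/page.asp?a=1&b=2' into '/page.asp/a/1/b/2'.

    Hypothetical scheme for illustration; parameters with empty
    values are dropped by parse_qsl's default behaviour.
    """
    parts = urlsplit(url)
    segments = [parts.path.rstrip("/")]
    for name, value in parse_qsl(parts.query):
        segments.append(quote(name, safe=""))
        segments.append(quote(value, safe=""))
    return "/".join(segments)


def to_dynamic_url(path: str, script: str) -> str:
    """Reverse the mapping: rebuild the query string for `script`."""
    if path != script and not path.startswith(script + "/"):
        raise ValueError(f"{path!r} does not belong to {script!r}")
    rest = [s for s in path[len(script):].split("/") if s]
    if len(rest) % 2:
        # An odd number of segments means a name without a value.
        raise ValueError("path must contain name/value pairs")
    pairs = [(unquote(rest[i]), unquote(rest[i + 1]))
             for i in range(0, len(rest), 2)]
    return script + ("?" + urlencode(pairs) if pairs else "")
```

The point of the reverse function is that every rewritten URL must map back to a query the script can actually answer; a rewritten path that cannot be mapped back should be rejected cleanly rather than passed on to the page.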

Two weeks ago we noticed two things happen at the same time: 1) a PR drop on our site, and 2) in the diagnostics section of Google's webmaster tools, more than 2,000 unreachable URLs. The reason given for the unreachable URLs is a 500 internal server error.

Interestingly, none of the unreachable URLs listed by Google appear in our website's site map. Some, but not all, of the unreachable URLs show the rewritten query strings generated by XQASP.
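A comparison like this can be scripted so that it is repeatable each time the report changes. The sketch below is in Python and assumes both lists have already been exported as plain lists of URLs; the helper names `normalize` and `split_by_sitemap` are made up for the example.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Lower-case scheme and host, drop the fragment; keep path and query."""
    p = urlsplit(url.strip())
    query = f"?{p.query}" if p.query else ""
    return f"{p.scheme.lower()}://{p.netloc.lower()}{p.path or '/'}{query}"


def split_by_sitemap(unreachable, sitemap):
    """Return (in_sitemap, not_in_sitemap) for the unreachable URLs."""
    known = {normalize(u) for u in sitemap}
    in_sitemap, not_in_sitemap = [], []
    for url in unreachable:
        bucket = in_sitemap if normalize(url) in known else not_in_sitemap
        bucket.append(url)
    return in_sitemap, not_in_sitemap
```

URLs that land in the second list were discovered some other way, for example through links on the site's own pages or links from other sites, so that list is the one worth tracing back to its source.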

My programmer feels that all of these unreachable URLs are backlinks. I tend to disagree. I've spoken to XDE, the makers of XQASP, who have given a couple of suggestions for rewriting our code, but nothing specific to Google updates.

My question is this: have recent Google algorithm changes perhaps made XQASP obsolete by tightening the spider traps?

Jeff

#2  05-18-2007, 05:38 PM
incrediblehelp (WebProWorld 1,000+ Club, WebProWorld MVP)

Can you post some examples of unreachable URLs?

#3  05-18-2007, 06:09 PM
Oman (WebProWorld Member)

Certainly. The first three are URLs that resulted in 404 errors. Interestingly, when an order ID is listed, it's a -0-.

http://www.myfootshop.com/xq/ASP/Con...w1.asp?order=0

http://www.myfootshop.com/xq/ASP/Met...w_to_order.htm

http://www.myfootshop.com/xq/ASP/Met...w1.asp?order=0

The next three are what Google calls unreachable URLs, which resulted in a 500 internal server error.

http://www.myfootshop.com/searchresu...iabetes%20Care

http://www.myfootshop.com/detail.asp...ion=Ganglionic

http://www.myfootshop.com/searchresu...e=Biomechanics

Jeff

#4  05-18-2007, 06:22 PM
incrediblehelp (WebProWorld 1,000+ Club, WebProWorld MVP)

So do these URLs exist? Or do you want them to exist?

Remember, other websites can link to your website using any number of "false" URLs, and the spider will try to reach them at your website. That's dumb on the other website's part, but it doesn't mean anything is necessarily wrong with your website. What you have to make sure of is that your website is not producing these "false" URLs anywhere.

I see 500 errors reported in Google Webmaster Console for some larger websites I monitor, yet the pages work fine when I check them. Could it be a momentary outage at the host?

#5  05-18-2007, 06:49 PM
lpoulsen (WebProWorld New Member)
You need to run a link checker

The URLs that you are listing look like they really ARE broken, not just reported as broken by Google.
I ran the Xenu link checker against your site, and it reports MANY broken links. There seem to be three classes of these:

- some are links with URLs that are not actually working. This would indicate bugs in the scripts that generate the pages.

- many are reported by the link checker as "no connection". Some of these recover when I run a second pass to "retry broken links". These may indicate capacity problems in the server: When the crawler fires off back-to-back requests, some of them may get lost.

- finally, you have a few dozen external links to URLs that seem not to be working anymore.

I heartily recommend running a link checker on your site every week.

For xenu, see http://www.snafu.de/~tilman/#xenu
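
To make the distinction between "broken" and "no connection" concrete, here is a rough Python sketch of the kind of classification a link checker performs, including a second pass for links that failed to connect. It is not how Xenu works internally; the function names `fetch_status` and `classify` and the report categories are assumptions made up for the example.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0):
    """Return the HTTP status code, or None if no connection could be made."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with a 4xx/5xx code
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def classify(urls, fetch=fetch_status, retries=1):
    """Sort URLs into ok / broken / flaky / no_connection.

    A real checker should also pause between requests so that the
    checker itself does not overload the server.
    """
    report = {"ok": [], "broken": [], "flaky": [], "no_connection": []}
    for url in urls:
        status = fetch(url)
        attempts = 0
        while status is None and attempts < retries:
            status = fetch(url)  # retry only links that failed to connect
            attempts += 1
        if status is None:
            report["no_connection"].append(url)
        elif status >= 400:
            report["broken"].append(url)
        elif attempts > 0:
            report["flaky"].append(url)  # recovered on retry
        else:
            report["ok"].append(url)
    return report
```

Links that end up in "broken" point at script or content bugs, while a long "flaky" list is more suggestive of a server that cannot keep up with back-to-back requests.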

#6  05-18-2007, 06:52 PM
lpoulsen (WebProWorld New Member)

Oops, this was the URL that I meant to post for the (free as in beer) XENU link checker:
http://home.snafu.de/tilman/xenulink.html

#7  05-18-2007, 06:55 PM
deepsand (WebProWorld 1,000+ Club, WebProWorld MVP)

All six are indeed unreachable.

The first three yield a custom 404 page.

Of the second three, the first and third yield a default 500.100 error, reading

HTTP 500.100 - Internal server error: ASP error.
Internet Information Services

--------------------------------------------------------------------------------

Technical Information (for support personnel)

Error Type:
Microsoft JET Database Engine (0x80040E10)
No value given for one or more required parameters.
/searchresults_simple.asp, line 74

Browser Type:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)


Page:
GET /searchresults_simple.asp



while the second yields a custom page, no error code given, reading


Sorry...
This page is currently unavailable.

This may have occurred because the product or article is currently being edited. It's also possible that you followed a badly formed URL address from another website.

Return to our Home Page and use the search box in the upper right corner of the page. When searching by keyword, input only the first few letters of the word or phrase you are looking for to return more results.


Obviously, the problem is not with Google.
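
The JET message above ("No value given for one or more required parameters") often appears when a query is run with a parameter that never received a value, although a misspelt field name in the SQL can produce the same message, so the actual cause at line 74 would need checking. Assuming a missing query-string value is the cause, one way to avoid the 500 is to validate the request before touching the database. The sketch below is in Python rather than ASP, and the parameter name `keyword` and the function `handle_search` are hypothetical.

```python
from urllib.parse import parse_qs

REQUIRED = ("keyword",)  # hypothetical parameter name, for illustration only


def handle_search(query_string: str, required=REQUIRED):
    """Return (status, payload); never run the query with a missing value."""
    params = parse_qs(query_string)  # blank values are dropped by default
    missing = [name for name in required
               if not params.get(name, [""])[0].strip()]
    if missing:
        # Answer with a clean 404 instead of letting the database raise a 500.
        return 404, "Missing parameter(s): " + ", ".join(missing)
    values = {name: params[name][0].strip() for name in required}
    return 200, values  # safe to pass these to a parameterised query
```

With a check like this, a bare request such as GET /searchresults_simple.asp would return an ordinary "not found" response to a spider rather than an internal server error.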

#8  05-22-2007, 09:23 PM
Oman (WebProWorld Member)
Re: The spider trap - tighter with recent Google updates?

Hi and thanks for the replies. Sounds like I have my work cut out for me.

I took a brief look at XENU. I originally picked up the fact that there were broken links in my web site by using Google Tools. Does XENU offer any advantages above and beyond Google Tools? And does XENU (fingers crossed) report the web address of a back link that is inactive or inaccurate?

Jeff