Web Pages receive a 404 error code



Flyinjs
11-15-2010, 04:07 PM
Hello,

Looking for some help with a CMS-built website. The "Index.php" page is the only one that all of the big three search engines pick up.

The other 36 pages are pulled from a database, but for some reason crawlers are getting 404 errors on them.

The pages are there when I pull up the home page. Can anyone please tell me what the problem could be?

Jean-Luc
11-15-2010, 05:04 PM
Hello,

It would be easier to help if you posted one of the URLs that return a 404 to crawlers.

Jean-Luc

Lorel509
11-17-2010, 08:43 PM
Do the URLs contain capitals, like the "Index.php" you wrote out? If so, that might be the problem.

Flyinjs
11-18-2010, 09:18 PM
Jean-Luc,

Here is one of the web pages that come up 404: wwwendlessmedscom/vet.html. I tried the broken link checker that you provided, and it still reports errors when the darn pages are really there.

Sorry, it will not let me send a URL, so I just took out the dots.

No, the URLs do not contain capitals. I looked your site over and was impressed!

wige
11-22-2010, 11:57 AM
My guess is that the server has a protection mode that blocks certain user agents, responding with a 404 error when the user agent matches something on a blacklist (or the system may use a whitelist instead). My advice, if this is a shared host, would be to contact the hosting company and have them check which bots are being blocked.
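
One way to test the user-agent theory is to request the same URL with a browser-style and a crawler-style User-Agent and compare the status codes. The sketch below is only an illustration: the function names and the two user-agent strings are assumptions chosen for the example, not anything the host or the CMS is known to check.

```python
import urllib.error
import urllib.request

# Example user-agent strings (assumptions for illustration; swap in your own).
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 6.1; rv:1.9.1) Gecko/20081201 Firefox/3.1",
    "crawler": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def status_for(url, user_agent, timeout=10):
    """Send a HEAD request with the given User-Agent and return the status code."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return err.code


def compare_user_agents(url, user_agents=USER_AGENTS):
    """Return a {label: status code} mapping, one entry per user agent."""
    return {label: status_for(url, ua) for label, ua in user_agents.items()}


def varies_by_user_agent(statuses):
    """True if at least two user agents received different status codes."""
    return len(set(statuses.values())) > 1


# Example use (performs real requests, so it is left commented out):
# statuses = compare_user_agents("http://www.endlessmeds.com/vet.html")
# print(statuses, varies_by_user_agent(statuses))
```

If every user agent gets the same 404, blocking by user agent becomes less likely and the cause is probably elsewhere, for instance in the CMS or the server configuration.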

williamc
11-22-2010, 12:02 PM
It is saying they are 404s because, surprise, they ARE 404s:

[root@delta1 ~]# lynx -head 'http://www.endlessmeds.com/vet.html'
HTTP/1.1 404 Not Found
Date: Mon, 22 Nov 2010 17:01:06 GMT
Content-Type: text/html
Connection: close
Server: Apache/Nginx/Varnish
Content-Length: 35798
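
The response above pairs a 404 status with a Content-Length of 35798 bytes. A browser renders whatever body it receives regardless of the status code, so one possibility (an assumption, not something the headers prove) is that the server sends a full page along with the 404, which would look fine to a visitor while crawlers record an error. A minimal Python sketch for recording both status and body size for a list of URLs follows; the function names and the generic User-Agent value are made up for the example.

```python
import urllib.error
import urllib.request


def status_and_body_size(url, timeout=10):
    """GET the URL and return (status code, number of body bytes received)."""
    # The User-Agent value here is a generic placeholder, not a real browser string.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # An error response can still carry a full HTML body; measure it too.
        return err.code, len(err.read())


def report(urls):
    """Return one summary line per URL: status, body size, and the URL itself."""
    lines = []
    for url in urls:
        status, size = status_and_body_size(url)
        lines.append(f"{status} {size:>8} bytes  {url}")
    return lines


# Example use (performs real requests, so it is left commented out):
# for line in report(["http://www.endlessmeds.com/vet.html"]):
#     print(line)
```

A 404 with a body of tens of kilobytes is worth comparing against the size of the site's normal pages and of its error page, to see which one is actually being served.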

williamc
11-22-2010, 12:05 PM
Even assuming it is an issue with UA, I tried:

lynx -head 'http://www.endlessmeds.com/vet.html' -useragent='Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1b2) Gecko/20081201 Firefox/3.1b2'

as well as an IE7 UA, with the same results. They are blocking something, but I don't see what. Time for a new hosting company or a new CMS.