WebProWorld
Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

  #1 (permalink)  
Old 11-17-2004, 04:14 PM
WebProWorld New Member
 
Join Date: Nov 2004
Location: San Diego
Posts: 2
supremeanime RepRank 0
Default Google not spidering HTML catalog

My site is in PHP, but we have an HTML catalog. It seems that Google only caches the PHP pages rather than going through the catalog.

I have spent a lot of time validating the HTML and feel that we are solid there.

When visitors first hit my site, they are in the HTML catalog until they hit "cart" or "search", which switches them to PHP pages.

Any feedback or advice why Google avoids my HTML catalog?

Thanks!
  #2 (permalink)  
Old 11-17-2004, 08:24 PM
cbp
WebProWorld 1,000+ Club
 
Join Date: Oct 2003
Posts: 4,938
cbp RepRank 1
Default

Could be:
1) Duplicate content filter/penalty?
2) Not enough PR for Google to crawl that deep into the site

CBP
  #3 (permalink)  
Old 11-18-2004, 01:17 PM
WebProWorld Member
 
Join Date: Feb 2004
Location: NY / AZ
Posts: 31
Tish RepRank 0
Default hmm

Looks like the scripts up top could be the problem. The spider will sometimes only go so far down before giving up when all it finds is scripting. Those are really long and take up many many characters worth of data before any HTML appears.
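
To see how much script sits in front of your markup, something like the following could help. This is only a rough Python sketch, not a description of how any spider actually works: the function name `script_weight` is made up for illustration, and it only counts characters.

```python
import re

# Matches every <script>...</script> block. External scripts referenced with
# src="..." are just a short tag, so they add very little to the count.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def script_weight(html: str) -> dict:
    """Report how many characters of a page are script blocks and how many
    characters come before the first <a href=...> link."""
    script_chars = sum(len(m.group(0)) for m in SCRIPT_BLOCK.finditer(html))
    first_link = re.search(r"<a\s[^>]*href", html, re.IGNORECASE)
    return {
        "total_chars": len(html),
        "script_chars": script_chars,
        "chars_before_first_link": first_link.start() if first_link else None,
    }
```

If `chars_before_first_link` is large and mostly accounted for by `script_chars`, moving the scripts into external files would bring the first link much closer to the top of the source.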
  #4 (permalink)  
Old 11-18-2004, 02:25 PM
WebProWorld Member
 
Join Date: Jul 2004
Posts: 49
strum4life RepRank 0
Default Deep Scanning

If you don't have enough link popularity, Google will only crawl so deep into your site. Try adding a link to your catalog from the home page.
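
One way to see how deep the catalog sits is to count the minimum number of clicks from the home page to each page. The sketch below assumes you already have a map of which page links to which (the `links` dictionary and the paths in the usage note are hypothetical); it says nothing about how deep Google is actually willing to go.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from `start` to each page.

    `links` maps a page to the pages it links to. Pages that cannot be
    reached from `start` are simply absent from the result.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

For example, with `{"/": ["/about.html"], "/about.html": ["/catalog/index.html"]}` the catalog index is two clicks away from `/`. Adding `/catalog/index.html` to the home page's list brings it down to one.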
__________________
Fantasy Blitz
  #5 (permalink)  
Old 11-18-2004, 07:06 PM
WebProWorld Member
 
Join Date: Oct 2003
Location: St. Louis
Posts: 30
lutenegger RepRank 0
Default

Link popularity is clearly hurting you. Google doesn't have a single indexed site that links to your homepage, which would explain your PageRank of 0. Relative link importance is big for Google.

Also, your homepage is basically a series of links and lacks real content, which is generally not a plus with any search engine, though that has less to do with Google's algorithm. My guess is that Google will only read so many characters before stopping. As suggested previously, I would put the link to your catalog high in the code and move your JavaScript into an include. It's a good idea from a site management standpoint as well.

Also, Google sees your pages as largely similar: only 13 of 309 show as distinct. It doesn't help that it thinks you have largely duplicate content.
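
For a rough idea of how alike two pages look once the template is counted, you could compare their visible text. This is a minimal sketch using Python's standard `difflib`; it is not how Google measures duplicate content (that is not public), and the helper names `visible_text` and `similarity` are made up here.

```python
import re
from difflib import SequenceMatcher


def visible_text(html: str) -> str:
    """Crude text extraction: drop script/style blocks and tags, collapse whitespace."""
    html = re.sub(r"<(script|style)\b.*?</\1\s*>", " ", html, flags=re.IGNORECASE | re.DOTALL)
    text = re.sub(r"<[^>]+>", " ", html)
    return " ".join(text.split()).lower()


def similarity(html_a: str, html_b: str) -> float:
    """Ratio between 0 and 1; 1.0 means the visible text is identical."""
    return SequenceMatcher(None, visible_text(html_a), visible_text(html_b)).ratio()
```

Catalog pages that share a long navigation block and differ only in a product name will score close to 1.0, which suggests giving each page more unique text (title, description, specs).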

On an unrelated note, you may want to make your search a little smarter, if possible. I just did a quick search for Inu-Yasha and got nothing, changed it to InuYasha and got one result, and only after drilling down was I able to find that the items were labeled Inu Yasha.
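
One simple way to make those three spellings match is to normalize both the query and the titles before comparing. The sketch below is in Python purely for illustration (the site itself is PHP), and `normalize_title` and `search` are hypothetical names, not part of any cart software.

```python
import re


def normalize_title(text: str) -> str:
    """Lowercase and drop everything except letters and digits, so that
    'Inu-Yasha', 'InuYasha' and 'Inu Yasha' all compare equal."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def search(query: str, titles: list[str]) -> list[str]:
    """Return the titles whose normalized form contains the normalized query."""
    key = normalize_title(query)
    return [t for t in titles if key and key in normalize_title(t)]
```

The same idea carries over to a database search by storing a normalized copy of each title alongside the original and matching against that column.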

I like the site, and I'll refer one of my friends who's a big fan of IY to you. I just think it needs some tweaking to get to where you want to be. Good luck.
  #6 (permalink)  
Old 11-18-2004, 07:26 PM
WebProWorld Member
 
Join Date: Jun 2004
Location: FtLaud, FL
Posts: 42
anablake RepRank 0
Default

hi supremeanime,

we ran into the same problem with our old shop layout. there was tons of javascript in the homepage (as well as frames layout, but we won't go there :) )

we pulled all the scripts out into .js files and only placed the reference for them into the pages. it worked. it seems that the spiders get exhausted on those scripts and just say bye bye.

have you checked your web logs to see where the spiders actually stop crawling your site? this could also give you a clue to the problem.
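
if your logs are in the Apache "combined" format (an assumption, adjust the pattern otherwise), a small Python sketch like this can tally which pages Googlebot requested and what status it got back. the function name `googlebot_hits` is made up for illustration.

```python
import re
from collections import Counter

# Assumed: Apache "combined" log format. Other servers need a different pattern.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count (path, status) pairs for requests whose user agent mentions Googlebot.

    Lines that do not match the assumed format are skipped.
    """
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[(match.group("path"), match.group("status"))] += 1
    return hits
```

pass it an open log file, for example `googlebot_hits(open("access.log"))` where the file name is whatever your host uses. pages that never show up, or that show up with a 404 or 500 status, are the places to look first.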

we have an html catalogue of our site as well as the php and google is spidering it on a regular basis. it is linked on every page from our company name. it now has page rank as well (amen.)

best of luck to you :)
  #7 (permalink)  
Old 11-19-2004, 03:58 PM
WebProWorld New Member
 
Join Date: Sep 2004
Location: At my desk
Posts: 16
PHPfan RepRank 0
Default Re: Google not spidering HTML catalog

Quote:
Originally Posted by supremeanime
My site is in PHP, but we have an HTML catalog. It seems that Google only caches the PHP pages rather than going through the catalog.
All web pages are HTML. PHP is just a means for generating HTML dynamically.
  #8 (permalink)  
Old 11-19-2004, 09:09 PM
WebProWorld Veteran
 
Join Date: Aug 2003
Location: Singapore
Posts: 716
edhan RepRank 3
Default Error on Special Deals

Hi

I tried clicking on Special Deals and got this error:
Warning: Cannot modify header information - headers already sent by (output started at /home/virtual/site1/fst/var/www/html/cart/specials.php:5) in /home/virtual/site1/fst/var/www/html/cart/referer.php on line 56

Warning: Cannot modify header information - headers already sent by (output started at /home/virtual/site1/fst/var/www/html/cart/specials.php:5) in /home/virtual/site1/fst/var/www/html/cart/include/get_language.php on line 86

Warning: Cannot modify header information - headers already sent by (output started at /home/virtual/site1/fst/var/www/html/cart/specials.php:5) in /home/virtual/site1/fst/var/www/html/cart/include/get_language.php on line 87

Could these errors be what is preventing the spider from indexing the pages?

Edward
  #9 (permalink)  
Old 11-24-2004, 04:33 PM
WebProWorld New Member
 
Join Date: Nov 2004
Location: San Diego
Posts: 2
supremeanime RepRank 0
Default

Thanks for all the feedback! I know now what I need to work on.

Do you ever really finish building a website?
All times are GMT -4.