Google Discussion Forum
Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.
I went to uptimebot, entered my site, and found that Google had indexed only a couple hundred of my pages. The web site is a database of private schools and I have almost 30,000 pages on the site. The site has been up for almost 5 years. Can anyone tell me why the entire contents have not been indexed after so long?
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
||||
|
Travis, you've got a few issues here:

1) Your problem is at least partly being caused by invalid code. http://validator.w3.org/check?uri=ht...m%3Fstate%3DAL
You got yourself a mess of bad code there, partner. And that's at the state level. Your city level is better, but still has some problems. http://validator.w3.org/check?uri=ht...doctype=Inline
If I were to guess at any ONE thing (and I don't think ONE thing will solve it), I'd say it's your double-declaration of the <body> tag. Your detail pages are better again, from what I can tell, but still have some minor issues. The more you can solve, the easier it will be to crawl. http://validator.w3.org/check?uri=ht...doctype=Inline

2) If that doesn't work, try running a little Bruce Clay magic on the pages of your site. He doesn't seem to like your state pages: http://www.seotoolset.com/cgi-bin/kd...hEngine=Google

You can run with it from there, I assume. If this doesn't solve things, you should be a lot closer and usually more things tend to reveal themselves.
__________________
Toronto Web Design | Search Engine Friendly, Standards-Compliant Layouts | Walk on my Path (my blog) |
|
|||
|
Quote:
Brian.
__________________
ToolBarn.com, an Internet Retailer Top 500 and Inc. 500 Company | Tool Parts | Pet Supplies |
|
|||
|
nagarjuna55334 wrote:
I think it can be possible by increasing the crawl time by writing the code in the robots.txt
User-agent: googlebot
Crawl-delay: 120

Are you saying that I have too long a delay in my robots.txt file? Maybe I should decrease it?
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
|||
|
ADAM Web Design wrote:
1) Your problem is at least partly being caused by invalid code.

I see your point. However, I put in a URL from my site that has been indexed and it came up with the same number of errors. So I am at a loss to understand why the full site has not been indexed after 5 years. I can understand that my code makes it more difficult to index but surely 5 years is enough.
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
|||
|
Adam,
I am now more confused than ever. I went to the SEO tools site you recommended, ran a server validation, and used the server page tool. It said that I have no robots.txt file. Below that, the only result that came back under all three was:

Spider Input 1:
Spider Input 2:
Spider Input 3:
Spider Input 4: <base href="http://www.eschoolsearch.com/">
Spider Input 5:
Spider Input 6:
Spider Input 7:
Spider Input 8:
Spider Input 9:
Spider Input 10:
Spider Input 11:
Spider Input 12:
Spider Input 13:
Spider Input 14:
Spider Input 15:
Spider Input 16:
Spider Input 17:
Spider Input 18:
Spider Input 19:
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
|||
|
This is very interesting. I don't mean to hijack my own topic but I went to that validator and put in the Yahoo home page and they have 285 errors on their home page!!!
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
||||
|
Quote:
You seem to misunderstand the concept of "Crawl-delay". It is used to increase the time between spidering one page and the next by introducing a delay - this is done to reduce load on the server: http://www.ilovejackdaniels.com/deve...ts-txt-file/3/

It has nothing to do with the time a bot takes to spider your pages.

faglork
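To make the Crawl-delay point concrete, here is a minimal Python sketch that parses the two directives quoted earlier in the thread with the standard library's `urllib.robotparser`. The helper name `crawl_delay_for` and the arithmetic are illustrative assumptions; the 30,000 page count is the rough figure from the first post, and the estimate only applies to a crawler that actually honors the directive.

```python
from urllib.robotparser import RobotFileParser

# The directives quoted earlier in the thread.
ROBOTS_TXT = """\
User-agent: googlebot
Crawl-delay: 120
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


delay_seconds = crawl_delay_for(ROBOTS_TXT, "googlebot")

# Rough estimate: one request every delay_seconds for roughly 30,000 pages
# (the page count is taken from the first post; the rest is plain arithmetic).
PAGE_COUNT = 30_000
days_for_full_pass = delay_seconds * PAGE_COUNT / 86_400

print(f"Crawl-delay: {delay_seconds} s between requests")
print(f"One full pass over {PAGE_COUNT} pages: about {days_for_full_pass:.1f} days")
```

The delay is the pause between consecutive requests, so a larger value slows a compliant crawler down rather than giving it more time per page.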
|
|||
|
I went to Google directly and put in:
site:eschoolsearch.com esearchforit
and 36,700 results came up instead of 440.
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
|||
|
It turns out that all of my pages are indexed, but most of them are in the supplemental index because they are so similar. I am not sure of the consequences of this.
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
|
|||
|
Invalid code is irrelevant to crawling, unless there is broken code that affects the links. For instance, I saw a page recently where the comment tag had been opened in the head, but it was never closed, so everything in the page was a comment.

I had a quick look at the site in your signature, and, unless I missed a link somewhere, I'm amazed that Google has indexed more than a dozen or so pages. They probably got most of the 200 from links pointing at them from other sites.

Your problem is that most of the pages aren't crawlable. They are hidden behind forms, and spiders can't fill forms in. You need to make paths for spiders to follow, so they can reach all the pages.

One way of doing it would be to add a directory as an alternative way for people to use the site. E.g. home page -> directory top (lists states) -> state pages (lists cities) -> city pages.

Another way would be to add an alphabar to the homepage, so that people can click a letter to get a list of cities that start with the letter. Clicking on one of those would return the list of schools in the city, and so on.

I may have missed something in the site, but I can't see any way of reaching any state, city, or detail pages without filling in the Search form, and spiders can't do that.

<added>
I've just seen your last 2 posts, and it looks like I've missed something in your site. Or maybe you've changed it to a form-only system since all the pages were crawled.
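The directory idea above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample school records, the `/directory/...` URL scheme, and the helper names `build_directory` and `reachable` are all hypothetical and not taken from the actual site. It builds directory top -> state -> city pages out of plain links, then walks those links the way a spider would to check that every page can be reached without a form.

```python
from collections import defaultdict, deque
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote

# Hypothetical sample records: (state, city, school name).
SCHOOLS = [
    ("AL", "Birmingham", "Example Academy"),
    ("AL", "Mobile", "Sample Day School"),
    ("OH", "Columbus", "Placeholder Prep"),
]


def link_list(links):
    """Render (text, href) pairs as a list of plain <a> links."""
    items = "".join(
        f'<li><a href="{escape(href)}">{escape(text)}</a></li>' for text, href in links
    )
    return f"<ul>{items}</ul>"


def build_directory(schools):
    """Return {path: html} for directory top -> state pages -> city pages."""
    tree = defaultdict(lambda: defaultdict(list))
    for state, city, name in schools:
        tree[state][city].append(name)

    pages = {"/directory/": link_list((s, f"/directory/{quote(s)}/") for s in sorted(tree))}
    for state, cities in tree.items():
        state_path = f"/directory/{quote(state)}/"
        pages[state_path] = link_list((c, f"{state_path}{quote(c)}/") for c in sorted(cities))
        for city, names in cities.items():
            # City pages list the schools; a real site would link each to its detail page.
            items = "".join(f"<li>{escape(n)}</li>" for n in sorted(names))
            pages[f"{state_path}{quote(city)}/"] = f"<ul>{items}</ul>"
    return pages


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, as a link-following spider would."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for key, value in attrs if key == "href" and value)


def reachable(pages, start="/directory/"):
    """Breadth-first walk over links only; returns the set of paths reached."""
    seen, queue = {start}, deque([start])
    while queue:
        collector = LinkCollector()
        collector.feed(pages[queue.popleft()])
        for href in collector.hrefs:
            if href in pages and href not in seen:
                seen.add(href)
                queue.append(href)
    return seen


pages = build_directory(SCHOOLS)
unreached = set(pages) - reachable(pages)
print(f"{len(pages)} pages generated, {len(unreached)} unreachable by links")
```

The same walk could be pointed at the home page of a form-only site to see how few pages a link-following crawler can actually discover.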
|
||||
|
As we say about private education in England - "You can buy privilege, but you can't buy brains".
Solution: Links. And a logical site structure will help solve the problem.
__________________
Simply Clicks | SEO | SEO Training| Pay Per Click Advertising | Search Engine Powered Marketing |
|
||||
|
1. Get rid of the crawl delay.
2. I still don't subscribe to page validation as a reason for not ranking. I have seen far too many invalid pages rank well.
3. Your lack of indexing probably has more to do with lack of content than anything else. Your website is simply a directory full of links leading to the contact info for the private school.
4. You seem to be indexed and ranking on Google fine: http://www.google.com/search?hl=en&q...=Google+Search http://www.google.com/search?hl=en&l...chools+in+ohio

Add some content, add some more content weekly (private school news blog?) and magic will happen.
|
|||
|
Thanks for the help here. I will work on the site and try to correct some of the problems.
__________________
"You miss 100% of the shots you never take" - Wayne Gretsky http://www.eschoolsearch.com |
WebProWorld is an iEntry, Inc. ® site - © 2009 All Rights Reserved Privacy Policy and Legal iEntry, Inc. 2549 Richmond Rd. Lexington KY, 40509 |