WebProWorld
#1
04-05-2004, 06:26 PM
WebProWorld Veteran
 
Join Date: Jun 2003
Location: Lexington, KY. USA
Posts: 316
Garrett
Google Indexes Document's First 100k

How big are your web pages? If you're creating especially long, text-heavy pages, you might consider breaking them up into smaller pieces. Into pieces of about 100k each, to be exact, according to GoogleGuy.

Mark Carey reported that GoogleGuy said, "we'll typically index the first 101K of a web page -- in practice, more content of a page can be indexed (e.g. PDFs), but if you keep your main content under 100K or so, that's the safest."

Remember that Google's not indexing your images along with the page (well, they are, but not in the same index as their web pages), so a page that's over 100k without them is enormous.

If your pages run over 100k without images you should find a way to break them up some. There's a good chance they're hard for your site visitors to navigate anyhow.

If you absolutely have to have more than 100k on a page, make sure the content you want indexed falls within the first 100k.
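For anyone who wants to check where a page stands, here is a minimal Python sketch. It assumes "101K" means 101 * 1024 bytes of UTF-8 encoded HTML, which is a guess and not something GoogleGuy specified. The function name `size_report` and its parameters are made up for illustration.

```python
LIMIT_BYTES = 101 * 1024  # assumption: "101K" read as 101 * 1024 bytes


def size_report(html_text, key_phrase, limit=LIMIT_BYTES, encoding="utf-8"):
    """Report the page size and whether key_phrase ends before the limit."""
    raw = html_text.encode(encoding)
    phrase_bytes = key_phrase.encode(encoding)
    offset = raw.find(phrase_bytes)  # -1 if the phrase is not on the page
    return {
        "size_bytes": len(raw),
        "over_limit": len(raw) > limit,
        "phrase_offset": offset,
        "phrase_within_limit": offset != -1 and offset + len(phrase_bytes) <= limit,
    }


if __name__ == "__main__":
    # Hypothetical page: 150,000 bytes of filler followed by the main content.
    page = "x" * 150_000 + "<h1>Main content</h1>"
    print(size_report(page, "Main content"))
```

In this made-up page the phrase starts well past the assumed limit, so `phrase_within_limit` comes back `False`. Moving the main content to the top of the markup would flip it.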
__________________
Garrett French
Editor, WebProNews.com
http://www.WebProNews.com
#2
04-05-2004, 07:55 PM
WebProWorld New Member
 
Join Date: Apr 2004
Posts: 15
Mark Carey

While the 101K limit has been known for some time, there is a debate about whether Google crawls beyond the 101K mark on a page. For example, suppose a page is 150K in size and consists mostly of links. Will Google simply stop crawling the page after 101K, and so not follow the links at the bottom of the page? Or does Google index only the first 101K but continue to follow the remaining links on the page? I have read claims on both sides of the debate, but have never tried to test it myself. The answer can have a significant impact on large sitemaps. Nobody cares if the entire 200K sitemap is indexed, but we certainly care that all of the links are crawled.
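One way to set up such a test is to first work out which links on a page sit past the 101K mark, then watch whether those target pages ever get crawled. The sketch below only does the first part. It uses Python's standard `html.parser` and assumes "101K" means 101 * 1024 bytes. The names `LinkCollector`, `collect_links` and `split_links_at_limit` are hypothetical, and nothing here says how Google itself behaves.

```python
from html.parser import HTMLParser

LIMIT_BYTES = 101 * 1024  # assumption: "101K" read as 101 * 1024 bytes


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def collect_links(html_text):
    parser = LinkCollector()
    # close() is deliberately not called: a tag cut in half by truncation
    # stays unparsed instead of being guessed at.
    parser.feed(html_text)
    return parser.links


def split_links_at_limit(page_bytes, limit=LIMIT_BYTES):
    """Return (links fully inside the first `limit` bytes, links after that)."""
    all_links = collect_links(page_bytes.decode("utf-8", errors="ignore"))
    head_links = collect_links(page_bytes[:limit].decode("utf-8", errors="ignore"))
    return head_links, all_links[len(head_links):]


if __name__ == "__main__":
    # Hypothetical sitemap: two links separated by a lot of padding.
    page = b'<a href="/top">Top</a>' + b" " * 120_000 + b'<a href="/bottom">Bottom</a>'
    head, tail = split_links_at_limit(page)
    print("within limit:", head)
    print("past limit:", tail)
```

If the pages in the second list show up in the index after a crawl, that would suggest the links past the mark are being followed.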
#3
04-06-2004, 04:10 PM
WebProWorld Veteran
 
Join Date: Feb 2004
Location: Lodz, Poland
Posts: 328
adore

When talking about sitemaps that big, there's another question: whether such a large number of links can be spidered at all. As you probably know, the suggestion is that there should be no more than 100 links on one page. Are the extra links spidered or ignored? It's difficult to say.
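If you'd rather not find out the hard way, one option is to split a big sitemap into several pages that each stay within the 100-link suggestion. A rough Python sketch, where `split_into_pages` and the example URLs are hypothetical:

```python
def split_into_pages(links, max_links=100):
    """Split a list of links into consecutive groups of at most max_links."""
    if max_links < 1:
        raise ValueError("max_links must be at least 1")
    return [links[i:i + max_links] for i in range(0, len(links), max_links)]


if __name__ == "__main__":
    # Hypothetical sitemap with 250 article links.
    links = [f"/article/{n}" for n in range(250)]
    pages = split_into_pages(links)
    print([len(page) for page in pages])  # prints [100, 100, 50]
```

Each group would then become its own sitemap page, linked from a short index page.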
__________________
http://www.twojecentrum.pl - Polish e-shopping center
http://dzwonki-loga.pl - Ringtones for mobile phones
#4
04-14-2004, 09:11 AM
WebProWorld New Member
 
Join Date: Mar 2004
Location: Poland
Posts: 19
Riklaunim

Some time ago (1-2 months) I made a simple HTML page that was something like a site map. I put links on it to all the articles on my site, like this:

- Category
{Blockquote here}-link: small descr.{/blockquote}
{Blockquote here}-link: small descr.{/blockquote}

There were more than 200 dynamic links. The page had only a Title and a Robots Index/Follow tag. It got listed on Google and some other search engines. And as I noticed, Google did follow those dynamic links, indexing links to forum categories etc. :)