I happened to notice this thread quite by accident (I was in the breakroom and read the rant), and I think I might have at least a semi-reasonable way to solve it.
I apologize in advance for the length of this, but I wanted to post my idea, my prediction, and my logic (which seems to be, in at least some cases throughout this thread, lacking).
Here's the idea, in a nutshell:
- Pick out a key phrase (or two or three if you want to be really sure) that has relatively low competition (say, 10,000-100,000 results).
- Create a single static page optimized for said keyphrase.
- Do not submit the page to Google, but rather to directories, indices, forums with it in your sig, whatever. Basically, get IBLs for the page, and do so steadily for a long enough period of time to gain an accurate trend analysis (I'd use 3 months myself.)
- Do not change any aspect of the page (you shouldn't have to if it's optimized anyway, but I digress).
This would create a controlled experiment with only two variables:
- The number of IBLs Google has to play with and calculate PR from (which is the important variable).
- The number of competing results, along with whatever the owners of those pages do to earn their rankings: altering their pages, adding IBLs, and raising the PR of those IBLs (which you really can't do anything about, but if you run an aggressive enough IBL campaign of your own, then the effect of this behaviour would be relatively nullified.)
Here are my predictions as far as what you'll find:
- If you get enough good IBLs, you'll end up in Google within 2-3 weeks (in some cases, I've seen exactly 21 hours for new pages, but 2-3 weeks is a conservative estimate.)
- There may be a few day-over-day drops in ranking for the keyphrase(s) mentioned above, but the ranking for the keyphrase(s) will climb steadily as a whole over the 2-3 month period, as opposed to one massive jump.
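For anyone who wants to run the trend analysis on their own numbers, here's a minimal sketch in Python. It assumes you write down the page's position for the keyphrase once a day (1 = top of the SERPs); the function name and the input format are my own invention, not anything Google provides.

```python
def ranking_trend(daily_positions):
    """Return (slope, drops) for a list of daily SERP positions.

    slope: least-squares change in position per day. Negative means the
           page is climbing, since a smaller position number is better.
    drops: number of days on which the position got worse than the day before.
    """
    n = len(daily_positions)
    if n < 2:
        raise ValueError("need at least two days of data")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_positions) / n
    # Ordinary least-squares slope with the day index as x.
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(daily_positions))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    drops = sum(1 for prev, cur in zip(daily_positions, daily_positions[1:])
                if cur > prev)
    return numerator / denominator, drops
```

If my prediction holds, three months of data should give a clearly negative slope with only a handful of drops, rather than a flat line followed by one big jump.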
tomzo, I took a look at your sites and based on them, I don't see any programming. So I'm going to make the assumption that you're not a programmer. And if you are a programmer, I'm sure there's at least one person reading this thread who isn't, and who would find my working theory on how Google works (based on how database-driven sites in general work) useful.
Important database-driven website rule that will come into play later on in this explanation: Open/close the database as few times as possible. In other words, open up your database, read/edit/add/delete your information, close the database, and destroy any objects that may have been used in this process and thus free up the RAM used to open and do your database work.
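For the non-programmers, here's roughly what that rule looks like in practice. This is only a sketch using Python's built-in sqlite3 module; the table name, column names and function name are made up for illustration.

```python
import sqlite3
from contextlib import closing

def record_pages(db_path, pages):
    """Open the database once, do all the work, then close it.

    pages is a list of (url, title) tuples. Returns the row count afterwards.
    """
    # closing() makes sure conn.close() runs even if something fails,
    # which releases the connection and the memory behind it.
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
        )
        # One batched statement over one connection, instead of
        # opening and closing the database once per row.
        conn.executemany(
            "INSERT OR REPLACE INTO pages (url, title) VALUES (?, ?)", pages
        )
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
```

The bad version of this would call sqlite3.connect() inside a loop, once for every page.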
Now...how does this apply to Google? Well...Google, being a search engine, has a database (or to put it more accurately, a series of them on a series of servers, but for simplicity's sake let's assume it's one gigantic monolithic thing) containing two tables with fields that are structured in a similar manner to the following:
Table 1
The URL of the page indexed
The Page info (meta tags, title, content, if the page is blacklisted, if the domain is blacklisted, if the IP or IP block is blacklisted, etc.)
Table 2
The URL of the page indexed (or another unique identifier joining it with table 1)
The URL of the IBL (or similar unique identifier)
The PR assigned by said IBL.
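To make the two tables concrete, here's a sketch of them in SQLite via Python. Every table and column name here is my own guess for illustration; nobody outside Google knows what their real schema looks like.

```python
import sqlite3

SCHEMA = """
CREATE TABLE pages (              -- Table 1: one row per indexed page
    url         TEXT PRIMARY KEY,
    title       TEXT,
    meta        TEXT,
    content     TEXT,
    blacklisted INTEGER DEFAULT 0
);
CREATE TABLE inbound_links (      -- Table 2: one row per IBL
    target_url TEXT NOT NULL REFERENCES pages(url),
    source_url TEXT NOT NULL,     -- the page the link comes from
    pr_passed  REAL NOT NULL,     -- PR assigned by that IBL
    PRIMARY KEY (target_url, source_url)
);
"""

def build_index():
    """Create an empty in-memory index with both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def internal_pr(conn, url):
    """Add up the PR passed by every IBL pointing at url (0 if none)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(pr_passed), 0) FROM inbound_links"
        " WHERE target_url = ?",
        (url,),
    ).fetchone()
    return row[0]
```

With this layout, a page's internal PR is just a sum over its rows in the second table, which is the assumption the rest of my theory leans on.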
Now that we've got our structure in place, let's see what happens when a new page is added to Google (or one is updated), step by step:
1. The page contents are read by Google, and the appropriate information is placed into Table 1 (URL, etc.)
2. The links are extracted (which is easy enough to do by finding all the a tags and reading the href attribute of each.)
3. For each link, Table 2 is updated with the new PR from the IBL. This is done one of two ways: in the case of a new IBL, a new record is created. In the case of an existing IBL, the PR is updated.
4. In the case of an updated page, any links that were on the old page and removed from the new one will be in turn removed from Table 2, and their PR will no longer be added in.
5. The links from the new (or updated) page are traversed in turn (i.e. Google spiders the links), and Steps 1-4 from above are repeated. In the case of reciprocal links, there is a series of "back-and-forth" iterations between the two pages that Google will go through before it eventually stops and says "I've had enough" (I believe this number is 50, but I'm not sure.)
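The link-handling steps above (extract the links, add or update their records, remove the stale ones) can be sketched in Python like this. I'm using a plain dictionary to stand in for Table 2, and pr_per_link is a placeholder for however Google decides how much PR a link passes; both are assumptions on my part. The spidering step is left out.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href of every a tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

def reindex_page(link_table, source_url, html, pr_per_link):
    """Update link_table, which maps (target_url, source_url) to PR passed."""
    targets = extract_links(html)
    # Links that used to be on this page but are gone now get removed,
    # so their PR is no longer added in.
    stale = [key for key in link_table
             if key[1] == source_url and key[0] not in targets]
    for key in stale:
        del link_table[key]
    # New links get a new record; existing links get their PR updated.
    for target in targets:
        link_table[(target, source_url)] = pr_per_link
    return sorted(targets)
```

Calling reindex_page() again with a newer copy of the same page handles both the "new IBL" and "removed IBL" cases in one pass.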
Now, based on those steps above, we can clearly see that the PR would be updated continuously as Google traverses new (and updated) pages. If a page were indexed, and then every X number of days/weeks/months the database record containing the page was reopened and the PR was calculated separately, this would create an extra database call, as well as an extra trip for the spiders to traverse links, and that would be inefficient.
So, by constantly internally updating PR, the PR of a site stays fresh, the IBLs are taken into consideration, and the SERPs stay (relatively) relevant.
Now...why doesn't this information show externally if it's updated on a continuous basis?
The answer to this is relatively simple. By revealing this information, Google would give "get-rich-quick" SEOs a way to measure the quality of their campaigns simply by watching the increase/decrease/disappearance of PR, thus making an SEO's job easier. And Google, in its rather pandering "don't be evil" approach, will not exactly go out of its way to make an SEO's job easier.
So what's the external PR, and where does it come from?
As we all know, the Google external PR is a 0-10 logarithmic curve designed to show the PR of a site. This would give an online marketer an approximate indicator of whether or not (s)he would be justified in trying to gain either an IBL or a reciprocal link from said website.
In doing so, Google acknowledges that an IBL from an "authority" site is worth more than one from JoeBob's Personal Dancing Jesus and 50,000 Animated GIFs on a Rainbow Background for No Apparent Reason Site. But it doesn't go so far as to say "how much weight", since doing so would simply create chaos among the top sites (which it probably has already, but it would be even worse by showing internal PR).
So...why doesn't this get updated that often? Based on the logic above, it would appear that the external update to the PR would be best served when the internal PR is updated, since it would be the same database call. Or would it?
No. It wouldn't. The difference between internal and external PR is that internal PR is open-ended (since there are nearly no limits to the number of IBLs a site can gain) whereas external PR is closed-ended (0-10), likely based on the internal PR (where the site with the highest internal PR would be 10 and the logarithm would calculate accordingly).
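Here's one way that mapping could work, sketched in Python. The formula is purely my assumption (Google has never published how the 0-10 number is derived); the only properties I'm relying on are that the top site scores 10 and everything else falls on a logarithmic curve below it.

```python
import math

def external_pr(internal, highest_internal):
    """Map an open-ended internal PR (>= 0) onto a closed 0-10 scale.

    highest_internal is the internal PR of the top site in the index.
    The log1p-based formula is a guess, used here for illustration only.
    """
    if highest_internal <= 0:
        return 0
    ratio = math.log1p(internal) / math.log1p(highest_internal)
    # Cap at 1.0 so nothing can score above the top site.
    return round(10 * min(ratio, 1.0))
```

Notice that highest_internal appears in every single calculation. Whenever the top site's internal PR moves, every other page's external PR is potentially affected too.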
Now...here's where the important database rule comes into play. Let's say for argument's sake that the site with the highest internal PR is amazon.com. Amazon.com is forever gaining IBLs due to its affiliate program, people putting links on journal sites, blogs, news sites, directories, and about a zillion other ways. These links are being traversed by Google all the time. Naturally, this means amazon.com's PR goes up every time a new IBL is discovered as well.
Let's say that Google were to update the external PR of every page in the Google database, based on the new PR of amazon.com, every single time it gets a PR increase. A simple search of the word "a" on Google reveals approx. 2,810,000,000 (that's 2.8 BILLION) results. And there are probably more pages out there that don't contain the word "a".
amazon.com gets quite a few IBLs each day. Imagine if they got some combination of IBL PR from new URLs or IBL PR increases from existing URLs totalling 100 per day. (Based on amazon.com's reach, it's realistically a LOT more, but I wanted to make a conservative estimate). That means 100 times a day, over 2.8 billion records would need to have their external PR updated, based on amazon.com's new PR.
For those who don't know, updating that many records at a time is a serious resource drain, no matter what database you're using.
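To put a number on that, here's the arithmetic as a few lines of Python. The page count and the 100-changes-a-day figure are the same rough estimates I used above, and the 90-day cycle is just a stand-in for "roughly every three months".

```python
PAGES_INDEXED = 2_810_000_000      # results for the word "a", as a floor
PR_CHANGES_PER_DAY = 100           # conservative guess for the top site

def row_updates(pages, full_recalcs):
    """Each full recalculation touches every page's external PR once."""
    return pages * full_recalcs

def updates_saved_factor(changes_per_day, days_between_updates):
    """How many times fewer row updates a periodic refresh needs."""
    return changes_per_day * days_between_updates

# Recalculating on every change: 281,000,000,000 row updates per day.
continuous_per_day = row_updates(PAGES_INDEXED, PR_CHANGES_PER_DAY)

# Recalculating once per 90-day cycle instead of on every change.
factor = updates_saved_factor(PR_CHANGES_PER_DAY, 90)
```

So the continuous approach means 281 billion row updates every day, while one refresh per 90 days does the same cosmetic job with 9,000 times fewer.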
Based on our important database rule above, we want to do this as little as possible while still keeping things up to date. So what makes sense? An update every 3-4 months seems reasonable enough to get the external update done. It's basically cosmetic at this point anyway.
To summarize:
- Internal PR updated continuously.
- External (hereafter to be referred to by me as cosmetic) PR updated infrequently to save strain on Google hardware.
Again, I'm basing this on what, as a programmer of some experience, I feel would make the most sense.