Finding yet-to-be-crawled URLs



crankydave
03-21-2006, 03:45 PM
We know Google stores yet-to-be-crawled URLs in both the regular and the supplemental index. According to Matt Cutts (http://www.mattcutts.com/blog/googlebot-keep-out/#comment-18207):


Hmm. A more correct way to put it would be that there is a regular Googlebot and a supplemental Googlebot (though their user agents will be the same), and uncrawled urls from the regular Googlebot will go in the regular index while uncrawled urls from the supplemental Googlebot will go in the supplemental index. Hope that makes sense; I believe that’s correct.

We also know that yet-to-be-crawled URLs are displayed in the search results.

So where are the yet-to-be-crawled URLs from the supplemental index in the search results?

Where are the yet-to-be-crawled URLs from the supplemental index when using the site: operator?

Can someone please show me a single example of a yet-to-be-crawled URL from the supplemental index that is marked as "supplemental"?

C'mon folks, we know they are there.

No?

Perhaps they're not marked as supplemental.

Dave

crankydave
03-22-2006, 08:27 AM
In Matt's blog, somebody said that there are plenty of URL-only listings that are marked as supplemental, but I've never noticed one way or the other.

I saw that. However, the URL-only listings they were pointing out show a cache, which indicates they have already been crawled.

If I'm right, nobody can show me one.

Dave