Quote:
Originally Posted by inertia
I meant noindex robots.txt directive. It just seems odd that Google or Matt C never mention it? Why would that be the case if it was such a useful tool that would benefit everyone including Google?
I guess Matt does not praise it because it is still only unofficially supported. If I recall correctly, Adam Lasnik talked about it a while ago. If I find the link I will post it here.
Quote:
Originally Posted by inertia
Can I just clarify... If I block a page in robots.txt with the NOINDEX directive, will that stop the page being crawled and indexed, and also stop it building or leaking PageRank?
If you block a page in robots.txt with the noindex directive, it will not stop Googlebot from crawling the page, but Google will not index it.
How is that different from the disallow directive?
If someone links to a page you block with the disallow directive in robots.txt, Google will still show a reference to it in its index, but without a snippet. And there you have a PR leak, since PR will be assigned to that page too.
If you block that page with the noindex directive, the reference will not show up at all. PR will not be assigned to that page, but it will still pass to the other pages you link to from there. For that to happen, the page must have at least one outbound link (internal or external) without a nofollow or anything similar, so the PR can move on.
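To make the comparison concrete, here is a rough Python sketch of the behaviour described above. It is only an illustration of what this post claims, not how Googlebot actually works: the robots.txt contents, the paths, and the helper names `parse_rules` and `describe` are all made up, and the `Noindex:` line uses the unofficial syntax discussed in this thread.

```python
# Hypothetical robots.txt; the paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
Noindex: /thin-page.html
"""

def parse_rules(text):
    """Collect Disallow and Noindex path prefixes (User-agent grouping is ignored)."""
    rules = {"disallow": [], "noindex": []}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field in rules and value:
            rules[field].append(value)
    return rules

def describe(path, rules):
    """Summarise, for one URL path, the behaviour claimed in this post."""
    disallowed = any(path.startswith(p) for p in rules["disallow"])
    noindexed = any(path.startswith(p) for p in rules["noindex"])
    return {
        "crawled": not disallowed,
        # A disallowed URL can still appear as a bare reference if someone links to it.
        "listed_in_index": not noindexed,
        "snippet_shown": not disallowed and not noindexed,
    }

rules = parse_rules(ROBOTS_TXT)
print(describe("/private/report.html", rules))
print(describe("/thin-page.html", rules))
```

The disallowed page comes out as not crawled but still listed without a snippet, while the noindexed page is crawled but not listed at all.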
If not, you will create a dangling node (a dead-end or hanging page), in other words, as Wige said above, a PR black hole.
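For anyone who wants to see the black-hole effect in numbers, below is a minimal Python sketch of the simplified textbook PageRank iteration. This is an assumption on my part and certainly not Google's real algorithm; the three-page site and the function name `pagerank` are hypothetical. In this naive version, rank that flows into a page with no outbound links is simply lost (real implementations usually redistribute it somehow).

```python
def pagerank(links, damping=0.85, iterations=50):
    """Naive PageRank: rank reaching a page with no outlinks is dropped."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            for target in outlinks:
                # Split this page's rank evenly over its outbound links.
                new[target] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

# Hypothetical three-page site: "page" links onward to "other".
with_link = {"home": ["page"], "page": ["other"], "other": ["home"]}
# Same site, but "page" has no followed outbound link (a dead end).
dead_end = {"home": ["page"], "page": [], "other": ["home"]}

total_with_link = sum(pagerank(with_link).values())
total_dead_end = sum(pagerank(dead_end).values())
print(total_with_link, total_dead_end)
```

With the outbound link in place the total rank stays at about 1, while in the dead-end version it falls well below 1, because whatever reaches "page" never flows back into the site.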