WebProWorld
Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

  #1 (permalink)  
Old 08-28-2009, 08:26 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Lightbulb Robots.txt can pass PageRank?

I have a question based on the following scenario.

Let's say I have a robots.txt like this:
Code:
User-agent: *
Disallow: /file1.html
Disallow: /file2.html
Disallow: /folder1/
Disallow: /folder2/

Sitemap: http://www.mywebsite.com/sitemap.xml
As we know, Google follows the link to the sitemap.xml.

Then let's say I posted a link to my robots.txt on several forums, blogs, etc., and at some point my robots.txt has received PageRank, like for example the White House robots.txt or others.

Shouldn't the PageRank pass to the sitemap.xml file and from there also pass to the pages included in the sitemap?
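
For readers who want to see how a crawler-side library reads the file above, here is a minimal sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). It only shows that the Sitemap line is parsed as a directive alongside the Disallow rules. It says nothing about how Google handles PageRank, and www.mywebsite.com is just the placeholder host from the post.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the post, held in memory so nothing is fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /file1.html
Disallow: /file2.html
Disallow: /folder1/
Disallow: /folder2/

Sitemap: http://www.mywebsite.com/sitemap.xml
"""


def read_robots(text):
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = read_robots(ROBOTS_TXT)

# The Sitemap line is exposed as a plain list of URL strings.
sitemaps = parser.site_maps()

# Disallow rules answer "may I fetch this?", nothing more.
blocked = parser.can_fetch("*", "http://www.mywebsite.com/file1.html")
allowed = parser.can_fetch("*", "http://www.mywebsite.com/index.html")

print(sitemaps, blocked, allowed)
```

In this library the sitemap location is handed over as a string for the caller to fetch or ignore; whether any ranking signal travels with it is a separate question that the parser does not model.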
__________________
"Being an expert isn't telling other people what you know. It's understanding what questions to ask, and flexibly applying your knowledge to the specific situation at hand. Being an expert means providing sensible, highly contextual direction." Jeff Atwood
SEO Workers - Search Engine Optimization Consulting Company | SEO Analysis Tool | Webnauts Net SEO

Last edited by Webnauts; 08-28-2009 at 08:35 AM.
  #2 (permalink)  
Old 08-28-2009, 08:29 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,709
kgun RepRank 10
Default Re: Robots.txt can pass PageRank?

Doesn't this

How Google Finds Your Needle in the Web's Haystack

article answer that question?
  #3 (permalink)  
Old 08-28-2009, 08:31 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by kgun View Post
Doesn't this How Google Finds Your Needle in the Web's Haystack

article answer that question?
Let's keep the answer simple, man.

Yes or no?
  #4 (permalink)  
Old 08-28-2009, 08:36 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,709
kgun RepRank 10
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by Webnauts View Post
Shouldn't the PageRank pass to the sitemap.xml file and from there also to the pages included in the sitemap?
I see that as two questions. My answer to both is yes, unless Google handles these files differently from other files.
  #5 (permalink)  
Old 08-28-2009, 08:38 AM
WebProWorld Veteran
 
Join Date: May 2006
Location: ibiza
Posts: 386
kevsta RepRank 2
Default Re: Robots.txt can pass PageRank?

I thought it was only hyperlinks that pass PR?
  #6 (permalink)  
Old 08-28-2009, 08:40 AM
caravan's Avatar
WebProWorld Pro
 
Join Date: May 2006
Location: Preston, Lancashire, UK
Posts: 103
caravan RepRank 1
Default Re: Robots.txt can pass PageRank?

I understand what you're saying and it could be possible, but I'm just wondering if the bots would treat the link to the sitemap.xml differently, as a) it's the robots.txt file and b) it's not an HTML hyperlink. Does the White House robots.txt file show any PageRank in the toolbar?
  #7 (permalink)  
Old 08-28-2009, 08:41 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,709
kgun RepRank 10
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by kevsta View Post
I thought it was only hyperlinks that pass PR?
More precisely, hyperlinks on those documents may pass PageRank. A web page has a PageRank in the range [0, n].

Last edited by kgun; 08-28-2009 at 08:51 AM.
  #8 (permalink)  
Old 08-28-2009, 08:43 AM
WebProWorld New Member
 
Join Date: Nov 2006
Location: Belgium
Posts: 24
Laurentvw RepRank 0
Default Re: Robots.txt can pass PageRank?

No. People could be linking to the robots.txt to show that the website is hiding some pages, and whatnot.
Say I have one page on my site (the front page) plus robots.txt, and the only thing that's popular about the site is the robots.txt. That doesn't mean the front page is worth anything. Of course, this is theoretically speaking.
__________________
CJReport
  #9 (permalink)  
Old 08-28-2009, 08:44 AM
inertia's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Apr 2006
Location: Lancaster, UK
Posts: 1,022
inertia RepRank 6
Default Re: Robots.txt can pass PageRank?

Google doesn't "follow the link to the sitemap". It just acknowledges from the robots.txt standard for sitemaps that the sitemap is located here... It's not a proper indexable anchor, it's just text.
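
The "just text" point can be illustrated with a small sketch, assuming a parser that counts only <a href> anchors as links. The class and function names are hypothetical, and the URL pattern is a deliberately crude stand-in for whatever a real crawler uses to spot bare URLs.

```python
import re
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values from <a> tags only."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def anchors_in(text):
    """Return every anchor target found in the given markup."""
    collector = AnchorCollector()
    collector.feed(text)
    collector.close()
    return collector.hrefs


# Hypothetical pattern for bare URLs written as plain text.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def bare_urls_in(text):
    """Return URLs that appear as text, whether or not they are anchors."""
    return URL_PATTERN.findall(text)


ROBOTS_TXT = (
    "User-agent: *\n"
    "Disallow: /folder1/\n"
    "\n"
    "Sitemap: http://www.mywebsite.com/sitemap.xml\n"
)
HTML_PAGE = '<p>See <a href="http://www.mywebsite.com/sitemap.xml">the sitemap</a>.</p>'

print(anchors_in(ROBOTS_TXT))    # anchors found in robots.txt
print(bare_urls_in(ROBOTS_TXT))  # URLs found as text in robots.txt
print(anchors_in(HTML_PAGE))     # anchors found in the HTML page
```

The robots.txt yields a URL by text matching but no anchor at all, which is the distinction being drawn here: a URL can be discovered without being a hyperlink.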
__________________
Latest Blog Post: Google Consultant - Should this Job Title be Allowed? - Matt Inertia's SEO Blog - SEOers.org

"Carpe diem, seize the day boys, make your lives extraordinary"
- Dead Poets Society
  #10 (permalink)  
Old 08-28-2009, 08:44 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by kevsta View Post
I thought it was only hyperlinks that pass PR?
As far as I understand PR, yes. And in that case, the robots.txt should be a dead end or node page. Right?
  #11 (permalink)  
Old 08-28-2009, 08:45 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by inertia View Post
Google doesn't "follow the link to the sitemap". It just acknowledges from the robots.txt standard for sitemaps that the sitemap is located here... It's not a proper indexable anchor, it's just text.
I agree with that too.
  #12 (permalink)  
Old 08-28-2009, 08:48 AM
caravan's Avatar
WebProWorld Pro
 
Join Date: May 2006
Location: Preston, Lancashire, UK
Posts: 103
caravan RepRank 1
Default Re: Robots.txt can pass PageRank?

The White House robots.txt has a toolbar PR 5, so it's obviously being given some credit. I suppose the question now is whether it would pass it on. As a link to sitemap.xml is not a valid HTML hyperlink, I wouldn't have thought so.
  #13 (permalink)  
Old 08-28-2009, 08:50 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,709
kgun RepRank 10
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by inertia View Post
Google doesn't "follow the link to the sitemap". It just acknowledges from the robots.txt standard for sitemaps that the sitemap is located here... It's not a proper indexable anchor, it's just text.
That seems logical, and it makes things clearer.
  #14 (permalink)  
Old 08-28-2009, 09:03 AM
WebProWorld Veteran
 
Join Date: May 2006
Location: ibiza
Posts: 386
kevsta RepRank 2
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by Webnauts View Post
As far as I understand PR, yes. And in that case, the robots.txt should be a dead end or node page. Right?
Well yes, in the unlikely event of enough people linking to your robots.txt to give it PR, I can't see it helping very much anyway.
  #15 (permalink)  
Old 08-28-2009, 09:18 AM
Terry Van Horne's Avatar
WebProWorld Veteran
 
Join Date: Apr 2008
Location: Toronto On., Ca.
Posts: 471
Terry Van Horne RepRank 4
Default Re: Robots.txt can pass PageRank?

There has been a thought among SEOs that Google sees a text URL as a link; whether it passes PR is not known. What many don't get is that Google is looking for "citations", not links, and a text URL reference is a citation. If you don't know what a citation is... dictionaries and Google are quite helpful!

I check for IBLs by searching for the domain; that way I get all the citations. One of the clues for me was finding links in the index that were text, when the link: operator provided better results. The search also finds all the pages where the site is being mentioned, not just linked. I always find some bad shite that way.

How does a robots.txt get PR? From people linking to it. So it makes sense for the White House robots.txt to get that many links, because of what's in it and who it's for. For anyone else... IMO, it raises a big flag for spam. A few links maybe, but if you are linking to it from your own site... you're whacked or...

What the PR5 for the WH indicates is that Google does index these. I always assumed they were read for the crawler information and dropped. Why index it... it's not like there is any reason to, unless the links giving it PR cause Google to index it. Seems so.
__________________
Follow me on Twitter! On the Trail with SOSG How I became a Social Media Convert and Twitter and Agents of Influence and now regular poster at Cloudmixer where We're Mixing New Media Ideas.

Last edited by Terry Van Horne; 08-28-2009 at 09:22 AM.
  #16 (permalink)  
Old 08-28-2009, 09:48 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

You made some great points Terry.

And now my question is: how can robots.txt get PR assigned? If it does not pass PR somewhere, then the robots.txt is a node (dead end) page. Andy Beard explained that in another thread here: Dangling Pages

So what's next?
  #17 (permalink)  
Old 08-28-2009, 10:19 AM
wige's Avatar
Moderator
WebProWorld Moderator
 
Join Date: Jun 2006
Location: United States
Posts: 2,661
wige RepRank 9
Default Re: Robots.txt can pass PageRank?

Hm... next is the question of whether your sitemap can pass pagerank I suppose.

My guess would be no: since the sitemap format was created for the purpose of giving search engines an index, I could see Google intentionally not putting it into the PageRank algorithm the way they do for, say, RSS feeds.
__________________
The best way to learn anything, is to question everything.
  #18 (permalink)  
Old 08-28-2009, 10:22 AM
inertia's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Apr 2006
Location: Lancaster, UK
Posts: 1,022
inertia RepRank 6
Default Re: Robots.txt can pass PageRank?

Quote:
If you don't know what a citation is... dictionaries and google are quite helpful!
I'd be very surprised if anyone here doesn't know what a citation is, and even more surprised if they don't know how to find out!

I don't see how URLs simply written as text can be classed as PR-passing citations. Think about the potential for abuse there would be.
  #19 (permalink)  
Old 08-28-2009, 10:32 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by wige View Post
Hm... next is the question of whether your sitemap can pass pagerank I suppose.

My guess would be no: since the sitemap format was created for the purpose of giving search engines an index, I could see Google intentionally not putting it into the PageRank algorithm the way they do for, say, RSS feeds.
Good point, Wige. You were faster than me, because that was my next question.
The sitemap uses <loc>....</loc> for URLs. If what Terry mentioned above about citations is not accurate, then most probably Google would not pass PR.

About the RSS feeds issue, I prefer using RDF instead: http://www.seoworkers.com/index.rdf It is also included in Feedburner: http://feeds.seoworkers.com/seo-webdesign-ecommerce

Those links can definitely be followed by the bots, since <link>....</link> is used.
I hope that no one will disagree that Google understands XHTML 2.0.

So here is my next thought. Why not create a sitemap in RDF? I think I should give that a try. What do you think?
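
To make the <loc> point concrete, here is a minimal sketch of how a consumer might pull URLs out of a sitemap with Python's standard xml.etree.ElementTree. The sample document and its URLs are made up for illustration; the namespace is the one defined by the sitemaps.org protocol. It only shows that <loc> holds a URL as element text, and makes no claim about what Google does with it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical two-entry sitemap for the placeholder site.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.mywebsite.com/</loc></url>
  <url><loc>http://www.mywebsite.com/page1.html</loc></url>
</urlset>
"""


def sitemap_locations(xml_text):
    """Return the text of every <loc> element in document order."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    loc_tag = f"{{{SITEMAP_NS}}}loc"
    return [el.text.strip() for el in root.iter(loc_tag) if el.text]


print(sitemap_locations(SITEMAP_XML))
```

A <loc> value is a URL inside an element, with no anchor text and no href, so any link-like treatment would have to be a deliberate choice by the consumer.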

Last edited by Webnauts; 08-28-2009 at 10:34 AM.
  #20 (permalink)  
Old 08-28-2009, 10:38 AM
inertia's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Apr 2006
Location: Lancaster, UK
Posts: 1,022
inertia RepRank 6
Default Re: Robots.txt can pass PageRank?

@wige...

What's confusing me is this... Google has assigned PR to a .txt file, just like it assigns PR to a .pdf file. But isn't the reality that Google assigns PR to a URL rather than a file?

Google indexes all sorts of files, but are they selective in which files they assign PR to? So, for example, if I were to fire a load of links at domain.com/image.jpg, that URL wouldn't show a TBPR when called?

Or it could just be a case that we're only used to seeing URLs with inbound links as showing PR, but in actual fact PR can be assigned to any URL with inbound links, no matter what the file type is at the end?

Sorry if that makes no sense!!!

Last edited by inertia; 08-28-2009 at 10:42 AM.
  #21 (permalink)  
Old 08-28-2009, 11:04 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by inertia View Post
@wige...

What's confusing me is this... Google has assigned PR to a .txt file, just like it assigns PR to a .pdf file. But isn't the reality that Google assigns PR to a URL rather than a file?
Posed that way you are right. Maybe the wording was not correct. Maybe we should say accumulate?

Quote:
Originally Posted by inertia View Post
Google indexes all sorts of files, but are they selective in which files they assign PR to? So, for example, if I were to fire a load of links at domain.com/image.jpg, that URL wouldn't show a TBPR when called?
What Google claims on that page is not fully correct. I know that they also index .rdf files. Mine were indexed until I added a noindex via the X-Robots-Tag header in my .htaccess and requested their deletion via GWT.

Quote:
Originally Posted by inertia View Post
Or it could just be a case that we're only used to seeing URLs with inbound links as showing PR but in actual fact PR can be assigned to any url with inbound links, no matter what the file type is at the end?

Sorry if that makes no sense!!!
You are making a very good point Matt.

Last edited by Webnauts; 08-28-2009 at 11:07 AM.
  #22 (permalink)  
Old 08-28-2009, 11:05 AM
wige's Avatar
Moderator
WebProWorld Moderator
 
Join Date: Jun 2006
Location: United States
Posts: 2,661
wige RepRank 9
Default Re: Robots.txt can pass PageRank?

I think you are right on target - Google assigns PR blindly to the URL, regardless of what the destination file is. For example, if a file has no extension and is blocked with robots.txt, that file still accrues PageRank. This has been observed in the past when different bugs have allowed such pages to show up in the search results. However, don't forget that you find out what the TBPR of a page is by asking Google - they can choose not to respond, as it seems is the case for files that are blocked by robots.txt.

Also, I would not be surprised if an image does not collect PR unless there is a direct link to its URI. I can't see a reason why <img src= would pass PR, since it does not represent a location that the user can click through to.
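
The href versus src distinction can be sketched as follows, assuming a parser that treats <a href> as a navigational reference and <img src> as an embedded one. The class name, the two buckets, and the sample paths are hypothetical; this shows how a parser can tell the two apart, not how Google scores them.

```python
from html.parser import HTMLParser


class ReferenceSorter(HTMLParser):
    """Separate click-through links from embedded resources."""

    def __init__(self):
        super().__init__()
        self.navigational = []  # targets of <a href="...">
        self.embedded = []      # sources of <img src="...">

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            self.navigational.append(attr_map["href"])
        elif tag == "img" and attr_map.get("src"):
            self.embedded.append(attr_map["src"])


def sort_references(html):
    """Return (navigational, embedded) URL lists for the given markup."""
    sorter = ReferenceSorter()
    sorter.feed(html)
    sorter.close()
    return sorter.navigational, sorter.embedded


# Hypothetical page: a thumbnail that is embedded, plus a link to the full image.
HTML = (
    '<p><img src="/images/thumb.jpg" alt="thumbnail"> '
    '<a href="/images/large.jpg">full size</a></p>'
)

links, images = sort_references(HTML)
print(links, images)
```

Under this split, only /images/large.jpg is something a user can click through to, which matches the reasoning above about which image URLs could plausibly collect PR.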
  #23 (permalink)  
Old 08-28-2009, 11:36 AM
inertia's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Apr 2006
Location: Lancaster, UK
Posts: 1,022
inertia RepRank 6
Default Re: Robots.txt can pass PageRank?

Quote:
I would not be surprised if an image does not collect PR unless there is a direct link to its URI. I can't see a reason why <img src= would pass PR, since it does not represent a location that the user can click through to.
Agreed: "src=" pulls the image into the page, whereas "href=" (pointing directly at the image file) would create an actionable request for a new URL, and therefore a URL that deserves PR.

I've been a big user of StumbleUpon for a few years and regularly "stumble" through funny images, but I've never (from what I can remember) seen an image URL with PR. But then again, I've never looked, so I may have missed it!

What I'm still trying to get my head around is the "dangling nodes" issue as John says. I'm also now thinking of the ranking issues that this throws up...

Quote:
What Google claims on that page is not fully correct. I know that they also index .rdf files. Mine were indexed until I added a noindex via the X-Robots-Tag header in my .htaccess and requested their deletion via GWT.
It looks like an old page now. Still, that's the first time I've read that page and I was quite surprised to find out that Google indexed Office file types. You learn something new every day!

Last edited by inertia; 08-28-2009 at 11:45 AM.
  #24 (permalink)  
Old 08-28-2009, 11:46 AM
WebProWorld Veteran
WebProWorld MVP
 
Join Date: Oct 2006
Posts: 905
innominds RepRank 5
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by Webnauts View Post
Let's keep the answer simple, man.

Yes or no?
Of course, Yes!
  #25 (permalink)  
Old 08-28-2009, 11:59 AM
inertia's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Apr 2006
Location: Lancaster, UK
Posts: 1,022
inertia RepRank 6
Default Re: Robots.txt can pass PageRank?

Quote:
Of course, Yes!
It's a great answer... but it's wrong.
  #26 (permalink)  
Old 08-28-2009, 12:18 PM
Orion's Avatar
WebProWorld Veteran
WebProWorld MVP
 
Join Date: Sep 2003
Location: Halton Hills, ON
Posts: 702
Orion RepRank 4
Default Re: Robots.txt can pass PageRank?

As for the robots.txt, I don't think it passes any PR on... it's just not logical, and if by chance it's a hole, it'll be plugged, I'm sure.

Loved your reference, Terry, to the citations:

Quote:
Originally Posted by Terry Van Horne View Post
I check for IBLs by searching for the domain; that way I get all the citations. One of the clues for me was finding links in the index that were text, when the link: operator provided better results. The search also finds all the pages where the site is being mentioned, not just linked. I always find some bad shite that way.
That's one of the reasons I started using my formal business name more (Orion's Web) and started referring to my business as orionsweb.net; even without the www. it still has some validity... It can't be used for every site or business, but it's helpful if you can, and great from a marketing standpoint too.
  #27 (permalink)  
Old 08-28-2009, 12:26 PM
crankydave's Avatar
Moderator
WebProWorld Moderator
 
Join Date: Aug 2004
Location: Playing with fire!
Posts: 4,254
crankydave RepRank 9
Default Re: Robots.txt can pass PageRank?

This has been an interesting discussion.

We do know that Google will use a URL in text form for discovery. I guess it depends on whether Google defines a "link" as a URL or as a URL that can be "clicked". A text URL can be followed by a surfer. Not as easily as a "clickable" one, but it can definitely be followed.

Dave
  #28 (permalink)  
Old 08-28-2009, 12:32 PM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,169
Webnauts RepRank 9
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by inertia View Post
What I'm still trying to get my head around is the "dangling nodes" issue as John says. I'm also now thinking of the ranking issues that this throws up...
Matt, have a look at this: SEO Linking Gotchas Even The Pros Make

Then I am sure you will understand what the dangling pages are about and why I use noindex in my robots.txt at the site level and noindex meta tags at the page level.
  #29 (permalink)  
Old 08-28-2009, 12:43 PM
inertia's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Apr 2006
Location: Lancaster, UK
Posts: 1,022
inertia RepRank 6
Default Re: Robots.txt can pass PageRank?

Quote:
Then I am sure you will understand what the dangling pages are about and why I use noindex in my robots.txt at the site level and noindex meta tags at the page level.
I know what a dangling page is and the theory behind it, cheers John. What I'm not sure about is how that relates to the current situation...

When you look at my theory above, which suggests that PR is assigned to a URL (and not a file), and you look at the FACT that a robots.txt file has been assigned PR via inbound links, then it makes sense, until you bring the dangling nodes issue into play! At that point my theory goes into a redirect loop, crashes and explodes! Because there SHOULD be no way for a robots.txt file to accrue PageRank, because it is a dangling page!

Now, that makes me think that maybe text URLs do pass PageRank and the "sitemap:www.domain.com" link in the robots.txt file is stopping it from becoming a dangling page. But then I check out the White House robots.txt, which still has loads of PR but doesn't have the sitemap link?

Last edited by inertia; 08-28-2009 at 12:49 PM.
  #30 (permalink)  
Old 08-28-2009, 01:03 PM
wige's Avatar
Moderator
WebProWorld Moderator
 
Join Date: Jun 2006
Location: United States
Posts: 2,661
wige RepRank 9
Default Re: Robots.txt can pass PageRank?

Why would dangling pages not accrue PageRank? PageRank is literally the odds that a surfer, randomly clicking links, will arrive on the page in question. If a page has links to it, it will accrue PageRank, regardless of whether or not it links to other pages. This is part of why I don't believe text links send PageRank (other parts include the likelihood of text URLs being spam, the possibility they could be misread and affect the rest of the outgoing PageRank, etc.).
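
The random-surfer argument can be checked on a toy graph. The sketch below is the textbook PageRank power iteration, not Google's actual implementation. It assumes a damping factor of 0.85 and one common convention for dangling pages (their rank is spread evenly over all pages); the four-page graph is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Pages with no outgoing links are dangling; their rank is
    redistributed evenly (an assumed convention, one of several).
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank currently sitting on pages that link nowhere.
        dangling_mass = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling_mass / n
        new_rank = {page: base for page in pages}

        # Each linking page splits its rank evenly among its targets.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank

    return rank


# Hypothetical site: robots.txt receives links but links to nothing.
LINKS = {
    "home": ["about", "robots.txt"],
    "about": ["home"],
    "blog": ["robots.txt", "home"],
    "robots.txt": [],
}

ranks = pagerank(LINKS)
for page, value in sorted(ranks.items()):
    print(f"{page}: {value:.4f}")
```

In this model the dangling robots.txt still ends up with a nonzero score because other pages link to it, which is the point made above: accruing rank and passing rank are separate things.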
  #31 (permalink)  
Old 08-28-2009, 03:39 PM
morestar's Avatar
WebProWorld Veteran
WebProWorld MVP
 
Join Date: Jun 2007
Location: Burlington, Ontario (Toronto)
Posts: 994
morestar RepRank 5
Default Re: Robots.txt can pass PageRank?

In the end wige is right: files that are linked to can accrue PageRank, but only files that link out can pass PageRank.

"Link out" being the operative words, as opposed to displaying paths to pages.
__________________
Join free dating sites and meet single people without paying a penny.

Last edited by morestar; 08-28-2009 at 03:40 PM. Reason: addendum
  #32 (permalink)  
Old 08-28-2009, 03:45 PM
SemAdvance's Avatar
WebProWorld Veteran
 
Join Date: Dec 2005
Location: In Your Mind
Posts: 790
SemAdvance RepRank 3
Default Re: Robots.txt can pass PageRank?

I think any page can have PR if we look at the root of what a webpage is, which is a web log.

So a robots.txt would also be a web log in essence.

Since there is a link on the page (robots crawl pages) and there are links (which robots happen to follow) pointing to the file / page (web log), it would pass PageRank value based on the mathematical aspects alone.

Citation-wise the robots.txt is meritless, but an algorithm works in a set way and does not have the ability to speculate.

Good post!
  #33 (permalink)  
Old 08-28-2009, 03:56 PM
SemAdvance's Avatar
WebProWorld Veteran
 
Join Date: Dec 2005
Location: In Your Mind
Posts: 790
SemAdvance RepRank 3
Default Re: Robots.txt can pass PageRank?

"The popularity of PDF results led us to expand the list of file types searched to include documents produced in a dozen formats such as Microsoft Word, Excel and PowerPoint."
  #34 (permalink)  
Old 08-28-2009, 05:28 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,709
kgun RepRank 10
Default Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by Webnauts View Post
Then I am sure you will understand what the dangling pages are about and why I use noindex in my robots.txt at the site level and noindex meta tags at the page level.
  1. To me that is best explained in the article linked to in my first post. Robots.txt can pass PageRank?
  2. In DOM context there is a difference between
    - Text nodes
    - Leaf nodes
    - Ordinary element nodes.
  3. If I find a link in the SERPs pointing to the robots.txt file, I may remove that part of the link and look at the home page, which can be of interest.

Last edited by kgun; 08-28-2009 at 05:31 PM.
  #35 (permalink)
08-28-2009, 06:23 PM
Doc
WebProWorld Veteran
WebProWorld MVP

Join Date: Jun 2009
Location: Baja California
Posts: 695
Doc RepRank 9
Re: Robots.txt can pass PageRank?

This is one of the most interesting threads here in a good while! Although a lot of it is over my head, it's still been educational.

I can't help but get the impression that we sometimes tend to give the algorithms too much credit. They are impressive, but not omnipotent. If the algos are reading a text url as though it were a link, then I can't imagine it would be anything but deliberate on the codewriters' part.

And I can't see any logical reason for them doing so... it would open a real can of worms. It seems to me to be more likely that a sitemap, for instance, would accrue PR based upon its incoming traffic. It seems reasonable to me to think that it was deemed minor enough to not necessitate any special coding to avoid it. Or, a "hole", if you will.

And besides, there are already plenty of worms to go around.
__________________
If I ever stop learning, let the wolves have my carcass.
http://doccampbell.wordpress.com/
http://cleanstreamwaterconditioning.com
http://carforums-online.com
  #36 (permalink)
08-29-2009, 02:35 PM
BoBoMisiu
WebProWorld New Member

Join Date: Jul 2009
Location: Cleveland, Ohio
Posts: 13
BoBoMisiu RepRank 2
Re: Robots.txt can pass PageRank?

I think Google maps everything on a site, if it's allowed.
I believe the sitemap pointer in the robots.txt can point to any file name because the sitemap protocol's xml tags define the sitemap or the sitemap index in those files.
But, I would never stray from common practice and rename them.
The robots.txt is an entry point, but only if a crawler follows the rules; real people do not go to the robots.txt first.
The sitemap indexes the content of the site. The sitemap, with or without styling, may be set through .htaccess as the default page, so people may go to it first.
Pages identified in the sitemap may still be missing, and other pages may not yet have been added to the sitemap.
In my opinion, if the sitemap indexes the robots.txt, then they point to each other and are reciprocal.
I think the sitemap.xml passes pagerank to the robots.txt on such sites.
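
The point about the XML tags defining whether a file is a sitemap or a sitemap index can be sketched as below. This is only an illustration: the function name and the sample documents are made up, and the namespace is the one published by the sitemap protocol at sitemaps.org. The file name plays no part in the check.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def classify_sitemap(xml_text):
    """Return (kind, locs) where kind is 'urlset' or 'sitemapindex'."""
    root = ET.fromstring(xml_text)
    kind = root.tag.split("}")[-1]  # strip the "{namespace}" prefix
    if kind not in ("urlset", "sitemapindex"):
        raise ValueError(f"not a sitemap document: <{kind}>")
    locs = [el.text.strip() for el in root.iter(f"{{{SITEMAP_NS}}}loc") if el.text]
    return kind, locs


# Hypothetical documents; the URLs are placeholders.
plain_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.mywebsite.com/</loc></url>
  <url><loc>http://www.mywebsite.com/folder3/page.html</loc></url>
</urlset>"""

sitemap_index = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.mywebsite.com/any-name-at-all.xml</loc></sitemap>
</sitemapindex>"""

print(classify_sitemap(plain_sitemap))
print(classify_sitemap(sitemap_index))
```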
  #37 (permalink)
08-29-2009, 03:18 PM
Terry Van Horne
WebProWorld Veteran

Join Date: Apr 2008
Location: Toronto On., Ca.
Posts: 471
Terry Van Horne RepRank 4
Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by inertia
What's confusing me is this... Google has assigned PR to a .txt file, just like it assigns PR to a .pdf file. But isn't the reality that Google assigns PR to a url rather than a file?
But http://www.whitehouse.gov/robots.txt is a URL, is it not?
Quote:
Originally Posted by inertia
Google indexes all sorts of files, but are they selective in which files they assign PR to... So, for example, if I was to fire a load of links to domain.com/image.jpg, that url wouldn't show a TBPR when called??
Obviously for images the MIME type is different... remember .txt is the extension, but text is also a MIME type. Guess what MIME type HTML is? Text... I believe... which is likely why an image doesn't gather PR... though we really haven't proven it can't, so...
Quote:
Originally Posted by inertia
Or it could just be a case that we're only used to seeing URLs with inbound links as showing PR but in actual fact PR can be assigned to any url with inbound links, no matter what the file type is at the end?
Not sure about images because if you click on a link to an image a page appears... once again wige might know better how that link to an image gets put into a page. I don't have any experience with that... a little too technical for me.

My point is that MIME types are what an SE crawler is programmed to handle. The MIME type is important for that reason.
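
A quick way to see the extension-to-MIME-type mapping being described is Python's standard mimetypes module. This is only a sketch: a real crawler would go by the Content-Type header the server actually sends, not by the extension, and the file names and the helper function here are made up.

```python
import mimetypes


def top_level_type(filename):
    """Guess the MIME type from the extension and return its top-level part."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime.split("/")[0] if mime else None


# Hypothetical file names; only the extension matters to guess_type().
for name in ("robots.txt", "index.html", "image.jpg", "report.pdf"):
    mime, _ = mimetypes.guess_type(name)
    print(f"{name:12s} -> {mime}")
```

Both robots.txt and an HTML page fall under the top-level "text" type, while a JPEG falls under "image", which is the distinction made above.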
__________________
Follow me on Twitter! On the Trail with SOSG How I became a Social Media Convert and Twitter and Agents of Influence and now regular poster at Cloudmixer where We're Mixing New Media Ideas.

Last edited by Terry Van Horne; 08-29-2009 at 03:21 PM.
  #38 (permalink)
08-29-2009, 03:26 PM
Doc
WebProWorld Veteran
WebProWorld MVP

Join Date: Jun 2009
Location: Baja California
Posts: 695
Doc RepRank 9
Re: Robots.txt can pass PageRank?

It would seem logical that the SE's would eventually want to be able to rank images individually, but that's a pretty ambitious project. I can see where having images ranked would be advantageous to users on the prowl, but I wonder if there'd be any ROI for it, for the SE's. Probably so.
  #39 (permalink)
08-29-2009, 11:18 PM
Clint1
WebProWorld 1,000+ Club
WebProWorld MVP

Join Date: Jun 2005
Location: Louisiana, USA
Posts: 1,323
Clint1 RepRank 9
Re: Robots.txt can pass PageRank?

Why not just create an HTML sitemap and use that as your signature link on blogs, forums, etc.? That way the arguments as to whether or not a .txt or .xml file can pass PR are moot, because we all know that .html files will.
__________________
Happy Thanksgiving to all & God Bless,
-Clint
(Join Date: 2003)
  #40 (permalink)
08-29-2009, 11:45 PM
Terry Van Horne
WebProWorld Veteran

Join Date: Apr 2008
Location: Toronto On., Ca.
Posts: 471
Terry Van Horne RepRank 4
Re: Robots.txt can pass PageRank?

Now that I think about this further and have re-read the thread a third time, I think CD was right on the money that text links are used for discovery. IMO, they likely don't pass any PR. Google uses everything from email applications to toolbars, whatever, to discover new pages. Citations with value passing to other pages... not so sure, partly because text links on a page are often used precisely so that no value/benefit is passed to the page being discussed... kinda why I messed up the Whitehouse URL. The best test would be to put a link in that file and see if it passed value... anyone want to tweet Mr. Obama and ask him if he'd like to participate in a little SEO testing?
  #41 (permalink)
08-30-2009, 03:10 PM
TheWebDoctor(tm)
WebProWorld Pro

Join Date: Jun 2003
Location: USA
Posts: 205
TheWebDoctor(tm) RepRank 1
Re: Robots.txt can pass PageRank?

PageRank is a determination of link popularity. PageRank cannot be passed via citation without the citation being an active HREF without a nofollow indication.

Any file delivered by a website can and does have PageRank. This includes all office type documents, Flash, PDF, images and other sorts of files.

Images are typically linked as SRC and can be linked to with HREF. When the image is linked to using HREF the image URL will acquire, at some point, PageRank.

File types called by OBJECT can be linked to with HREF. When the file is linked to using HREF the file URL will acquire, at some point, PageRank.

Robots.txt files can acquire PageRank from links to them from other files. Do a search on Yahoo for the Whitehouse's robots.txt file and you find over 1700 links to that file (Site Explorer - Search Results). This indicates that the file can be linked to by other pages and hence PageRank is passed to the URL.

To answer the initial question: can robots.txt files pass PageRank? The answer is, unfortunately, NO! Robots.txt files are a set of computer instructions designed to instruct the crawlers to avoid or pay attention to specific files, file types, directories/folders and possibly the entire site. A robots.txt file cannot pass PageRank because it is not a document that can contain active links. PageRank can only be passed by documents that can and do contain active links: HTML documents, Flash, PDF, Word type documents, Excel type documents, and others (this is only a small sampling and not meant to be a complete list).
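
As an illustration of robots.txt being a set of instructions rather than a page of links, here is a small sketch using Python's standard urllib.robotparser on the example file from the first post of this thread. The index.html URL is a made-up placeholder, and site_maps() needs Python 3.8 or later. Note how the parser hands back the Sitemap line as a separate directive; how a search engine then treats that URL is a different question from the one this parser answers.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from the first post in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /file1.html
Disallow: /file2.html
Disallow: /folder1/
Disallow: /folder2/

Sitemap: http://www.mywebsite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Disallow rules answer "may this user agent fetch this URL?"
blocked = parser.can_fetch("*", "http://www.mywebsite.com/file1.html")
allowed = parser.can_fetch("*", "http://www.mywebsite.com/index.html")

# Sitemap lines are exposed as a plain list of URLs (Python 3.8+).
sitemaps = parser.site_maps()

print(blocked, allowed, sitemaps)
```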
  #42 (permalink)
08-30-2009, 05:47 PM
Terry Van Horne
WebProWorld Veteran

Join Date: Apr 2008
Location: Toronto On., Ca.
Posts: 471
Terry Van Horne RepRank 4
Re: Robots.txt can pass PageRank?

Webdoctor, you make a lot of statements here with no reasons for why they are true. I can say the moon is blue; does that make it a fact? If I show you a picture of the moon and it is blue, then it is somewhat fact. We then need to figure out why the moon was blue. I also disagree that it can't contain active links. I agree they may not work or be indexed, but there is no known reason they can't be there. To assign PR it is doing some sort of indexing, doesn't it?


As far as what a robots.txt does, I don't know of many participating in the thread who don't know what it is for, so if you think the debate is about what we know it does... it's not. If you can show the people in the thread something other than your statement stated as fact with nothing to back it up... end of subject... until you can do that, all you've done is tell us what we already knew. This discussion is about the possibilities. This is easily tested; I'm sure someone is doing it already. Thanks for the information though!

Last edited by Terry Van Horne; 08-30-2009 at 05:58 PM.
  #43 (permalink)
08-30-2009, 07:05 PM
crankydave
Moderator
WebProWorld Moderator

Join Date: Aug 2004
Location: Playing with fire!
Posts: 4,254
crankydave RepRank 9
Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by Terry Van Horne
This discussion is about the possibilities. This is easily tested I'm sure someone is doing it already. Thanks for the information though!
Not overly difficult to orphan some pages, place some text only url's here and there, watch WMT and see what you can see, watch other SE's and see what you can see, etc., etc., etc., ...

Dave
  #44 (permalink)
08-30-2009, 08:10 PM
TheWebDoctor(tm)
WebProWorld Pro

Join Date: Jun 2003
Location: USA
Posts: 205
TheWebDoctor(tm) RepRank 1
Re: Robots.txt can pass PageRank?

Terry Van Horne, nice to see you read the post.

If you have questions be specific about what you don't understand. I'll be happy to explain it to you and even point you to documentation on the subjects.

Google's PageRank patents clearly describe the need for an active link.

United States Patent: 6285999
and
United States Patent: 7058628


Since I don't know what you're confused about I can't explain or direct you to anything else. I'd be happy to explain further if you like. Just be clear in what you ask.

The question was "Does the robots.txt file pass PR?"

Google's PageRank patents clearly state that PageRank is a mathematical representation of reputation and importance passed from one page to another through hyperlinks, so the robots.txt cannot pass PageRank values.

As to your question, "To assign PR it is doing some sort of indexing doesn't it?"

I'm going to assume that we've all seen in log files where search engines have requested a page and that page did not show up in the search results. What did the search engine do with the page it requested? It put it in their database, but the page has not been parsed and moved to the publicly searchable database. When the search engine moves the page to the publicly searchable database is determined by their rules of business.

Pages in Google's publicly searchable database do not always receive PageRank values. I have to assume that you have seen pages come up in the search results and still have a "grey bar" indicating no PR value has been determined for the page. This happens with other files as well.

If you do a search on Google for The Whitehouse robots.txt file, you will find they have two results. site:www.whitehouse.gov/robots.txt - Google Search

Doing this search results in Google indicating the robots.txt file has not been indexed. The file does exist and is visited regularly by Google.

Search engines only index files that have active hyperlinks pointing to them. Those active links must be of the HREF type not the REL type found in the header of a web page.

Just because it's indexed doesn't mean it's passing PR.

Google's robots.txt file is indexed and has a PR 4 value. http://www.google.com/robots.txt Searching Yahoo for links to Google's robots.txt file indicates there are over 2200 pages linking to Google's robots.txt file. Site Explorer - Search Results

Google's robots.txt file references http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml. However, searching Yahoo for http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml (which appears in Google's robots.txt file) does not return Google's robots.txt file in the list of 93 files returned from the search. Since Google doesn't return a complete list of links, we can only assume that Yahoo does the same thing.

Did Yahoo index Google's robots.txt file? Most certainly. Doing a search for Google's robots.txt file on Yahoo does return the listing of Google's robots.txt file. Doing a search on Google also returns Google's robots.txt file. This is because of the active hyperlinks citing the file not because of the REL link in the header.

Still, to answer the question, "does pagerank get passed from a robots.txt file?" The answer is no. A hypertext link must exist first for pagerank to be passed. URL citations are not, as we all know, hypertext links unless an HREF exists for the citation to be a hyperlink.
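
The distinction between a hypertext link (HREF) and a bare URL citation can be sketched with Python's standard html.parser. This is only an illustration of the difference; the class name, the helper function, the sample HTML and the deliberately crude URL pattern are all made up, and it says nothing about what Google's crawler actually does with either kind.

```python
import re
from html.parser import HTMLParser


class LinkAndTextCollector(HTMLParser):
    """Collect href targets of <a> tags and, separately, all text content."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


# Crude pattern for URLs written out as plain text (an assumption, not a standard).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def split_links(html):
    """Return (hyperlinked URLs, URLs that only appear as plain text)."""
    collector = LinkAndTextCollector()
    collector.feed(html)
    collector.close()
    mentioned = URL_PATTERN.findall("".join(collector.text_parts))
    plain_citations = [u for u in mentioned if u not in collector.hrefs]
    return collector.hrefs, plain_citations


sample = (
    '<p>See <a href="http://www.whitehouse.gov/robots.txt">this file</a> '
    "and also http://www.google.com/robots.txt for comparison.</p>"
)
hyperlinks, plain_citations = split_links(sample)
print("hyperlinks:     ", hyperlinks)
print("plain citations:", plain_citations)
```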

If someone is testing the theory, I look forward to reading proof that a robots.txt file can get the search engines to index a non-hyperlinked page. I like learning new things.

I hope this helps.
  #45 (permalink)
08-31-2009, 02:00 AM
Clint1
WebProWorld 1,000+ Club
WebProWorld MVP

Join Date: Jun 2005
Location: Louisiana, USA
Posts: 1,323
Clint1 RepRank 9
Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by TheWebDoctor(tm)
PageRank is a determination of link popularity.
That's the brainwashing G would have you believe, along with their own claim of "the usefulness and importance of a website", but that's not exactly the case. Name.com, domainname.com, yourdomain.com, mydomain.com, example.com, etc., etc., etc.: there's nothing "popular" about those domains. All of those types of domain names have outrageously, ridiculously inflated PR just because people use them as URL examples in millions of forum and message board posts across the world. This is another reason why PR is so flawed and inaccurate.

BTW, Whitehouse.gov's robots.txt file is a PR5. Only ~1500 IBL's and a PR5. Must be some powerful 1500 websites. Probably like "example.com". LOL.

Aside from the annoying pop-up hang-around, the chart at the bottom may be interesting.
Robots.txt Tutorial
  #46 (permalink)
08-31-2009, 02:39 AM
Doc
WebProWorld Veteran
WebProWorld MVP

Join Date: Jun 2009
Location: Baja California
Posts: 695
Doc RepRank 9
Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by Clint1
BTW, Whitehouse.gov's robots.txt file is a PR5. Only ~1500 IBL's and a PR5.
That is interesting! Good catch, Clint. Kind of throws the whole thing back up in the air again, doesn't it?
  #47 (permalink)
08-31-2009, 03:22 AM
Clint1
WebProWorld 1,000+ Club
WebProWorld MVP

Join Date: Jun 2005
Location: Louisiana, USA
Posts: 1,323
Clint1 RepRank 9
Re: Robots.txt can pass PageRank?

And FWIW, Lee mentioned G's robots.txt file was a PR4.
  #48 (permalink)
08-31-2009, 10:27 AM
TheWebDoctor(tm)
WebProWorld Pro

Join Date: Jun 2003
Location: USA
Posts: 205
TheWebDoctor(tm) RepRank 1
Re: Robots.txt can pass PageRank?

The use of hyperlinks to determine the importance of a web page has existed prior to Google's patent. Google cites, in their patent or should I say Stanford's patents, that hyperlinks are the mechanism they use to determine if a citation is to be used for determining the relevance to a search query based upon the anchor text used.

What Google did and still does is determine their opinion of a page and then represent their opinion in a numerical value based upon a mathematical equation they created. Again, since a hyperlink is required and a robots.txt file cannot contain a hyperlink it cannot pass PR.

Google has never brainwashed me and never will.

Doing a search on Yahoo presents 1700+ results for links to the Whitehouse robots.txt file.

Those results include the likes of CNET, BoingBoing, Huffington Post, Plagiarism Today, Internet News, The AGE, Digg, BBC, Computer World, Network World, Search Engine Land, and many other top sites.

Whether domain.com, example.com or any other website, in one's opinion, has an over inflated PR value is not the question here. Domain.com is a domain registry website with only a PR 7. MyDomain.com is another domain registry site with only a PR 6.

Doing a search for domain.com on Google returns 86+ million results. CNet with only 27 million references has a PR 9. If the numerous citations without an active hyperlink did pass PR one might assume that those sites should be at least a PR10. Obviously just a citation without an active link doesn't pass PR.
  #49 (permalink)
08-31-2009, 10:50 AM
crankydave
Moderator
WebProWorld Moderator

Join Date: Aug 2004
Location: Playing with fire!
Posts: 4,254
crankydave RepRank 9
Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by TheWebDoctor(tm)
Google's PageRank patents clearly describe the need for an active link.

United States Patent: 6285999
and
United States Patent: 7058628
I guess you and I are reading these patents differently.

In neither of the documents (referenced docs excluded) do these patents define "link", "linked", or "linking", as "active", "clickable", or even hyperlinked.

They do make references to "citations", "randomly jump to..", "...pointed to by a link...", etc., etc., etc.

I can see where the assumption might be made that they are referring to "clickable" links. But, a URL need not be hyperlinked to be considered a citation nor does it need be clickable to be followed.

Now it very well may be the case that a URL need be "hyperlinked" in order to "count", but I've not tested it, nor do I see that to be definitively evidenced by the patents cited.

Unless I'm missing something.

Dave

Last edited by crankydave; 08-31-2009 at 10:53 AM.
  #50 (permalink)
08-31-2009, 10:52 AM
Terry Van Horne
WebProWorld Veteran

Join Date: Apr 2008
Location: Toronto On., Ca.
Posts: 471
Terry Van Horne RepRank 4
Re: Robots.txt can pass PageRank?

Quote:
Originally Posted by TheWebDoctor(tm)
The use of hyperlinks to determine the importance of a web page has existed prior to Google's patent. Google cites, in their patent or should I say Stanford's patents, that hyperlinks are the mechanism they use to determine if a citation is to be used for determining the relevance to a search query based upon the anchor text used.
No one has said they do; in fact, I was careful to make sure that my posts were clear that Google finds them and, as CD has added, likely uses them for discovery like they do in Gmail and other applications.

Quote:
Originally Posted by TheWebDoctor(tm)
What Google did and still does is determine their opinion of a page and then represent their opinion in a numerical value based upon a mathematical equation they created. Again, since a hyperlink is required and a robots.txt file cannot contain a hyperlink it cannot pass PR.
So you think Google hasn't changed that in 10 years... interesting, the most innovative technology on the internet has stood still for 10 years on the most important part of the algo... that's really good to know... how long is it you've worked at Google? Because that's in the territory of "Google engineer".
Quote:
Originally Posted by TheWebDoctor(tm)
Doing a search for domain.com on Google returns 86+ million results. CNet with only 27 million references has a PR 9. If the numerous citations without an active hyperlink did pass PR one might assume that those sites should be at least a PR10. Obviously just a citation without an active link doesn't pass PR.
Thanks, that indicates text citations are less effective than hyperlinks... again, only a Google engineer knows that for sure... I wish to keep an open mind and be receptive to all possibilities... your example is good and I thank you for sharing.

However, robots.txt is not a protocol; it's a BS SE document first semi-supported by WebCrawler... that would be the SE that initially tracked links but didn't use them extensively in the algo. All SE's and crawlers can choose how to interpret robots.txt, so if someone wants to hyperlink from it they can; whether it would work is not determined. A 302 shouldn't allow people to jack other sites' results, but it still happens, so... I take nothing for granted, especially "non-protocol" protocols with no governing body and which SE's interpret as they please... when it's a real protocol we'll talk about what someone can put in it or do with it.