WebProWorld
Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

#1
01-02-2008, 07:37 AM
WebProWorld Member
Join Date: Mar 2007
Posts: 61
Ozzman RepRank 0
How to stop Google from crawling secure content/directory?

The post title states my question, but let me repeat it: how can we stop Googlebot from crawling secure (not-to-be-shared) information in one of a website's directories?

Can we use robots.txt for this, or would you suggest a better approach?

If Google crawls secure information from a website and shows it in search results, what can we do to have that information removed from Google's search results?

Your cooperation is appreciated in advance.

#2
01-02-2008, 08:13 AM
WebProWorld Pro
Join Date: Jan 2007
Location: Scotland
Posts: 255
thindenim RepRank 2
Re: How to stop Google from crawling secure content/directory?

Add the following to your robots.txt file, where directory is the area you want to disallow. This asks compliant robots not to crawl that section.

User-agent: *
Disallow: /directory/*
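
If you want to check a rule before relying on it, here is a minimal sketch using Python's standard-library urllib.robotparser. Assumptions: /directory/ and the page names are placeholders, and the rule is written in the plain prefix form (Disallow: /directory/) because this parser does simple prefix matching and does not expand a trailing * the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; "/directory/" is a placeholder for the real path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a compliant robot may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/directory/private.html", "/public/index.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", path))
```

This only tells you what a well-behaved robot would do; it does not stop anyone from requesting the page.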

Alternatively, you can use the robots meta tag with noindex on each page:

<meta name="robots" content="noindex, nofollow">

which means that the page is not indexed and its links are not followed, or:

<meta name="robots" content="noindex, follow">

which means that the page is not indexed, but any links on it are followed.
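
To confirm which directives a page actually carries, something like the following could be used. It is a rough sketch with Python's standard html.parser; the class and function names are made up for illustration, and it only looks at meta tags named "robots".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the page source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    found = robots_directives(page)
    print("noindex" in found, "nofollow" in found)
```

A page with "noindex, follow" reports noindex but not nofollow, which matches the second variant above.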
__________________
Girlz Night - professional hair and beauty products
Web design glasgow - from Thin Denim

#3
01-02-2008, 08:57 AM
WebProWorld Pro
Join Date: Jul 2003
Location: Alicante, Spain
Posts: 162
fernimac RepRank 1
Re: How to stop Google from crawling secure content/directory?

The previous post was very clear, although I would add that if your pages are in a secure area, Google should not be able to get to those pages in the first place. Even if you include a noindex, nofollow directive, users could still get to those pages. You should implement secure access via a password instead, so that no search engines and no undesired users get to your private pages.


Last edited by mjtaylor; 01-02-2008 at 07:25 PM. Reason: removing links not in sig

#4
01-02-2008, 12:47 PM
WebProWorld Pro
Join Date: Jan 2007
Location: Scotland
Posts: 255
thindenim RepRank 2
Re: How to stop Google from crawling secure content/directory?

Good point, fernimac. You should always password-protect sensitive information.

#5
01-02-2008, 04:42 PM
WebProWorld Member
Join Date: Oct 2007
Posts: 51
Palindrome RepRank 0
Re: How to stop Google from crawling secure content/directory?

Hi

All good advice has gone before me.

If you take the robots.txt route, it is best to use the meta tag as well. If a link to a page somehow exists, or is created by accident, that page could still be crawled regardless of robots.txt.

#6
01-02-2008, 04:51 PM
WebProWorld MVP
Join Date: Dec 2003
Posts: 1,485
Peter (IMC) RepRank 4
Re: How to stop Google from crawling secure content/directory?

This is one of those philosophical questions. Googlebot is nothing more than a normal visitor who, when asked, will refer others to those pages.

If you allow visitors into that part of the site without the need to log in, how are you going to prevent them from referring their friends to that part of the site?

If you really want "secure (not-willing-to-share) information" to be available to only those that you choose, you need to password protect it.

robots.txt and other "just for the search engines" methods aren't the way to go, because they aren't meant to block "not allowed" visitors.
__________________
FREE SEO ! Really? YES! All you have to do is implement it!
Follow me on Twitter PeterIMC

#7
01-02-2008, 04:59 PM
WebProWorld Pro
Join Date: Dec 2007
Location: Brussels, Belgium
Posts: 163
Jean-Luc RepRank 2
Re: How to stop Google from crawling secure content/directory?

robots.txt and meta tags are not appropriate ways to protect confidential information. Confidential pages should be password protected.

robots.txt and meta tags are only meant to pass information to well-intended robots and search engines. Some ill-intended bots will use it to detect potential weaknesses in your web site.
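
To illustrate the point about ill-intended bots: robots.txt is a public file, so anyone can read which paths you asked robots to avoid. A minimal sketch in Python, assuming hypothetical directory names and a made-up helper called advertised_paths:

```python
# Hypothetical robots.txt content; the directory names are invented.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private-reports/   # meant to be kept quiet
Disallow: /admin/
Disallow:
"""


def advertised_paths(robots_txt: str) -> list:
    """Return the paths named in Disallow lines, in file order."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    # Anything listed here is exactly what a rogue bot would try first.
    print(advertised_paths(SAMPLE_ROBOTS_TXT))
```

Every path the function returns is one the site owner has effectively announced, which is why a Disallow line is no substitute for a password.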

If your private pages are already in Google, visit "How can I prevent my own content from being indexed or remove content from Google's index?" (see the part about expedited removal).

Jean-Luc
__________________
Checking redirects made easy | | Professional AWStats Services

#8
01-02-2008, 05:18 PM
WebProWorld Member
Join Date: Dec 2006
Posts: 36
adverlicious RepRank 0
Re: How to stop Google from crawling secure content/directory?

Google provides definitive instructions for: (1) removing sensitive pages from Google's search results, and (2) preventing pages from being indexed by them in the first place:

Preventing content from appearing in Google search results

Note that, with the exception of Google, Yahoo!, MSN, and Ask, you should expect both your meta tags and requests for page removals to essentially be ignored. Worse, many of these "rogues" are based overseas where you'll have little or no legal recourse in the event of a problem.

Relatedly, don't assume that your private pages are "safe" from being Google-indexed because you have no links pointing to them or you haven't submitted them. Search engines have many, many ways of finding new pages -- e.g. a visitor's Google toolbar auto-submitting them, a competitor submitting them for you, links from sites you don't control, etc.

Confidential info should always, always be password protected -- or never placed online at all.
__________________
adverlicio.us | online advertising archive

#9
01-02-2008, 10:49 PM
WebProWorld Veteran
Join Date: Aug 2003
Location: Singapore
Posts: 701
edhan RepRank 3
Re: How to stop Google from crawling secure content/directory?

Yes. The best solution is password protection for that directory. Using nofollow or noindex may stop the crawlers, but humans can still access the pages.

#10
01-03-2008, 09:08 AM
WebProWorld 1,000+ Club
WebProWorld MVP
Join Date: Jan 2004
Location: Live in Cincy Now
Posts: 7,573
incrediblehelp RepRank 4
Re: How to stop Google from crawling secure content/directory?

Sure, using password protection is best, but with some password-protection scripts it may still be possible to link directly to certain pages. You can also disallow viewing of directories through the .htaccess file on Apache web servers.

#11
01-03-2008, 09:55 AM
WebProWorld New Member
Join Date: Jan 2008
Posts: 1
johnehogan RepRank 0
Re: How to stop Google from crawling secure content/directory?

As a systems admin, I agree that using a simple .htaccess file with a corresponding .htpasswd file is the ONLY way to protect a sensitive directory (or folder) against prying eyes. This is easy to do on Linux boxes. The folder CAN still be accessed by those who KNOW the password for it, but search engines and others will be locked out totally.
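
For anyone who has not set this up before, here is a rough sketch of what the two files contain, written as a small Python script so the password line can be generated without extra tools. Everything specific is an assumption for illustration: the user "alice", the password, the realm text, and the path /home/example/.htpasswd. The "{SHA}" scheme shown is one of the formats Apache accepts, but it is unsalted and weak; on a real server the htpasswd utility that ships with Apache is the better way to create the password file. The server also has to allow authentication directives in .htaccess (AllowOverride AuthConfig) for this to take effect.

```python
import base64
import hashlib


def htpasswd_sha1_line(username: str, password: str) -> str:
    """Build one password-file line in Apache's "{SHA}" format."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{username}:{{SHA}}{encoded}"


def htaccess_text(password_file: str, realm: str = "Private area") -> str:
    """Return Basic-auth directives for the .htaccess in the protected folder."""
    return (
        "AuthType Basic\n"
        f'AuthName "{realm}"\n'
        f"AuthUserFile {password_file}\n"
        "Require valid-user\n"
    )


if __name__ == "__main__":
    # Hypothetical user, password, and path. Keep the password file
    # outside the web root so it cannot be downloaded.
    print(htpasswd_sha1_line("alice", "change-me"))
    print(htaccess_text("/home/example/.htpasswd"))
```

The first printed line goes into the password file; the remaining lines go into the .htaccess file inside the directory you want to protect.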

#12
01-03-2008, 03:16 PM
WebProWorld Member
Join Date: Jul 2007
Posts: 36
kurt.santo RepRank 0
Re: How to stop Google from crawling secure content/directory?

johnehogan,
Great input!

I have worked with .htaccess, but not with .htpasswd. Would you have an entry in .htaccess naming the protected directory, and then store the usernames and passwords in .htpasswd? If so, how do you do this in detail?

Thank you,
Kurt

#13
01-03-2008, 07:29 PM
WebProWorld New Member
Join Date: Jan 2008
Location: Marseille, France
Posts: 4
Fibo RepRank 0
Re: How to stop Google from crawling secure content/directory?

Everything that you put on the web might be found by a human or robot visitor... So do NOT place any information that really needs to be secure on the web.
robots.txt is NOT a good solution: it just says "this is a secret area, PLEASE don't come". Well-behaved spiders will respect your secret; bandit spiders will pay attention to it too, but by exploring this advertised secret area first!
So a bare minimum would be the robots.txt exclusion AND an index.htm file in the directory that silently redirects to your homepage (no sound, no noise, nothing to alert the bandits), maybe with a 301 redirect.
Better is protection with .htaccess and its password file (usually named .htpasswd, but this name is not fixed).
Best is: don't put it on the web.

#14
01-03-2008, 09:32 PM
WebProWorld 1,000+ Club
WebProWorld MVP
Join Date: Aug 2003
Location: Worldwide
Posts: 8,131
Webnauts RepRank 8
Re: How to stop Google from crawling secure content/directory?

I think I have a solution for you: Preventing Search Engine Indexing of Secure Pages - SEO Workers
__________________
"Being an expert isn't telling other people what you know. It's understanding what questions to ask, and flexibly applying your knowledge to the specific situation at hand. Being an expert means providing sensible, highly contextual direction." Jeff Atwood
SEO Workers - Search Engine Optimization Consulting Company | SEO Analysis Tool | Webnauts Net SEO

#15
01-04-2008, 05:11 AM
WebProWorld New Member
Join Date: Aug 2006
Location: Scotland
Posts: 4
g3 creative RepRank 0
Re: How to stop Google from crawling secure content/directory?

To stop Google from crawling secure content you should always use password protection as standard.

Dave Mac

G3 Creative

#16
01-04-2008, 06:25 AM
WebProWorld 1,000+ Club
WebProWorld MVP
Join Date: Aug 2003
Location: Worldwide
Posts: 8,131
Webnauts RepRank 8
Re: How to stop Google from crawling secure content/directory?

Quote:
Originally Posted by g3 creative
To stop Google from crawling secure content you should always use password protection as standard.

Dave Mac

G3 Creative

And what if you do not want the pages to be password protected, but still don't want those pages to be crawled?
I provided a solution above (my tutorial), but it seems to have been ignored.

#17
01-06-2008, 01:10 AM
WebProWorld MVP
Join Date: Dec 2003
Posts: 1,485
Peter (IMC) RepRank 4
Re: How to stop Google from crawling secure content/directory?

Quote:
but it seems to have been ignored.

Oh, stop crying every time you don't get a standing ovation. Your posts are read; you just have to get used to the fact that most people don't fall on their knees to thank you.

You should watch the movie "The Secret". Even though I think it's a dumb-ass movie, they are right that if all you want to see is negative, all you will see is negative.

All times are GMT -4.