Thread: How to stop Google from crawling secure content/directory?

  1. #1
    Member
    Join Date
    Mar 2007
    Posts
    61

    How to stop Google from crawling secure content/directory?

    The post title states my question, but let me repeat it: how can we stop Googlebot from crawling secure (not-willing-to-share) information in one of a website's directories?

    Can we use robots.txt for this, or can you suggest a better approach?

    If Google does crawl secure information from a website and shows it in search results, what can we do to make that information disappear from Google's search results?

    Your co-operation is appreciated in advance.

  2. #2
    Senior Member thindenim's Avatar
    Join Date
    Jan 2007
    Posts
    269

    Re: How to stop Google from crawling secure content/directory?

    Add the following to your robots.txt file, where directory is the area you want to disallow. This asks compliant robots not to crawl that section.

    User-agent: *
    Disallow: /directory/*
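
A minimal sketch of how a well-behaved crawler would interpret such a rule, using Python's standard `urllib.robotparser`. The host `example.com` and the helper name `can_crawl` are assumptions for illustration. Python's parser does plain prefix matching and, as far as I know, does not treat `*` in a path as a wildcard, so the sketch writes the rule as `/directory/`, which covers the same URLs by prefix.

```python
from urllib.robotparser import RobotFileParser

# Same intent as the rule above, written as a plain prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory/
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if a compliant crawler would be allowed to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/directory/private.html",  # hypothetical URL
        "https://example.com/public.html",             # hypothetical URL
    ):
        print(url, can_crawl(ROBOTS_TXT, "Googlebot", url))
```

This only models crawlers that choose to obey robots.txt; nothing in the file stops a client that ignores it.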

    Alternatively, you can use the meta noindex tag on each page:

    <meta name="robots" content="noindex, nofollow">

    which means that the page is not indexed and its links are not followed, or:

    <meta name="robots" content="noindex, follow">

    which means that the page is not indexed, but any links are followed.
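
To audit which pages actually carry the tag, one option is a small checker along these lines. It is a sketch using Python's standard `html.parser`; the class and function names are made up for illustration, and it only looks at `<meta name="robots">`, not at any HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and case-insensitive.
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(sorted(robots_directives(page)))
```

A page is only protected from indexing this way if the crawler is allowed to fetch it and see the tag.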

  3. #3
    Senior Member fernimac's Avatar
    Join Date
    Jul 2003
    Posts
    160

    Re: How to stop Google from crawling secure content/directory?

    The previous post has been very clear. I would add, though, that if your pages are in a secure area, Google should not be able to get to those pages at all. Even if you include a noindex, nofollow directive, users could still get to those pages. You should implement secure access via a password instead, so that no search engine and no undesired users get to your private pages.
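
As a rough illustration of what password protection means at the HTTP level, here is a sketch of an HTTP Basic authentication check in Python. The credentials and the function names `is_authorized` and `status_for` are hypothetical; a real site would normally rely on the web server's or framework's built-in authentication, store hashed passwords, and serve the pages over HTTPS.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "editor"
EXPECTED_PASSWORD = "change-me"


def is_authorized(authorization_header):
    """Return True only if the header carries the expected Basic credentials."""
    if not authorization_header:
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    password_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and password_ok


def status_for(authorization_header):
    """HTTP status a protected page would answer with: 200 or 401."""
    return 200 if is_authorized(authorization_header) else 401
```

A crawler that sends no credentials falls into the 401 branch, the same as any anonymous visitor, so there is no page content for it to index.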


  4. #4
    Senior Member thindenim's Avatar
    Join Date
    Jan 2007
    Posts
    269

    Re: How to stop Google from crawling secure content/directory?

    Good point fernimac, you should always password protect sensitive information.

  5. #5
    Member
    Join Date
    Oct 2007
    Posts
    52

    Re: How to stop Google from crawling secure content/directory?

    Hi

    All good advice has gone before me.

    In case you are taking the robots.txt route, it is best to use the meta tag as well. If a link to a page somehow exists, or is created by accident, the page could still show up in the index regardless of robots.txt.

  6. #6
    WebProWorld MVP Peter (IMC)'s Avatar
    Join Date
    Dec 2003
    Posts
    1,483

    Re: How to stop Google from crawling secure content/directory?

    This is one of those philosophical questions. Googlebot is nothing more than a normal visitor who, when asked, will refer others to those pages.

    If you allow visitors to that part of the site without the need to log in, how are you going to prevent them from referring their friends to that part of the site?

    If you really want "secure (not-willing-to-share) information" to be available to only those that you choose, you need to password protect it.

    robots.txt or other "just for the search engines" kind of ways aren't the way to go because they aren't meant to block "not allowed" visitors.

  7. #7
    Senior Member
    Join Date
    Dec 2007
    Posts
    212

    Re: How to stop Google from crawling secure content/directory?

    robots.txt and meta tags are not appropriate ways to protect confidential information. Confidential pages should be password protected.

    robots.txt and meta tags are only meant to pass information to well-intended robots and search engines. Some ill-intended bots will use them to detect potential weaknesses in your web site.
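
The point about ill-intended bots can be made concrete: robots.txt is a public file, so anyone can read the paths it lists. The sketch below, with a made-up function name and sample file, shows how little effort it takes to turn the file into a list of "interesting" directories.

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path found in a robots.txt string."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /directory/*\nDisallow: /admin/ # private\n"
    print(disallowed_paths(sample))
```

Listing a confidential directory in robots.txt therefore advertises its location unless the directory is also password protected.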

    If your private pages are already in Google, see Google's help page "How can I prevent my own content from being indexed or remove content from Google's index?" (in particular the part about expedited removal).

    Jean-Luc

  8. #8
    Member
    Join Date
    Dec 2006
    Posts
    36

    Re: How to stop Google from crawling secure content/directory?

    Google provides definitive instructions for: (1) removing sensitive pages from Google's search results, and (2) preventing pages from being indexed by them in the first place:

    Preventing content from appearing in Google search results

    Note that, with the exception of Google, Yahoo!, MSN, and Ask, you should expect both your meta tags and requests for page removals to essentially be ignored. Worse, many of these "rogues" are based overseas where you'll have little or no legal recourse in the event of a problem.

    Relatedly, don't assume that your private pages are "safe" from being Google-indexed because you have no links pointing to them or you haven't submitted them. Search engines have many, many ways of finding new pages -- e.g. a visitor's Google toolbar auto-submitting them, a competitor submitting them for you, links from sites you don't control, etc.

    Confidential info should always, always be password protected -- or never placed online at all.

  9. #9
    WebProWorld MVP edhan's Avatar
    Join Date
    Aug 2003
    Posts
    941

    Re: How to stop Google from crawling secure content/directory?

    Yes. The best solution is password protection for that directory. Using nofollow or noindex may stop the crawlers, but humans can still access the pages.

  10. #10
    WebProWorld MVP incrediblehelp's Avatar
    Join Date
    Jan 2004
    Posts
    7,567

    Re: How to stop Google from crawling secure content/directory?

    Sure, using password protection is best, but with some password protection scripts it may still be possible to link directly to certain pages. You can also disallow directory listings through the .htaccess file on Apache web servers (for example with Options -Indexes).
