

Thread: Trying to avoid duplicate content, but Google being too smart!

  1. #1

    Trying to avoid duplicate content, but Google being too smart!

    Hello all,

    We have a website that lets people find driving directions in Australia: Find driving directions in Australia clickfind™.

    The look of that website is very similar (just about identical) to our main website, Australian Search Engine and Business Directory, find businesses in Australia clickfind™, and we also want members to be able to use both domains while signed in. Because of this, we point both domains at the same content directory on our web server, but we have changed some of the links to point back to the main site to avoid Google indexing the duplicate content.

    Example: /about-clickfind.cfm exists on both domains, because they are located in the same directory on the server, but we only want the main site's copy (About clickfind™ the Australian Business Directory and Search Engine) indexed in Google.

    If you look at the directions site, at the bottom you'll notice the "about us" link points to the about page on the main site, but somehow Google has gotten hold of the copy on the directions domain and is now indexing all the pages on it. I think it might have done this by itself, i.e. taken a page from our main site and simply checked whether it was also available on the directions domain, as I don't see how else it could have gotten to it.

    Is there any way to stop any pages on our second site from being indexed? Remember, we can't use a robots.txt file, as it would be shared by every other domain we run from the same content directory (which is quite a few).
    The only idea I've come up with while writing this is to make the robots.txt file dynamic and change its contents based upon the domain it's being requested from. But that means I would have to mess with the extension mappings in the programming language we use, which I'm not looking forward to.

    Anyone else got any other great ideas on this?
    clickfind the new Australian Business Directory... but, just a little different.
    www.clickfind.com.au

  2. #2

    Re: Trying to avoid duplicate content, but Google being too smart!

    I think I might have the answer: we're using a custom 404 error handler, so if we just check whether the requested page is robots.txt, then check the domain it's being requested from, and return the proper content based upon that... hmm, that might work...
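
    For the record, here's a rough sketch of that idea in CFML (since the site runs .cfm pages). The handler filename and the directions domain below are placeholders, and how the originally requested URL reaches a custom 404 template depends on the web server; under IIS it usually arrives in CGI.query_string as "404;http://host/path", so adjust to your setup:

        <!--- 404handler.cfm (hypothetical name): serve a per-domain robots.txt --->
        <cfset requestedUrl = CGI.query_string>

        <!--- Only intercept robots.txt requests made against the directions
              domain ("directions-domain.example" is a placeholder) --->
        <cfif FindNoCase("robots.txt", requestedUrl)
              AND FindNoCase("directions-domain.example", CGI.server_name)>
            <!--- Answer 200 with a disallow-all robots.txt for this domain --->
            <cfheader statusCode="200" statusText="OK">
            <cfcontent type="text/plain" reset="true"><cfoutput>User-agent: *#chr(10)#Disallow: /</cfoutput>
            <cfabort>
        </cfif>

        <!--- ...otherwise fall through to the normal 404 page... --->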
    clickfind the new Australian Business Directory... but, just a little different.
    www.clickfind.com.au

  3. #3
    Senior Member
    Join Date
    Jul 2004
    Posts
    884

    Re: Trying to avoid duplicate content, but Google being too smart!

    robots.txt works on a per-(sub)domain basis, so there shouldn't be a problem: every (sub)domain can serve its own robots.txt file.
    You can also set up two virtual hosts with different DocumentRoot directives and from there rewrite/redirect everything except robots.txt to the other site/directory.
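
    A minimal sketch of that second approach, assuming Apache (the domain and filesystem paths are invented for illustration); a per-host Alias for robots.txt gives the same "share everything except robots.txt" split as a rewrite:

        # Virtual host for the directions domain
        <VirtualHost *:80>
            ServerName directions-domain.example
            # All domains point at the same shared content tree...
            DocumentRoot /var/www/shared-content
            # ...but this host serves its own robots.txt
            Alias /robots.txt /var/www/directions-only/robots.txt
        </VirtualHost>

        # /var/www/directions-only/robots.txt would contain:
        #   User-agent: *
        #   Disallow: /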
    Impossible? You just underestimate the time.

  4. #4
    WebProWorld MVP Webnauts's Avatar
    Join Date
    Aug 2003
    Location
    European Community
    Posts
    8,934

    Re: Trying to avoid duplicate content, but Google being too smart!

    John S. Britsios, Forensic SEO & Social Semantic Web Consultant | My personal blog Algohunters

  5. #5
    WebProWorld MVP incrediblehelp's Avatar
    Join Date
    Jan 2004
    Posts
    7,463

    Re: Trying to avoid duplicate content, but Google being too smart!

    Quote Originally Posted by business-directory
    Is there anyway to avoid any pages on our second site from being indexed?
    Well, linking to the right URL would be a good start. If you don't want the duplicate About Us page indexed, stop linking to it and only link to the original.

