WebProWorld Part of WebProNews.com
Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

  #1 (permalink)  
Old 05-24-2006, 07:57 PM
WebProWorld Pro
 

Join Date: Oct 2004
Posts: 117
stretch dog RepRank 0
Help with potential duplicate content issue...

I've been watching a site that's been all over the map, so to speak (in terms of indexing), for the past few months while Google's been... shall we say, experiencing growing pains... lol.

Today another poster brought a potential duplicate content issue to my attention... http://www.webproworld.com/viewtopic...=302170#302170, and suggested I should split this topic off to a thread of its own, so here it is.

Background

I've posted details about my experience with www.performbettergolf.com elsewhere in this forum, so I will not bore you once again here. If interested, the details and rants, as well as my comments about the infamous Matt Cutts "Indexing Timeline" blog post, can be viewed in those threads, as well as on Google Groups and on the Indexing Timeline post itself... but of course my long, detailed and well-thought-out comment there was deleted for some strange reason, as was a follow-up enquiry regarding the missing post.

Like, what's with that... is it possible that maybe Matt doesn't like me and that's the reason the site is doing so badly... lol.

Potential Duplicate Content Issue

Quote:
Originally Posted by crankydave
Out of curiosity have you checked this...

http://www.google.com/search?q=allin...rgolf.com.&hl=en&lr=&filter=0

Not sure but I see a possible duplicate issue with the "about listing". Perhaps someone else can comment on this.
Thanks Dave, and yes, if anyone can comment on this I would appreciate the feedback. I have seen this page before, and if you substitute the www.performbettergolf.com part of the URL with any other domain you get a similar result... so I didn't think it was an issue. But when I checked today after Dave brought it to my attention, I quickly realized that when I substituted other URLs of my own, the pages generated could not actually be found in Google's index... while the performbettergolf.com version of the "about listing" is in fact indexed by Google and is a full and complete duplicate of the home page (and site) in question that has been dumped by Google.

Are these "about listing" pages supposed to be indexed, and is there a potential duplicate content issue, or did Google or someone else screw up... I am not clear why it would be an issue... it is simply a "listing" of our site, is it not?

If anyone can help, please feel free to comment. It is beyond me why the site suddenly (several weeks back) lost the home page, all main root pages and most article and product pages, all "gone" and unsearchable... while the blog directory main page continues to be cached regularly!

Follow-up...
Taking a closer look at this, it appears this page is simply presenting the www.performbettergolf.com home page within a "frame", and when I look at the source code there is no dupe content to be found.
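For anyone who wants to repeat that check without reading the source by hand, here is a minimal Python sketch that lists the frame sources in a page. The sample markup is hypothetical and only stands in for the "about listing" page (the real markup is not shown in this thread); the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class FrameFinder(HTMLParser):
    """Collect the src of every <frame> and <iframe> tag in a page."""

    def __init__(self):
        super().__init__()
        self.frame_sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.frame_sources.append(src)


def find_frame_sources(html_text):
    """Return the list of frame/iframe URLs found in html_text."""
    finder = FrameFinder()
    finder.feed(html_text)
    return finder.frame_sources


# Hypothetical markup: a frameset that wraps someone else's home page.
sample = """
<html><head><title>About listing</title></head>
<frameset rows="80,*">
  <frame src="/banner.html">
  <frame src="http://www.performbettergolf.com/">
</frameset></html>
"""

print(find_frame_sources(sample))
```

If the only real content on the page is a frame pointing at the home page, the page's own source will contain none of the home page text, which matches what is described above.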

So, my question to all you more experienced programmers, webmasters and search engine "experts" is this... "Am I safe to assume that this page coming up in the "allinurl:" command search does NOT then indicate a problem?"

Quote:
Originally Posted by crankydave
I went back and looked at the source code and I wonder if this is somehow being filtered as a doorway or mirror page? If so, might be causing your problems.
Please, can anyone substantiate this, or put my mind at rest that this "about listing" is not an issue for the site?

In fact, I am listening for any input as to problems with the site that may be a factor in why it's been so disrespected by the big G!

The site does have some issues, I know, such as some of the articles having been syndicated, and the "golf trainer" about page, which is about the owner and is used on two of his sites... but certainly nothing to warrant being pulled.

All in all, the site is well established and user friendly, the code is "busy" but close to validation, the site has lots of good inbound links (both contextual links and solid link exchanges with recognized authorities), it has plenty of unique pages of content, and it used to have a PR of 5 throughout the entire domain.

Ironically, the home page is the only page showing any PR for almost 2 months now (still a 5)... meanwhile the home page hasn't been searchable in Google's index for the same period of time.

Is it possible we have been sandboxed? Or are we simply "collateral damage"... ?

Any input on this matter, and the site, would be greatly appreciated... ty.

SD... ;-)~~~~~~~~~~~~
__________________
WebFoot Creative - Website Design, Marketing and SEO.
Debt Help USA | Bankruptcy USA - For Help with Debt and Bankruptcy.
  #2 (permalink)  
Old 05-25-2006, 09:56 AM
Moderator
WebProWorld Moderator
 

Join Date: Aug 2004
Location: Playing with fire!
Posts: 2,887
crankydave RepRank 4

SD

I took some unique strings of text from your home page and searched for them in "quotes".

One site kept coming up and it wasn't yours. Some sort of scraper using frames. Not just some of your content, all of it. Not sure where or how Google is attributing the text. It's not in the source that I can see.

On the "live" DC right now a search for your domain brings up nothing whereas yesterday it brought up all the pages that "contain" the term.

Dave
  #3 (permalink)  
Old 05-25-2006, 03:00 PM
WebProWorld Pro
 

Join Date: Oct 2004
Posts: 117
stretch dog RepRank 0

Yes, I have searched unique content in quotes many times, and the same scraper sites don't always come up.

I've just checked now, for example, and the first two blocks of text brought up single-site results... two different scrapers.

The third block of text brought up nothing at all.

The fourth block of text I searched was "Mike Pedersen - Power Golf Swing Trainer and Golf Fitness Expert"... which brought up four results... the first of which was in fact www.performbettergolf.com with a January cache.
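For anyone wanting to repeat these quoted searches consistently, here is a small Python sketch that builds the search URL for an exact phrase. The base URL and the `hl` and `filter=0` parameters are taken from the link Dave posted earlier in the thread; the function name is made up for illustration, and how Google treats these parameters is not something this sketch can confirm.

```python
from urllib.parse import urlencode


def exact_phrase_search_url(phrase, unfiltered=True):
    """Build a Google search URL that looks for `phrase` as an exact, quoted string."""
    params = {"q": '"' + phrase + '"', "hl": "en"}
    if unfiltered:
        # filter=0 appears in the URL quoted in the first post.
        params["filter"] = "0"
    return "http://www.google.com/search?" + urlencode(params)


print(exact_phrase_search_url(
    "Mike Pedersen - Power Golf Swing Trainer and Golf Fitness Expert"))
```

Running the same block of text through one function at least rules out typing differences as the reason one search finds the home page and another does not.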

First off, can anyone explain why Google didn't bring up the home page for the first three blocks of text searched, but did for the fourth... obviously the page is in their index, so it should have come up for all four searches, should it not?

Second, what the bloody heck is up with Google preferring to serve searchers these god-awful scraper sites over real sites? Scraper crap is offensive and for the most part an insult to the searcher. These sites should be so far down the list as to make them useless for organic search, which might help to make them less attractive to marketers.

Third, how are these abusive scraper sites serving "our content" to the search engines, 'cause it sure ain't on the page I see when I go there, nor can I find it when I search the source code?

Is it some sort of cloaking or programming that is doing it... and if so, how come Google doesn't whack them silly for breaking their webmaster guidelines?

I did a number of searches as well for unique content in quotes from some of the product pages that have been dropped from the index, and I get the occasional affiliate site that has taken bits and pieces of the "pitch" and posted them to their own pages. But what I see is only a line here and there, mixed in with whatever else they are selling on their page, so I don't see these instances as a potential for a "dupe content penalty" or such.
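To put a rough number on "a line here and there" versus a full copy, one option is to compare word shingles (overlapping runs of words) between the original page text and the suspect page. The Python sketch below is only an illustration of that idea, not a description of how Google detects duplicates; the function names and the default shingle size of 5 are assumptions.

```python
def shingles(text, size=5):
    """Return the set of overlapping word runs of length `size` in text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(original, other, size=5):
    """Fraction of the original's shingles that also appear in the other text."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(other, size)) / len(source)


# Hypothetical texts: an affiliate page borrowing one sentence of a pitch.
pitch = ("improve your golf swing with a simple fitness program "
         "designed for golfers of any age and ability")
affiliate = ("great deals this week improve your golf swing with a simple "
             "fitness program and many other products")

print(containment(pitch, pitch))      # a full copy
print(containment(pitch, affiliate))  # a partial lift
```

A score near 1.0 means nearly all of the original text appears on the other page, as with the framed scraper; a low score fits the affiliate pages that only lifted a sentence or two.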

Lastly, the "allinurl:" must be wonky like the "site:" search... cause when you use "allinurl:www.performbettergolf.com" you get 5 results, the first of which is the "about listing" in question. If you search "allinurl:performbettergolf.com" you get 7 results, the first of which is the home page of the site that can't be searched.

Anyone else care to comment?

SD
  #4 (permalink)  
Old 05-28-2006, 03:07 PM
Moderator
WebProWorld Moderator
 

Join Date: Jan 2004
Location: Live in Cincy Now
Posts: 7,597
incrediblehelp RepRank 4

Well, to answer most of your questions on why Google results suck sometimes, it is simple. Almost all results are algorithmically based, meaning Google doesn't know the difference between a crappy scraper website and a high-quality website like your own. It is being fooled into thinking the scraper is of higher quality than your website. This could be done in many different ways. Maybe this website is being cloaked, like you asked. Sorry, but contrary to popular belief, cloaking still works fine in Google. Maybe the scraper website has better or higher-quality backlinks than your own. Maybe the scraper website was published around the same time your website was published and Google can't distinguish which one is the original and which is a copy. Maybe the scraper website's domain is much older and is trusted more than your own.

Sure, looking at these cases one by one it is frustrating, not only for you and me but for all the world's regular searchers, but Google doesn't look at it this way. They simply figure the changes they make on a global basis are what works, and as we can see, it doesn't for many of us out there.

Now, as for the search operators you are using, these seem to be totally inaccurate and I would not put stock in the results they spit out.
  #5 (permalink)  
Old 05-28-2006, 03:39 PM
Moderator
WebProWorld Moderator
 

Join Date: Jan 2004
Location: Live in Cincy Now
Posts: 7,597
incrediblehelp RepRank 4

Also why not 301 redirect performbettergolf.com to http://www.performbettergolf.com/?

When I search Google, I see the domain indexed without the www.
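The rule behind such a redirect is small: any request arriving on the bare host gets a permanent (301) redirect to the same path on the www host. On Apache hosting this is usually configured in .htaccess rather than in application code, so the Python sketch below only illustrates the logic; the function name and the example path are hypothetical.

```python
CANONICAL_HOST = "www.performbettergolf.com"
BARE_HOST = "performbettergolf.com"


def canonical_redirect(host, path="/"):
    """Return (301, location) if the request should be redirected, else None."""
    # Normalise the Host header: drop any port, trailing dot and upper case.
    hostname = host.split(":")[0].rstrip(".").lower()
    if hostname == BARE_HOST:
        return 301, "http://" + CANONICAL_HOST + path
    return None


# "/articles/" is a made-up path used only for the demonstration.
print(canonical_redirect("performbettergolf.com", "/articles/"))
print(canonical_redirect("www.performbettergolf.com", "/articles/"))
```

The point of using 301 rather than a temporary redirect is that it tells crawlers the www address is the permanent one, so both hostnames should end up treated as a single site.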
  #6 (permalink)  
Old 05-29-2006, 03:25 PM
WebProWorld Pro
 

Join Date: Oct 2004
Posts: 117
stretch dog RepRank 0

Interesting... that is the Jan 29th cache of the home page. Before they dropped it back in mid-April, it was being cached current within 5 to 10 days, and Google was grabbing updates and new pages we published, etc.

Your search is for performbettergolf, and the page comes up a few from the top out of more than 15,000 results (every one of which is in fact relevant to the site in question)... meanwhile if you search performbettergolf.com, you get nothing... so it has the Jan 29th cache of the page for the term "performbettergolf" but not for "performbettergolf.com"... which is just plain screwed up... I mean, either the page is there or the page isn't there... right?

As for the redirect... the site owner gave me access to his GoDaddy interface and I tried to do the forward with what GoDaddy had available there... the site went down completely within hours (completely blank, no 404, nothing), so I promptly reversed what I had done.

I set up an htaccess with the redirect and emailed it to them... they said they didn't do "custom code". It was just a straightforward 301 I sent them.

The site owner has since talked to them on more than one occasion about having them do a 301 redirect of performbettergolf.com to http://www.performbettergolf.com/ and the guys at GoDaddy responded with emails that demonstrate they either don't know what a 301 redirect is... lol.

Or they simply do not want to be bothered with it... either way I wasn't very impressed, given that it is something so simple, and unfortunately it's simply out of my control.

Having said this... before the site was dropped by Google, it was not having "www/non-www" canonical issues... only the www versions of the pages were being cached, and all non-www pages were in fact showing the cache of the equivalent www page.

Lots to think about, thanks for the input... perhaps if the problems we are experiencing are in fact a result of issues with the site itself... maybe it is a combination of a bunch of little things that has simply put it "over the brink" of Google's algorithms... either way, a shame, because it's a good site that now cannot be found even for "long tail" terms in Google.

SD