Wanted to do a quick summary of the mistakes I've made over the last few months relating to duplicate content.
Here's my hit list of things to avoid.
1. URL confusion. Mixture of yourdomain.com and www.yourdomain.com.
2. Internal links that mix www.yourdomain.com/anypage and .../anypage.
3. When using session ids (e.g. for a member login page), links can be generated with the session id as a suffix. This can result, and in my case has resulted, in two copies of the linked page in the index.
4. Old, pre-launch copies of your site. In my case this is a copy sitting on some free web space. It is no longer indexed by Yahoo!, but Google still indexes it.
In my view, the effects of these problems are as follows:
1. Duplicate content penalty.
2. Poor quality signal.
3. Duplicate content penalty.
4. Duplicate content penalty.
If you have all these issues, as I have/had, then you really need to get them sorted, because you're going to be treated as a spammer.
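For what it's worth, here is a rough sketch of how issues 1 to 3 could be tidied up when generating or checking links. It is only an illustration, not a definitive fix: the host name, the session parameter names and the function names are all assumptions of mine, so swap in whatever your own site actually uses.

```python
from urllib.parse import urlsplit, urlunsplit, urljoin, parse_qsl, urlencode

# Assumed values for illustration only; adjust to your own site.
CANONICAL_HOST = "www.yourdomain.com"
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}  # hypothetical parameter names


def canonical_url(url):
    """Return one consistent form of a URL on this site."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Issue 1: fold the bare domain into the www host.
    if host == CANONICAL_HOST[len("www."):]:
        host = CANONICAL_HOST
    # Issue 3: drop session id parameters so each page has a single URL.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, host, parts.path or "/", urlencode(query), parts.fragment)
    )


def canonical_link(page_url, href):
    """Issue 2: resolve a relative or absolute internal link, then normalise it."""
    return canonical_url(urljoin(page_url, href))
```

Running every internal link through something like canonical_link before it is written into a page means each page is only ever linked under one URL. It does not replace a server-side 301 from the non-www host to the www host, which is still needed for links you do not control.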
BTW, has anyone got an answer for me on no. 4? I no longer have access to the FTP or web space for the free hosted pages, so I can't do a 301 redirect. I would love to get this old test copy out of Google's index.
pne