Canonical Domain Question



PhilipDunn
10-01-2011, 01:17 PM
I've read many threads about the importance of making sure that PR flows to a canonical URL, and of ensuring that you are not dividing it among URLs. My question is, exactly how do you do that?
I now have a site that has an index.htm and an index.html. The main domain, blank.com, defaults to index.html. But if you click on the home page tab, you are taken to blank.com/index.htm.

How is the PR being handled?

I think I've heard that blank.com and blank.com/index.whatever are different pages to search engines.

I've made a few changes already, but would like opinions on how to handle this.

thanks,

weegillis
10-01-2011, 01:56 PM
Canonical tags are intended to funnel multiple dynamic URLs to a single, fixed URL. Say, for instance, you have several hundred dynamic pages all leading from /directory/?id=xxxx. These URLs all have /directory/index.php as a common root page, so they are splitting the PR over hundreds of URLs, rather than the one.

In the case of static pages, canonical tags are of little use. For this, I would recommend server redirects to funnel all possible configurations of the host URL down to one, http://www.example.org/.
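
A minimal sketch of such a host redirect, assuming an Apache server with mod_rewrite enabled, an .htaccess file in the site root, and www.example.org as the preferred host (all of these are assumptions, not details given in this thread):

```apache
# Sketch only: assumes Apache + mod_rewrite and that www.example.org
# is the host name you want everything funnelled to.
RewriteEngine On

# Act on any request whose Host header is not the preferred host name...
RewriteCond %{HTTP_HOST} !^www\.example\.org$ [NC]
# ...but leave requests that carry no Host header at all untouched.
RewriteCond %{HTTP_HOST} !^$
# Send a permanent (301) redirect to the same path on the preferred host.
RewriteRule ^/?(.*)$ http://www.example.org/$1 [R=301,L]
```

Requests for example.org/page.htm would then be answered with a 301 pointing at www.example.org/page.htm, so only one host name ever serves content.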

PhilipDunn
10-01-2011, 05:31 PM
Thanks for clearing this up, weegillis. So would you be saying to redirect the index pages to the main domain, blank.com?

deepsand
10-01-2011, 05:43 PM
Why two index pages?

The ideal solution is to get rid of one.

PhilipDunn
10-01-2011, 05:48 PM
That is actually what I did. I don't know why two were created. I was just wondering if that is going to affect rankings, since the home page has been using the index.html page for a few years. Both pages, .html and .htm, are identical. Now the .htm page is being used as the index.

weegillis
10-01-2011, 05:53 PM
The redirect would be applied only on the host domain, and its index.xxxx page:

http://../
http://../index.xxxx
http://www. .. /index.xxxx

would all be redirected to http://www. .. /

As for the placing of canonical tags, they would only be applied to the root page of the dynamic catalogue.

E.g., this page is always served up for all requests, regardless of the query string. In order to keep the dynamic pages from leaking PR, and more importantly to prevent indexing multiple pages that are the same thing except for a line or two, the tag is added:



<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Canonical tag</title>
..
<link href="/css/webform.css" rel="stylesheet">
<link href="http://www.example.org/courses/register.php" rel="canonical">
<link href="/favicon.ico" rel="icon" type="image/vnd.microsoft.icon">
<link href="/favicon.png" rel="icon" type="image/png">
..
</head>

PhilipDunn
10-01-2011, 06:02 PM
thanks weeg...

weegillis
10-01-2011, 06:40 PM
[QUOTE=PhilipDunn]That is actually what I did. I don't know why there were two created. I was just wondering if that is going to affect rankings, since the home page has been using the index.html page for a few years. Both pages, .html and .htm are identical. Now the .htm page is being used as the index.[/QUOTE]

If the HTML page has been used for years, then it is probably in the index. Switching to HTM now, even if that page does exist, would not be a good idea, especially if it is NOT in the index. Canonicalization of your site's root, and of folder roots, will clear up this inconsistency anyway, as the folder root will always open, regardless of the index file's extension. Just be sure that you don't have multiple index pages with different extensions, or the server will serve whichever one comes first in its configured index order (commonly index.html or .htm ahead of .php, etc.).
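
On Apache, that index order comes from the DirectoryIndex directive. A short sketch, assuming Apache with mod_dir and a host that allows this directive in .htaccess (neither is confirmed for the site discussed here):

```apache
# Sketch only: for a request that ends in "/", Apache serves the first
# file in this list that actually exists in the requested folder.
DirectoryIndex index.html index.htm index.php
```

With this order, a folder holding both index.html and index.htm would always answer with index.html, which is why keeping a single index file per folder avoids surprises.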

Redirecting all possible URLs for a single resource into one funnel is the ideal way to ensure that everyone sees the identical content, and that all the PR channels into the one URI. This is more or less what is meant by the term canonicalization. In normal circumstances it is applied through server redirects. In special instances, such as when you don't want to deny access to an actual URL, the canonical tag may be used. It should be noted, THIS IS NOT A REDIRECT.

PhilipDunn
10-01-2011, 07:00 PM
I appreciate all your efforts here Weegillis,

I guess this takes me back to my original question: how do you canonicalize your site's root when there are two index pages in the main directory and it defaults to the index.html page over the index.htm?

On another note, it appears to no longer be an issue for this site, as I've just discovered that index.html isn't even in the index. Only the index.htm can be found in Google.

deepsand
10-01-2011, 07:58 PM
In the <head> of the page that you do not want indexed under its own name, put the following:


<link rel="canonical" href="Desired_URL_Here" />

For example, if you wanted both instances of your index pages to be treated as index.htm, then put this into the <head> of index.html:


<link rel="canonical" href="http://www.Your_Domain_Name.TLD/index.htm" />

The same can be used to force variations with and without the "www" prefix to be indexed according to your preference.

weegillis
10-01-2011, 09:35 PM
[QUOTE=PhilipDunn]I appreciate all your efforts here Weegillis,

I guess this takes me back to my original question, how do you canonicalize your site's root? When there are two index pages in the main directory and it defaults to the index.html page over the index.htm.

On another note, it appears to no longer be an issue for this site, as I've just discovered that index.html isn't even in the index. Only the index.htm can be found in google.[/QUOTE]

In his earlier post, deepsand asked, 'why two index pages?' I guess the question could be re-asked as, "how did this come to be?" However, the answers are now moot, as you've discovered which one can be done away with. But... be sure your server redirects are working correctly, just in case you've got inbounds on the .html page.

Now just set the redirect for the root, and don't name the index page, at all. Then it won't matter which extension you use.
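
One way this could look, as a sketch only: it assumes Apache with mod_rewrite, an .htaccess file in the site root, and www.example.org standing in for the real host. The pattern index\.[a-z0-9]+ is an illustrative choice that matches any index file extension, so it would also catch files such as index.xml if they exist.

```apache
# Sketch only: redirect any explicitly requested index.<ext> to its folder root.
RewriteEngine On

# THE_REQUEST holds the raw request line, e.g. "GET /index.htm HTTP/1.1".
# Testing it means the rule fires only when the visitor asked for the index
# file by name, not when the server picks it internally for "/", which
# avoids a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]*/)?index\.[a-z0-9]+[?\s] [NC]
# $1 is the folder part of the path (empty for the site root).
RewriteRule ^(.*/)?index\.[a-z0-9]+$ http://www.example.org/$1 [R=301,L,NC]
```

Internal links should then point at the folder root (for example, the home page tab linking to / rather than /index.htm), so the redirect only has to catch old inbound links and bookmarks.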

If you have dynamic URLs that you want in the index, do not apply the canonical tag to the root (script) page, else they will never get indexed. Followed, yes; indexed, no. Remember, the tag is not a redirect, nor is it a robots tag. It is not universally supported. Google honours it, and I suspect the other major SEs do also. But it is not a standard, by any means.

weegillis
10-01-2011, 09:49 PM
[QUOTE=deepsand]In the <head> of the page that you do not want indexed under its own name, put the following:

<link rel="canonical" href="Desired_URL_Here" />

For example, if you wanted both instances of your index pages to be treated as index.htm, then put this into the <head> of index.html:

<link rel="canonical" href="http://www.Your_Domain_Name.TLD/index.htm" />

The same can be used to force variations with and without the "www" prefix to be indexed according to your preference.[/QUOTE]

Glad you added this, @deepsand. It's one aspect that I failed to mention.

PhilipDunn
10-01-2011, 10:19 PM
Excellent, thanks. That is the code I was looking for, Deep...

I have no idea why there are pages with both extensions, and not just on the index. But another great point, Weegi. I've added the redirects to pick up any IBLs and probably a few bookmarks too.

SinghVineet
10-03-2011, 03:15 AM
I would suggest using a 301 redirect over the rel=canonical tag, as suggested by Matt Cutts in the following video:

http://www.youtube.com/watch?v=zW5UL3lzBOA

PhilipDunn
10-03-2011, 11:47 AM
I did use the 301 redirect, though I couldn't figure out how to get it to work without naming the page. Good video...

deepsand
10-03-2011, 03:45 PM
[QUOTE=SinghVineet]I will suggest to use 301 redirect over rel=canonical tag as suggested by Matt Cutts in the following video[/QUOTE]

The OP's matter was not the classic case of a moved/renamed page, but one of duplicate pages with different extensions.

monstercoder
10-03-2011, 06:01 PM
We have this in all of our .htaccess files for all of our sites, but it's only for the home page:

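#### Make index.html, index.php and index.htm go to / ####
# Requires mod_rewrite; drop this line if the engine is already switched on above.
RewriteEngine On
# THE_REQUEST is the raw request line, so each rule fires only when the
# visitor asked for the index file by name (this avoids a redirect loop).
RewriteCond %{THE_REQUEST} ^.*/index\.html
RewriteRule ^(.*/)?index\.html$ http://www.your-domain-goes-here.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^.*/index\.php
RewriteRule ^(.*/)?index\.php$ http://www.your-domain-goes-here.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^.*/index\.htm
RewriteRule ^(.*/)?index\.htm$ http://www.your-domain-goes-here.com/$1 [R=301,L]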