Is it smart to expect more, for $43.05 a year?
You get what you pay for.
So I just called them and they said that yes, their $43.05 shared hosting supports htaccess files. They checked a sample of the htaccess we had written and said it needed fixing. So the mistake is ours, not theirs... (It's against their policy to rewrite code supplied by the client, otherwise the guy would have fixed it for me.)
For the sake of Wige's compilation of solutions to canonicalization problems, if ever we come up with that file and it works, we'll copy it here.
Thanks to all for your help.
Hello Wige, it's me again, back from Google hell with total forgiveness of all my sins...
We put this into an htaccess file outside of the secure folder:
RewriteEngine On
RewriteCond %{SERVER_PORT} !80
RewriteRule ^(.*)$ http://www.spauno.com/$1 [R,L]
We put this into another htaccess file inside the secure folder:
RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteCond %{REQUEST_URI} secure
RewriteRule ^(.*)$ https://www.spauno.com/secure/$1 [R,L]
You can now navigate back and forth between secure and unsecured content without creating duplicate content and Google has reindexed the pages. So to be able to say that this works for sites in shared hosting, specifically GoDaddy Linux hosting, all we need is Wige's blessing.
Looks good to me, and it should work in most Linux setups. The only thing I would change is from
RewriteCond %{REQUEST_URI} secure
to
RewriteCond %{REQUEST_URI} ^/secure/
That should prevent another page on the site whose filename includes the word secure from triggering an endless loop of redirects.
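To illustrate the difference between the two conditions, here is a minimal Python sketch. It assumes Python's re.search behaves like Apache's unanchored RewriteCond matching for these simple patterns, and the sample request URIs are made up for illustration.

```python
import re

# The two RewriteCond patterns discussed above.
LOOSE = re.compile(r"secure")      # matches anywhere in the request URI
STRICT = re.compile(r"^/secure/")  # matches only paths inside /secure/

# Hypothetical request URIs, not taken from the real site.
SAMPLE_URIS = ["/secure/checkout.php", "/secure-shopping-tips.html", "/about.html"]


def matching_uris(pattern, uris):
    """Return the URIs that satisfy the condition."""
    return [uri for uri in uris if pattern.search(uri)]


print(matching_uris(LOOSE, SAMPLE_URIS))
print(matching_uris(STRICT, SAMPLE_URIS))
```

The loose pattern also picks up the page that merely has "secure" in its filename, while the anchored pattern only accepts paths that start with /secure/.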
The best way to learn anything, is to question everything.
WigeDev - Freelance web and software development
Does Flash content create canonical issues?
Wige, my apologies for responding so late. Looks like I lost track of the thread.
Well, I added that option with the trailing slash, since it is technically different than without it.
SEO advice: url canonicalization
Also, many try to get IBLs with the trailing slash in their URLs, like http://www.justanexample.com/, aiming to flow the PageRank only to their homepage. As you have probably noticed, many web directories forbid that already. Besides, if you have such IBLs, do you exclude the possibility that Google or another SE will see that as a canonical issue?
I do exclude that possibility, because that is not how the spiders work. In order for a spider (or a web browser or any other system) to request a file, it breaks the link into its parts: the protocol, the hostname (which will be either a domain name or IP address) and a request URI, which always starts with a "/". So, to give a few examples of how a spider looks at different links:
http://domain.tld/somepage.html
Spider sees
Protocol: http
Hostname: domain.tld
Request URI: /somepage.html
http://domain.tld/
Spider sees
Protocol: http
Hostname: domain.tld
Request URI: /
http://domain.tld
Spider sees
Protocol: http
Hostname: domain.tld
Request URI: /
(The request URI can NEVER be blank, and MUST ALWAYS start with a /, so if no URI is included in the URL, a slash is used by default.)
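The three examples above can be reproduced with a short Python sketch. It uses the standard library's urlsplit; the function name split_like_a_client is made up for this illustration, and the sketch is not a description of how any particular search engine is implemented.

```python
from urllib.parse import urlsplit


def split_like_a_client(url):
    """Split a URL into (protocol, hostname, request URI).

    urlsplit returns an empty path for "http://domain.tld", so the
    default "/" is filled in here, as an HTTP client does before sending.
    """
    parts = urlsplit(url)
    request_uri = parts.path or "/"
    if parts.query:
        request_uri += "?" + parts.query
    return parts.scheme, parts.hostname, request_uri


for url in ("http://domain.tld/somepage.html", "http://domain.tld/", "http://domain.tld"):
    print(url, "->", split_like_a_client(url))
```

The last two URLs come out as the same triple, which is the point made above: with or without the trailing slash, the request URI is "/".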
Even when the spider stores the information about the retrieved page in the index (think giant relational database), that slash is always added, simply because the field containing the request URI can't be blank. The same happens when storing a list of links: the slash is added if not already present.
Specifying a leading slash at the beginning of the Request URI is also expected by the server. If a request reaches your server without that leading slash, the server may simply give a bad request message, or ignore the request, depending on how your server is set up.
I have also gotten indications from Google, Yahoo and MSN that their systems always add a leading slash if it is not already present, as does every malbot and spider system I have ever worked with. Even wget, which was the foundation for many spiders, automatically adds the slash. It is simply a default part of the HTTP protocol.
Regarding the link you mentioned, I take it you are referring to the following:
Originally Posted by Matt Cutts
I have seen this used in numerous subsequent articles on canonicalization, used as the basis of an argument for taking steps to handle missing slashes. However, what he was highlighting was the absence of the subdomain in the second version, not the presence of the slash. In the rest of the article, he makes no mention of the slash at all, which leads me to believe this was only a typo. It was addressed in the comments, where Matt suggested selecting a preferred format for links, but it was not addressed beyond that.
Above all, it is important to remember that, to a server, requests for www.example.com and www.example.com/ look identical (the request line is GET / and the Host header is www.example.com), so anything you do on the server to redirect from one to the other is pointless anyway.
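As a rough sketch of why the two forms are indistinguishable on the server side, the following Python builds the text of a minimal HTTP/1.1 request for each. The helper build_get_request is hypothetical and omits the other headers a real client would send.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Build the text of a minimal HTTP/1.1 GET request for a URL."""
    # Prepend a scheme when it is missing so urlsplit can find the hostname.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is sent as "/"
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


without_slash = build_get_request("www.example.com")
with_slash = build_get_request("www.example.com/")
print(without_slash == with_slash)
print(repr(with_slash))
```

Both calls produce the same bytes, so there is nothing in the request that a redirect rule could use to tell the two links apart.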
Originally Posted by wige
If you want to add a trailing slash / if no file name is specified (domain.com/file becomes domain.com/file/) use the following:
Code:
RedirectMatch 301 ^/([a-zA-Z0-9/]*)$ http://domain.com/$1/
"domain.com/file/" makes little sense to me, because the trailing slash in a URL indicates the index of the directory (see the Apache DirectoryIndex directive).
The simplest use of a webserver is to point it to a directory tree and let it serve the files there. One typical convention is that you specify a default file extension, for example .html, so that domain.com/abc serves domain.com/abc.html. The second common convention is that domain.com/edf/ shows the list of files available, unless the DirectoryIndex directive (or the equivalent for non-Apache servers) is set and the specified file is present, in which case domain.com/edf/ actually serves domain.com/edf/index.html.
To clarify, when I say serve, I mean it returns the specified content and not a redirect. This is evident from the URL in the browser's address bar not changing.
In the context of this thread, this can lead to duplicate content, as for example domain.com/edf/ returns the same content as domain.com/edf/index.html. (However, if no one ever links to .../index.html, the search engine would never discover the URL.)
By the way, your redirect script would also change domain.com/abc.html into domain.com/abc.html/.
K<o>
It shouldn't, but I will take another look. The pattern I am using:
RedirectMatch 301 ^/([a-zA-Z0-9/]*)$ http://domain.com/$1/
contains ([a-zA-Z0-9/]*), which should only match if the string does not contain a period (.). As a result, if a file extension is specified, the user should not be redirected. (The request URI must contain only the characters shown in brackets for the user to be redirected.)
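One way to check this is to run the same pattern against a few paths in Python. This is only a sketch: it assumes Python's re module and Apache's regex engine agree on a pattern this simple, and redirect_target is a made-up helper that mimics the substitution, not Apache itself.

```python
import re

# The pattern from the RedirectMatch line above.
PATTERN = re.compile(r"^/([a-zA-Z0-9/]*)$")


def redirect_target(path):
    """Return the redirect URL for a request path, or None if no redirect."""
    match = PATTERN.match(path)
    if match is None:
        return None
    return "http://domain.com/" + match.group(1) + "/"


for path in ("/file", "/abc.html", "/file/", "/"):
    print(path, "->", redirect_target(path))
```

In this sketch /abc.html is left alone, which agrees with the explanation above. Note that /file/ and / also match the pattern, so the rule may need an extra check for paths that already end in a slash.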
This is really intended to counteract a server setting that lets the server respond to requests that omit the trailing slash as though the slash were there. It also removes a possible error condition that would otherwise need to be handled with a 404, by redirecting the user to what is most likely the desired location.