View Full Version : Search Engines Crawling HTML Forms?
cppgenius
02-10-2007, 06:26 PM
Do search engines crawl HTML forms? I know most search engines have a problem with dynamic content, and an HTML form usually contains dynamic content.
I have created a set of pages that have to be visited in a specific sequence. If you visit a specific page in the sequence directly it takes you to a disclaimer page forcing the visitor to follow the pre-defined sequence. The starting page is at http://www.cybertopcops.com/online-threat-simulations.php
I use the following redirecting PHP code in the pages to redirect each page back to the disclaimer page if it was referred from the wrong location.
header("Location: online-threat-simulations.php");
This will cause a 302 (moved temporarily) redirect, right? I placed the navigation system in a form on each page in the hope that search engines will not follow these pages and end up in a loop going from each page back to the disclaimer page. I also added the NOFOLLOW, NOINDEX tag to each of the internal pages of the sequence, but that doesn't really help because the redirect is executed before the spiders read these tags.
Is there a better and safer way of doing this?
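For reference, a minimal sketch of the referrer check described above, assuming PHP 7.1 or later. The helper names and the idea of passing the expected previous page as a file name are assumptions made for illustration, not code from the site. The third argument to header() sets the status code explicitly. PHP already answers a Location header with a 302 unless a 201 or 3xx status has been set, so it only makes the intent visible.

```php
<?php
// Hypothetical helper: true when the Referer header points at the expected previous page.
function came_from_expected_page(?string $referer, string $expectedPage): bool
{
    if ($referer === null || $referer === '') {
        return false; // direct visit, or the client sent no Referer
    }
    $path = parse_url($referer, PHP_URL_PATH);
    return is_string($path) && basename($path) === $expectedPage;
}

// Hypothetical guard for the top of each page in the sequence.
function enforce_sequence(string $expectedPage): void
{
    if (!came_from_expected_page($_SERVER['HTTP_REFERER'] ?? null, $expectedPage)) {
        header('Location: online-threat-simulations.php', true, 302);
        exit; // stop here so the rest of the page is not sent
    }
}
```

Each page in the sequence would call enforce_sequence() with the name of the page that should precede it, before producing any output. The Referer header is optional and easy to forge, so this steers navigation rather than enforcing access.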
corporateface
02-12-2007, 05:27 PM
I believe so,
I have had search results from information that resides solely in drop down boxes for example.
craigmn3
02-12-2007, 05:47 PM
Unless you are pointing your form at a CGI (or other) script, the hyperlinks in your forms are code, and code is readable by the search engine.
What I perceive you want to do is keep anyone out of your site unless they agree to your policy. You can achieve this with a simple enough script.
Have the customer enter a code to proceed, and that code will be the name of the webpage they will go to.
For example:
the code is 123456
The webpage is www.mysite.com/123456.html
That way there is no path, or even a link in the code, for spiders to follow. The coding is pretty simple.
You can see it in action at http://www.legacypoolsusa.com/dealerportal.html
This is the SIMPLEST way to restrict navigation that I know of.
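A rough sketch of the code-to-page idea in PHP. How the linked page implements it is not shown in the thread, so the function name, the http://www.mysite.com/ base URL, and the alphanumeric-only rule are all assumptions for illustration.

```php
<?php
// Hypothetical helper: turns an entered access code into the URL of the page it names.
// Returns null for anything that is not purely alphanumeric, so the input cannot alter the path.
function page_for_code(string $code, string $baseUrl = 'http://www.mysite.com/'): ?string
{
    $code = trim($code);
    if (preg_match('/^[A-Za-z0-9]+$/', $code) !== 1) {
        return null;
    }
    return rtrim($baseUrl, '/') . '/' . $code . '.html';
}
```

With the example above, page_for_code('123456') yields http://www.mysite.com/123456.html. The page itself is still reachable by anyone who knows or guesses the URL, so this hides the path from spiders rather than protecting the content.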
corporateface
02-12-2007, 05:52 PM
PS ...
Just a mention on layout...
Move your google ad out of the top.
It looks like it is your main navigation.
It was the first thing that I clicked, and I left your site of course and went to a list of other websites in Google.
FYI
After the redirect header, add an echo line with a reason for the redirect and a link to the page the user is being redirected to (in case their browser doesn't do automatic redirects), then an exit line. This will prevent the rest of the page from being displayed to the spider.
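A minimal sketch of that header, echo, exit pattern, assuming PHP 7.1 or later. The function names and the wording of the fallback message are made up for illustration.

```php
<?php
// Hypothetical helper: builds the short body shown when a client ignores the Location header.
function redirect_fallback_html(string $target, string $reason): string
{
    $safeTarget = htmlspecialchars($target, ENT_QUOTES, 'UTF-8');
    $safeReason = htmlspecialchars($reason, ENT_QUOTES, 'UTF-8');
    return sprintf(
        '<p>%s <a href="%s">Continue to the disclaimer page</a>.</p>',
        $safeReason,
        $safeTarget
    );
}

// Hypothetical helper: sends the redirect, prints the fallback, and stops the script.
function redirect_with_fallback(string $target, string $reason): void
{
    header('Location: ' . $target, true, 302);
    echo redirect_fallback_html($target, $reason);
    exit; // nothing after this point is sent to the browser or spider
}
```

A page in the sequence would call redirect_with_fallback('online-threat-simulations.php', 'Please start at the disclaimer page.') before any other output, since header() only works while no output has been sent yet.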
Matt Kelly
02-12-2007, 06:39 PM
I agree with corporateface and did the same thing...
lunartcorp
02-13-2007, 09:25 AM
PS ...
Move your google ad out of the top.
It looks like it is your main navigation.
This is a trick to make users believe that it is part of the main navigation and provoke them to click on it. It works only on useless/clickable websites, where webmasters do not mind sending visitors away to Google ads, which represent other websites. If this site is going to have good content and the intention is to retain visitors, the Google ads should be removed from the top.
cppgenius
02-13-2007, 04:06 PM
Firstly I want to thank you all for the diplomatic way you showed me that I need to remove the link unit from the top. I realised that this is confusing to my visitors and you might classify it as an unethical way to generate ad revenue. Because Cyber Top Cops delivers its services free of charge I need all the ad revenue I can get to keep the site running, but you made me realise that I should not do it at the cost of my visitors. Too bad I had to get a wakeup call from other people before realising it myself, but I guess it is better to receive the wakeup call from your fellow forum members than your visitors.
wige, I am under the impression that the redirect done by the PHP script will not allow your browser to parse the document beyond that line. The redirect is done on server side, so I don't have to worry about browsers not supporting this (client side processing is not a factor here, it is only a meta refresh tag that is not supported by older browsers).
The only thing I'm afraid of is causing the spider to go into a loop and that will most definitely cause some damage to my rankings. However at the time of making the post I never thought of adding a NOFOLLOW tag to my disclaimer page. So I added the tag and now I no longer have to worry about infinite loops, because the spider will not follow the links on the disclaimer page, so crawling is supposed to stop at this page.
craigmn3, the method you explained makes a lot of sense and is certainly worth a try.
The navigation is sorted and does what I want it to do. I have added the NOFOLLOW tag to the disclaimer page and NOINDEX,NOFOLLOW to each page in the sequence (in case the spiders crawl through the sequence), so it is now clear to the SE spiders what to do with these pages. Do you think it is safe to assume that this will not have a negative effect on my search engine rankings?
wige, I am under the impression that the redirect done by the PHP script will not allow your browser to parse the document beyond that line. The redirect is done on server side, so I don't have to worry about browsers not supporting this (client side processing is not a factor here, it is only a meta refresh tag that is not supported by older browsers).
The only thing I'm afraid of is causing the spider to go into a loop and that will most definitely cause some damage to my rankings. However at the time of making the post I never thought of adding a NOFOLLOW tag to my disclaimer page. So I added the tag and now I no longer have to worry about infinite loops, because the spider will not follow the links on the disclaimer page, so crawling is supposed to stop at this page.
In PHP, the header() function only adds a line of text to the header portion of the server's response to the browser. The redirect is not done on the server side; it is done by the browser. Some text browsers and other specialized clients do not always obey the Location header. Either way, the full content of the page is still sent, because your PHP script continues to execute normally after the header() call.
cppgenius
02-14-2007, 03:19 PM
wige, now I see what you mean. Adding the echo line (giving the user the reason for redirecting) followed by an exit line will prevent the rest of the page from being shown, and should the browser not support this feature, it will still force the user to go through my disclaimer page. Great stuff, thanks!