I just had to share this



carbonize
02-19-2004, 04:09 PM
After three months without a single visit from Googlebot I decided to email Google and express my concerns. Here, starting with my initial email, is the conversation.

Hi,
Googlebot used to visit my site, www.carbonize.co.uk, at least once a day sometimes more. But since December I haven't had a single visit by googlebot and then when I find my site in a google search it is just the URL, no information. Have I been blacklisted for some reason?

Best regards,
Stewart Souter


Hi Stewart,

Thank you for your note. Please be assured that your site is not currently penalized by Google. As you may know, results in our index change regularly based on ongoing, automated processes aimed at improving the quality and content of our search results.

We realize these changes can be confusing. However, these processes are completely automated and not indicative of wrong-doing or penalization of individual sites. We currently include over three billion pages in our index and it is certainly our intent to represent the content of the Internet fairly and accurately. The ongoing changes you have observed are part of this effort.

While we cannot guarantee that your page will consistently appear in our index or appear with a particular rank, we do offer guidelines for building a "crawler-friendly" site. You can find these guidelines at
http://www.google.com/webmasters/guidelines.html . Following these recommendations may increase the likelihood that your site will show up consistently in Google search results.

We hope the information we have provided above is helpful to you. If this response did not adequately resolve your question, we hope that you will visit the webmaster section of our site at http://www.google.com/webmasters/ . In an effort to better address your needs, we've dedicated this entire section of our website to answering common webmaster questions, listing Google's quality requirements and recommendations, and much more.

Of course, we cannot anticipate and answer everyone's questions on our website. Realizing this, we have also created a Google discussion group on Google Groups where Google users and webmasters can connect to share their vast knowledge and experience. You can access this group at http://groups.google.com/groups?q=google.public.support.general . If you have already checked the webmaster section of our website and haven't found an answer, we encourage you to post your question to the Google Support group.

Regards,
The Google Team

To which I replied

Hi,

My site was revamped about a month ago and is now search engine friendly, as it uses divs and CSS, but this will not help my listing if your bots never visit. I just can't understand why your spiders suddenly stopped visiting my site. Three months and not a single hit from one of your spiders.

Best regards,
Stewart

And their answer?

Hi Stewart,

Thank you for your reply. While we understand your concern, please be assured that the changes you have observed are consistent with regular fluctuations in our search results. If your site complies with our guidelines, it is likely that our robots will visit it again in the near future.

Regards,
The Google Team

Do a search for carbonize on google and you will see results from forums, be it a post I have made or my profile, all of which will have a link to my site. There is also a result from my friend Freddy's site which contains numerous links to me and yet Googlebot still doesn't visit. Am I wrong to feel that I have been blacklisted?

Jakpot
02-19-2004, 06:08 PM
75% of my pages show just URLs, and I got the same non-responsive reply.

paulhiles
02-19-2004, 08:45 PM
That is very strange Carbonize... now the number 1 slot (for carbonize, that is) is filled by a Canadian web development company. I personally felt you'd made great strides towards greater exposure.. with all the changes you'd made to your site.. a shift that I was going to emulate myself... it's all rather worrying!

There's no doubting that the first reply you received was virtually a system-generated auto-response that would be fired off to any query.. but the subsequent correspondence seemed to provide scant comfort!

Keep us posted as to your progress,

Paul

carbonize
02-19-2004, 09:01 PM
To make matters worse I use Google's AdSense. Now they claim that AdSense uses different spiders than the search engine, but anybody visiting my site will have noticed that the AdSense ads are either Mac orientated or for charity type things. So it would appear that all factions of Google hate me.

I WANT MY MUMMY!!!!!

minstrel
02-20-2004, 12:34 AM
Any chance you're having server problems? I just tried to load your page in Internet Explorer 6 SP-1 FOUR times and each time it just kinda hangs. It doesn't give me an error 404 or anything - it just hangs with the loading bar at about 50% and nothing displayed in the browser...

Thinking it might be some sort of scripting thing, I tried to load http://www.carbonize.co.uk/robots.txt - same thing.

If that happens to googlebot, no wonder it doesn't want to come back...

Update: last attempt eventually returned "Cannot find server or DNS Error"...

simonm
02-20-2004, 04:34 AM
I just got the same problem: The page cannot be displayed

Seems to be the site, as pinging the server '217.204.37.4' and '217.204.37.3' ie NS1 and NS2 seems okay.

Good luck

Simon

paulhiles
02-20-2004, 09:42 AM
I just tried the site and it was fine, there may be some intermittent problem. Do you have a site monitoring service? Internetseer provide a basic service for free, they email regular reports regarding your site's uptime, and response times.

Paul

davebarnes
02-20-2004, 10:35 AM
carbonize,

This would be bad:
"All work is ©Carbonize unless otherwise credited.
You are using Opera 7.23 on Windows 2000
Page Created in 10.49434 seconds"

10+ seconds is bad when I am using a 1.5Mb/s DSL connection.

I saw your homepage hang waiting on awin1.com

,dave

Dragonsi
02-20-2004, 11:28 AM
Carbonize - I just tried viewing your site and my proxy server banned it with a code 403 - forbidden...

Do you use adult or other related keywords?

minstrel
02-20-2004, 12:42 PM
The site is back up now - and loaded in IE6 in under 2 seconds according to the script at the bottom of the page (in real time it was of course longer on my dial-up).

carbonize
02-20-2004, 12:45 PM
My server was down for about 3 hours this morning between 0100 and 0400 GMT. My host is crap but cheap, to be honest, and quite often the server's CPU time is stupid.

I just got Page Created in 0.358364 seconds so you must have caught it at a bad moment, possibly when the host was doing something with the logs like he does every hour.

Dragonsi, my work's connection has filters in place but my site gets through fine; the only page with problems would be the sex glossary. Nothing on my site is adult related except some smileys, a skin for Yahoo messenger and the sex glossary, none of which are on the index page.

Still, none of this would explain why Googlebot never visits. I'd be happy with once a month.

When I discovered in January that Google hadn't visited my site for a month, I resubmitted my site. I then resubmitted it in February. Still nothing.

carbonize
02-20-2004, 12:47 PM
I will add that there is one bug on my site that could in extreme circumstances be a problem. The sub menu is on the left and the code is after the content, so the submenu is way down the text when viewed as text only. I can't really see this being the problem as many sites link directly to some of my pages and I have many index pages.

DrTandem1
02-20-2004, 07:38 PM
carbonize-

I'm wondering if there is a problem with your coding that is causing the trouble. I'm not thrilled with your title tag, either:

<title>Carbonize.co.uk | Yahoo!, Visual Basic, Tutorials, Adult Smileys, and a whole lot more!</title>

I would remove the URL and Yahoo! and make it read more like plain English.

The site itself took forever to load. Almost like my browser was stuck trying to parse the code. I then looked at the code and you seem to have a lot of errors in it. The site appears okay, once it loads. Maybe I'm just not used to the XHTML look of closing tags on line breaks and input tags.

Anyway, I would clean up the title, move it to the first tag after the head tag. I would move the links to the style sheets and stuff to after the meta tags. Maybe add a robots meta tag, but that shouldn't matter.

I think the head section looks more like it's optimized for the search term "Yahoo!" than anything else. However, I think the real problem is whatever is causing the site to take so long to load. I'll research this further, unless someone else has any ideas.

TimH
02-20-2004, 08:04 PM
Dear Carbonized,

We used a cheap hosting service for about 1 month, and it almost destroyed our business. Plus it made me crazy every time a client would call or email us to tell us our site was down. I think you should find a new hosting service.

P.S. The site looks very nice, and it loaded fine for me.

stevealmond
02-20-2004, 08:32 PM
Hi Carbonize,

I've just done a search on google for www.carbonize.co.uk and you have 6 back links, 719 sites that contain the term "www.carbonize.co.uk", and you are in their index for the search term www.carbonize.co.uk.

However with a search for carbonize you don't appear in the top 20 ( I didn't go any further ). This problem is just an SEO problem and not a google problem.

Could you just have missed the visits from Googlebot to your site in your logs? Or maybe your webhost doesn't keep accurate logs. Anyway you are in the Google index, the rest is down to you.

Click this link (http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=www.carbonize.co.uk&btnG=Google+Search) to see your results.

One last thing: your home page loaded without any problem, so no problem there.

Regards

Steve

minstrel
02-20-2004, 08:46 PM
What really jumps off the page for me about all this is the variability in experiences trying to load the page:

"it loaded fast", "it loaded very slowly", "it stalled half way", "it didn't load at all", "I got a DNS or server error"...

is it the server? something about the coding? does the browser make a difference (I don't think so since depending on when I tried with IE6 I got ALL of the above)?

something very strange here...

and as for Google, how does this variability affect spidering?

DarrenC
02-20-2004, 09:07 PM
No problems for me loading, loaded in 0.32 seconds. Website looks fine with regard to loading and overall navigation of the website is fine.

The only section I am not too comfortable with is your META tags


<meta name="Description" content="Pages on everything from Adult smileys for Yahoo! to tutorials on various programming languages." />

Why the / after the " on all of the tags?

I disagree with the comment about it being an SEO problem. Yes, the website's optimisation needs looking at, especially the title, but this would not stop the bots visiting the site.

Have you tried this tool? Lynx Spider viewer: basically it shows you what the bots see when spidering your site.

http://www.delorie.com/web/lynxview.cgi?url=http%3A%2F%2Fwww.carbonize.co.uk

The interesting thing is that the bots don't appear to be picking up the title and description when you run this programme.

I'm starting to think it's because of the different css files you are using (the spiders don't read the css files but there could be some confusion with browser capability)

I don't know if this helps..

carbonize
02-20-2004, 10:16 PM
My site is valid XHTML 1.1 and CSS as confirmed by the W3C. The / at the end of the meta tags is how you tell the browser that that tag does not have a closing partner.

As for my logs not catching it I've already tested by making Firefox use the Googlebot useragent string and it logged it just fine. I use a PHP stats script as my host doesn't give me access to the logs. My script also logs all unknown browsers and bots.

As has been pointed out there is a plethora of sites that link to me and they are all listed on Google, and yet Googlebot never seems to follow those links. Yes, I am listed on Google but my entry is just the URL and the title; no data appears to be stored for my site.

My question to Google was why is my entry empty and why do their spiders never visit my site. I could understand that my site may have been down on a couple of visits but not all. As others have pointed out I'm not the only person who is experiencing this.

smakyyy
02-21-2004, 12:16 AM
WebProWorld :: Viewing profile
... Viewing profile :: carbonize, Joined: 31 Jul 2003. Rank: MVP, Total Posts: 762 Find
all posts by carbonize. User Information. ... Website: http://www.carbonize.co.uk. ...
www.webproworld.com/ profile.php?mode=viewprofile&u=3723 - 23k - Cached - Similar pages

That is in the search for your website on Google. You do not have the word carbonize prominently listed on your website -- it is there in the source but not in the text that the normal user will see. Maybe you should correct this. Also, you should try to get weblogs so you can analyze the logs to be sure whether Google is coming to you or not.

alienzhavelanded
02-21-2004, 12:50 AM
Carbonize, this has happened to me too, and the current theory is that I had a bad robots file. I've modified it and am still watching to see if there's any difference. If there isn't then I'm in the same boat as you are. My site appears the same way you describe, and it was once visited by Google religiously. After the FL update, this all began, and my rankings are nonexistent with them. I will post if there's a development.

Happy coding,
The Martian

minstrel
02-21-2004, 01:16 AM
In the case of your site, alienzhavelanded, there definitely was a problem with your robots.txt file.

And that raises an interesting point. We do know that googlebot will check for robots.txt on entry to a site - if it doesn't find one (i.e., an error 404 is returned), it proceeds to spider the site, since there are no instructions saying "keep out" or "do not index these parts of the site".

However, at the site carbonize has created, when I request the robots.txt, I don't get the file but I don't get an error 404 either. There seems to be some sort of script redirecting my request to the home page.

Question: how will googlebot handle this unusual event?
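One way to frame the question is as a decision on the HTTP status that the robots.txt request returns. Below is a minimal sketch, assuming the convention used by Python's standard urllib.robotparser module (401 or 403 means stay out, any other 4xx means no restrictions). The function name crawl_policy and the "retry_later" fallback are assumptions made for illustration, and nothing here is confirmed to be what Googlebot actually does.

```python
def crawl_policy(status_code: int) -> str:
    """Map the HTTP status of a robots.txt request to a crawl decision.

    Follows the convention in Python's urllib.robotparser. The
    "retry_later" fallback is an assumption made for this sketch.
    """
    if status_code in (401, 403):
        # Access denied: treat the whole site as off limits.
        return "disallow_all"
    if 400 <= status_code < 500:
        # File missing (e.g. 404): no instructions, so crawl freely.
        return "allow_all"
    if 200 <= status_code < 300:
        # A body was returned: parse it for Allow/Disallow rules.
        return "parse_rules"
    # Server errors and anything else: hypothetical "come back later".
    return "retry_later"


if __name__ == "__main__":
    for code in (200, 403, 404, 500):
        print(code, crawl_policy(code))
```

Under this convention a request that is redirected to the home page ends in a 200 with HTML in the body, so it lands in the "parse_rules" branch, while a 403 on robots.txt would keep a crawler that follows this policy away from the whole site.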

Mel
02-21-2004, 01:54 AM
Carbonize - for some reason google is having problems spidering your site. All of your pages listed in the Google index are of the partially indexed type, with a URL only; no title or description.

I am seeing some funny results from Google, which sometimes shows 711 pages in the index, sometimes 657 pages, and sometimes 171 pages, but I guess this could be different indexes on different datacenters.

The spidering problem:

When I spider your homepage I get a good result including a long list of links to other pages, but when I try to spider these links I get a 404 error.

The links look like this http://tutorials/c/

which of course is not a valid URL. It may be that you should set your php to generate fully qualified URLs as opposed to relative ones.

Viewing your page in a browser the above link looks like this:
http://www.carbonize.co.uk/Tutorials/C/
so it appears that there may be some quirks in the way that spiders are seeing your pages.

I suggest that you go to http://www.searchengineworld.com/cgi-bin/sim_spider.cgi and spider your homepage to see the problem.
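To illustrate how a relative link should turn into a full URL, here is a minimal sketch using Python's standard urllib.parse.urljoin. The helper name resolve_link and the /Blog/index.php page path are hypothetical, chosen only for the example; how any particular spider actually resolves links is not something this shows.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_link(page_url: str, href: str, base_href: Optional[str] = None) -> str:
    """Resolve a link the way a browser does.

    If the page declares a base href, relative links are resolved
    against it; otherwise they are resolved against the page's own URL.
    """
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)


if __name__ == "__main__":
    page = "http://www.carbonize.co.uk/Blog/index.php"  # hypothetical page
    # Without a base href the link is relative to the current directory.
    print(resolve_link(page, "Tutorials/C/"))
    # With a base href pointing at the site root it resolves from the root.
    print(resolve_link(page, "Tutorials/C/", "http://www.carbonize.co.uk/"))
```

A spider that ignores the base href, or joins the pieces some other way, will end up requesting a different URL from the one a browser shows, which is one possible source of the 404s.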

carbonize
02-21-2004, 08:49 AM
Hm now there's an idea. I use a base href tag to save me a lot of work on my links and images. I'll have a quick go at changing some. But this wouldn't stop googlebot hitting pages that are directly linked or my index pages.

As for the robots.txt it doesn't seem to stop other bots as they would just get the 404 message and treat it as not being there but I'll add one to be sure.

ERRRRRRRRRRRRRRRRR I've had 35 hits from Googlebot today. My question is is this really Googlebot or just you people testing it for me :| lol

carbonize
02-21-2004, 09:11 AM
I'm getting a 403 Forbidden message when I try to access my robots.txt. I have errors 400, 401, 403 and 404 all set to return you to the homepage. Why am I getting a 403?

minstrel
02-21-2004, 12:29 PM
carbonize, what I was saying above is that you do NOT get a 404 error when you try to access robots.txt - that would be true if the file didn't exist and spiders know how to handle that event. In your case, the file exists - so no error 404 - but the spider isn't allowed to read it because you redirect requests for that file. That was my question: how will googlebot handle this? I don't know the answer... How do other 'bots handle it? Again, no idea but they may not all react in the same way.

The 403 suggests something else - have you read-write protected this file or done something else unusual besides the redirect?

I would suggest as a starting point that you remove the redirect and any "protect" schemes from robots.txt and see if that solves the problem. I'm not sure why you feel a need to hide robots.txt anyway but it is certainly worth "unhiding" it to see if that makes googlebot happy.

If your intent is to hide certain files or directories from 'bots and prying eyes, you can (1) unhide robots.txt and use that file to disallow spidering of those files or directories; AND (2) use htaccess or some other means to redirect requests for those specific files, or limit the permissions for those files or directories, as well as using htaccess to prohibit getting directory listings on your site.
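As a rough illustration of option (1), and of what a robots.txt request that comes back as the home page looks like to a parser, here is a sketch using Python's standard urllib.robotparser. The /private/ directory, the example.com URLs and the allowed helper are made up for the example, and this shows only how Python's parser behaves, not how Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# A plain robots.txt that keeps all robots out of one directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# What a robot receives if the request is redirected to the home page.
HOMEPAGE_HTML = "<html><head><title>Home</title></head><body>Welcome</body></html>"


def allowed(robots_body: str, agent: str, url: str) -> bool:
    """Parse a robots.txt body and ask whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/private/notes.html"))
    print(allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"))
    print(allowed(HOMEPAGE_HTML, "Googlebot", "http://www.example.com/index.html"))
```

With this parser the HTML body contains no recognisable rules, so everything is allowed, and the disallow line blocks only the listed directory. Other robots may well be less forgiving about getting HTML where they expected a rules file.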

carbonize
02-21-2004, 04:07 PM
I telnetted into my server and checked the httpd.conf file. The host had blocked access to robots.txt. I have sent them a nice email asking them politely to turn access back on as they have no right to block this. I love my host, honest I do. I have turned off the redirect for 403 errors so that they get reported correctly.

Just checked and I now have had 58 hits from Googlebot today so either a lot of people are trying to see if there is a problem or Google finally got the message that I was a bit p****d off at them.

Like buses, none come along for three months then 58 all at once. My entry on the Google database still appears to be empty though. I'm wondering if my script's not confusing the AdSense bot with Googlebot. Anybody got the AdSense bot's user-agent?
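On telling the two crawlers apart in a stats script: the AdSense crawler identifies itself with the token Mediapartners-Google, while the search crawler uses Googlebot. A minimal sketch of the check follows, in Python rather than PHP; the classify_agent function, the BOT_TOKENS table and the labels are hypothetical, and the sample strings are only examples of the general form.

```python
# Tokens to look for, most specific first. A script that matches on
# the bare word "google" would lump both crawlers together.
BOT_TOKENS = (
    ("Mediapartners-Google", "adsense"),
    ("Googlebot", "googlebot"),
)


def classify_agent(user_agent: str) -> str:
    """Return a label for a user-agent string, or "other" if unknown."""
    ua = user_agent.lower()
    for token, label in BOT_TOKENS:
        if token.lower() in ua:
            return label
    return "other"


if __name__ == "__main__":
    samples = (
        "Googlebot/2.1 (+http://www.google.com/bot.html)",
        "Mediapartners-Google/2.1",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    )
    for sample in samples:
        print(classify_agent(sample), "<-", sample)
```

User-agent strings are trivial to fake, so a hit labelled as Googlebot this way could just as easily be a person testing with a changed browser string.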

Oh and I think the reason I may have lost the top slot for the term carbonize is because I used to have my blog put my monicker with each entry but I stopped that.

minstrel
02-21-2004, 04:38 PM
I telnetted into my server and checked the httpd.conf file. The host had blocked access to robots.txt. I have sent them a nice email asking them politely to turn access back on as they have no right to block this.
How very odd... why would they WANT to block that file?


Like buses, none come along for three months then 58 all at once.
I really hate when that happens - I have enough trouble trying to make up my mind which parking spot to use when there's more than one empty, or what to order in a restaurant - I'm sure if I had to decide about 58 buses I would just stand there in paralysis until they all left... ;o)

Nargule
02-21-2004, 05:24 PM
Carbonize,

I suggest finding a new webhost and fixing all the things people have pointed out to you.

I think that all these problems with your site not showing up for some people and it not being able to serve up webpages (hanging) have caused the google-bot to have a distaste for your site. Getting an error 404 page is one thing, but nothing at all would cause google-bot to hang, waiting for a response. This can slow things down a LOT and it is likely (actually, just a guess on my part) that google-bot has 'marked' your site as being unresponsive and therefore doesn't visit it anymore.

chris_jones
02-21-2004, 06:18 PM
What really jumps off the page for me about all this is the variability in experiences trying to load the page:

"it loaded fast", "it loaded very slowly", "it stalled half way", "it didn't load at all", "I got a DNS or server error"...

is it the server? something about the coding? does the browser make a difference (I don't think so since depending on when I tried with IE6 I got ALL of the above)?

I guess that's what you get for £20 a year hosting.

This is what was happening previously when our site was using post-nuke. I thought it was php-nuke itself (and maybe it was) but our former shared server was oversubscribed (overloaded) which was causing these sorts of random problems.

We eventually moved to another box. It might be worth looking into that or changing hosts.

deltatrend
02-21-2004, 06:31 PM
spiders don't read the css files

I know this is moving off topic, but this is a question I have often wondered about but never found discussed. If spiders don't read css files, then how can they really check for hidden text?

For example, I can set the background color of a cell to green, and then apply a style that sets the text to green as well. Maliciously or innocently, this would seem to get around the hidden text issue.

If anyone knows of a good resource, please let me know. At this point I am more concerned about errors using divs that could be picked by the spiders as hidden text, when it is not meant to be.

carbonize
02-21-2004, 06:39 PM
My google listing used to be just fine, and was for as long as I can remember. It's only since December that this trouble has started. Yes my server is over subscribed. I think there is something stupid like 500+ sites per server. Now would be an excellent time for me to change hosts as the dollar isn't worth the paper it's printed on so I can get American hosting for peanuts. But I only renewed my hosting in January and it's not like I'm a business site.

I'll continue to monitor the bots that hit my site and checking my entry in the google database. Altavista loves my site and visits about 50 times or more a day.

Mel
02-22-2004, 04:48 AM
spiders don't read the css files

I know this is moving off topic, but this is a question I have often wondered about but never found discussed. If spiders don't read css files, then how can they really check for hidden text?

For example, I can set the background color of a cell to green, and then apply a style that sets the text to green as well. Maliciously or innocently, this would seem to get around the hidden text issue.

If anyone knows of a good resource, please let me know. At this point I am more concerned about errors using divs that could be picked by the spiders as hidden text, when it is not meant to be.

Hi David
IMO there is no cut and dried answer to this.

You can find plenty of hidden text in top ranking pages using either css positioned divs (and more creative things too) or text hidden by making it very small or matching the background color, so I don't think they do auto-detection of such things. I also know of cases where text positioned off the screen has been reported to Google manually and they have done nothing about it.

In fact somewhere I have seen a statement to the effect that they don't do hand corrections based on spam reports but try to make the algo do the corrections, which does not seem to be working at the moment.

On the other hand when a well known SEO used the off the screen technique he did get a penalty.

My take on the whole thing, if the hidden text or link is relevant to the page as a whole it is unlikely that they will take action IMO, but if it is totally irrelevant (say trying to use hidden porn text on a kids comic site) or if it is someone who is well known and should know better then they just may take manual action.

Let your conscience (and how much you can afford to lose) be your guide.

carbonize
02-22-2004, 09:47 AM
Surely the best way to have hidden text would be display: none; ??

Ok 58 hits from Googlebot yesterday. Time is now 14:50 and not a single hit today.

carbonize
02-25-2004, 12:43 PM
OK my current host has stopped blocking access to robots.txt files on my domain.

Also I have just purchased new hosting with prohosters so will soon be moving. I just sent a nasty email to my current host and thought it would be prudent to find new and better hosting. Also the fact the pound is worth $1.89 had something to do with it :D