WebProWorld
Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

  #1 (permalink)  
Old 04-13-2004, 10:37 AM
WebProWorld Member
 
Join Date: Feb 2004
Location: long island, NY
Posts: 63
volkoman RepRank 0
Default Off limits.... googlebot only

Out of curiosity I want to test some robot code.

For a "Google only" site what robot code works best....

meta or robots.txt

Will "all" SE's really follow these street signs????

If one wanted Google to be the ONLY SE to spider a site .... how best to write the code?
  #2 (permalink)  
Old 04-13-2004, 01:00 PM
WebProWorld Member
 
Join Date: Mar 2004
Posts: 35
spidersam RepRank 0
Default

There is a way to do it, but I can't quite remember how...something to do with naming the bot I think.

Sorry I forgot, but I'm sure many will be able to help you out with that one.
__________________
Communism, Dada Art, and Venereal Disease meet Fidel Castro!
  #3 (permalink)  
Old 04-13-2004, 09:09 PM
ronniethedodger's Avatar
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1
Default

Why would you want to prevent all but one spider from crawling your site? I can see disallowing certain spiders (one or two come to mind), but not all of them.

To answer your question, most spiders do follow the rules laid out in the robots.txt file. If you are concerned about spiders that do not follow the rules, then robots.txt will not stop them. You will have to ban their IP from access.

For the ones that do follow the rules in this file, there is very basic language to follow. There are two statements in the file: User-agent and Disallow. You can use the * to signify all robots, such as this:

User-agent: * # applies to all robots
Disallow: /images/
Disallow: /includes/

Or individual user-agent strings for particular robots:

User-agent: linksmanager # applies to linksmanager only
Disallow: /

This will tell that pesky little sucker not to go anywhere -- Go away. It does honor that, and will not go any further.

Unfortunately, that is as extensive as you can get with this syntax. If there were an allow that could be used, then it would be easy to do what you are asking. In order to disallow all but Google, you will have to set up a user-agent/disallow combo for each spider. You will also have to know the name of the spider to put in the user-agent field.
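
For what it's worth, the original robots.txt convention treats an empty Disallow line as "nothing is off limits", so a Googlebot-only file may not need one record per spider. Below is a minimal sketch in Python, assuming the standard library's urllib.robotparser is a fair stand-in for a well-behaved crawler. The two-record file, the allowed helper and the test path are illustrative only, and bots that ignore robots.txt are unaffected either way.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: Googlebot gets an empty Disallow (nothing blocked),
# every other crawler falls through to the catch-all record and is blocked.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if a compliant crawler named user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical check of three crawler names against one page.
for bot in ("Googlebot", "Slurp", "linksmanager"):
    print(bot, allowed(bot, "/widgets.htm"))
```

Under this parser, only the Googlebot line should come back True. Real crawlers may interpret edge cases differently, so it is worth checking each engine's own webmaster help before relying on it.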
  #4 (permalink)  
Old 04-13-2004, 10:58 PM
WebProWorld Member
 
Join Date: Feb 2004
Location: long island, NY
Posts: 63
volkoman RepRank 0
Default robot direction

Thanks ronniethedodger....

Correct..I also found no way to easily do this unless an "allow only" was invented...

Is there another way to optimize a page for Google AND Yahoo AND AOL ..etc

Who wants to mess with a page 1 Google rank just to try and optimize better on HotBot?

One restless idea I had was to custom tailor a page to each search engine....

/google/widgets.htm
/yahoo/widgets.htm
msn,aol etc....

A site's index and pages would keep their links and PR and be indexed by all the SE robots as they are now. A sub page for "widgets" could be copied to its own bot folder...(it's not a duplicate page if a SE can only read 1 copy, "its own")

Googlebot comes as usual...starts at the index...goes to all the follow pages...and ONLY the selected Google-optimized custom bot pages.

The page would have the same page rank (for google).....

Then an allow "slurp" only page..etc

Same thing.. the Slurp bot comes, goes to the Slurp pages and has its own Slurp rank.

Now minor tweaking/page optimizing can be done for/toward a specific SE.
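
To make the folder idea concrete, here is a rough sketch of the robots.txt such a scheme might need, written as a small Python generator and checked with the standard library's urllib.robotparser. The folder names come from the example above. The BOT_FOLDERS mapping and the build_robots_txt helper are hypothetical, and nothing here says the engines would treat the copies as separate pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical mapping of crawler name -> folder holding its tuned pages.
BOT_FOLDERS = {"Googlebot": "/google/", "Slurp": "/yahoo/"}


def build_robots_txt(bot_folders):
    """Return robots.txt text that hides every bot folder except a bot's own."""
    records = []
    for bot in bot_folders:
        lines = [f"User-agent: {bot}"]
        # Block this bot from every folder that belongs to another bot.
        lines += [
            f"Disallow: {folder}"
            for other, folder in bot_folders.items()
            if other != bot
        ]
        if len(lines) == 1:
            # No other folders exist: an empty Disallow blocks nothing.
            lines.append("Disallow:")
        records.append("\n".join(lines))
    # Any crawler not named above is kept out of all the bot folders.
    catch_all = ["User-agent: *"]
    catch_all += [f"Disallow: {folder}" for folder in bot_folders.values()]
    records.append("\n".join(catch_all))
    return "\n\n".join(records) + "\n"


robots_txt = build_robots_txt(BOT_FOLDERS)
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(robots_txt)
```

The shared index and regular pages stay open to everyone. Only the per-engine folders are fenced off, and only for crawlers that honor robots.txt.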

I know the fine line of "make pages for the viewer, not the SE" comes first...but you are.

It's not about duplicate pages, hiding keywords or misleading content...I'm talking about the same "relevant to the page content" material with little changes like adding H1 tags or even lessening keyword density to "hit" the selected SE's algo on the head.
There's SEO and then there's CSEO. (custom)

Think it would have worked???? or can???

Pipe dream or maybe I should check for a gas leak :)
Do tell....
  #5 (permalink)  
Old 04-13-2004, 11:26 PM
ronniethedodger's Avatar
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1
Default

In your own words -- "check for a gas leak". ;0)

What would you do differently for one bot, that you wouldn't for another? What if you did know what the magic formula was, and they changed it?

Keep it simple. Write strong titles, include your keyword and description metas, write good content, and get good backlinks. There are a number of articles on the subject.

The more time you spend on tricking bots, or even just writing with them in mind, the bigger the disservice to your visitors, and quite frankly it wastes a lot of time. That time could be spent building your site up with more content and making it of value to your visitors.

As you build the site and add to it, the bot stuff will handle itself. And you will have three or four times as much content to get found with -- more variety. More variety means more chances of getting found for your keywords and terms. It also lights a fire under a bot's butt to keep up with you.

Don't add pages to just be adding pages. Go for quality, structure your site well ... and it will all fall into place.
  #6 (permalink)  
Old 04-14-2004, 06:23 AM
WebProWorld Pro
 
Join Date: Apr 2004
Location: Philadelphia
Posts: 212
sem-seo-pro RepRank 0
Default robots text

Hi

An easier way to do this is write a robots txt file for googlebot

and another robots.txt for all the others.


<META Name="Googlebot" Content="index,follow">
<META Name="Robots" Content="noindex,nofollow">

Just that simple.

Clint
__________________
Search Engine Marketing
Search Engine Optimization Blog
"The only thing not possible, is whatever you tell yourself is impossible"....
  #7 (permalink)  
Old 04-14-2004, 08:25 AM
WebProWorld Member
 
Join Date: Mar 2004
Location: United Kingdom
Posts: 93
spidermonkey RepRank 1
Default

sem-seo-pro, you are referring to the robots meta tag here, not the robots.txt file in the root directory of your site. If you don't have one - write one - sharpish.

Mike
  #8 (permalink)  
Old 04-14-2004, 04:09 PM
ronniethedodger's Avatar
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1
Default Re: robots text

Spidermonkey....not only that, but this seems to be telling Googlebot to do two different things. I do think that Googlebot will read the Robots Meta and honor that as well (not positive). But it is definitely telling all other robots not to index or follow links.

Quote:
Originally Posted by sem-seo-pro
<META Name="Googlebot" Content="index,follow">
<META Name="Robots" Content="noindex,nofollow">
This is bad advice.
  #9 (permalink)  
Old 04-14-2004, 04:41 PM
WebProWorld Member
 
Join Date: Feb 2004
Location: long island, NY
Posts: 63
volkoman RepRank 0
Default bot not

I agree Ron....

Not 100% sure either, but not worth the effort/chance.

A bot would read the first meta (ok to follow)

then at the 2nd meta it would leave (along with all the other bots)

Hey, like I said.... Just a wild thought.

It's great to be able to think out loud here.
...and not just hear myself answer :)

Thanks
  #10 (permalink)  
Old 04-20-2004, 02:03 PM
WebProWorld Member
 
Join Date: Feb 2004
Location: long island, NY
Posts: 63
volkoman RepRank 0
Default # 2 pencil

Ok Ok I still hear voices....

Am I creating my own catch22...

You have a site that is all about good content, has quality links and now needs to be found.

You can have the greatest site on the net...if it aint found in a search, it aint worth doodely.

So it needs to be SE & spider friendly..SEOed.

Well isn't that optimizing "for" a search engine??

Being #1 on G and #30 on MSN can drive ya nuts.

So a site is "generally optimized" for ALL engines and will hopefully be ranked highly by each SE's algo.

Well they are all freakin' different robotic pains in the algo....

If it were only Google...all ya need is good content, links, links and more links...and ya don't even have to submit it...

What about the rest...adding a keyword or slightly tweaking a meta or description can make or break a page 1 listing.

If you can provide the intended quality content with no duplicate pages....what's wrong with formatting it for a specific search engine?

If I fill in all the boxes with the right answers but don't use a #2 pencil...why should I fail the test?
  #11 (permalink)  
Old 04-20-2004, 04:45 PM
WebProWorld Pro
 
Join Date: Sep 2003
Location: Mars
Posts: 171
alienzhavelanded RepRank 0
Default

I don't think the issue here is whether optimizing for a specific SE is wrong. The issue was HOW. But I digress...I've always done nothing but the standard optimization techniques, and if I'm spidered, I'm spidered. I concentrate on creating the pages for my visitors, not the search engines.

If you create a good site with some basic SE principles, it's not hard at all to get indexed and ranked:

1. Use a robots file. While you cannot specifically "allow" any one bot, it gives you some great control over all robots and what they can/can't spider.

2. Use good HTML code. Most people don't think the robots pay attention to that, but they do. Googlebot in particular.

3. Have a good site structure. Lots of spiders love site maps.

4. Use Title, Description, and Keyword metas. Your Title meta appears in your indexed listing, usually with the description below it. Although not relied on as much, they are still a factor. Include metas like "revisit", "index-follow", etc. where needed. Don't go overboard with metas, as they are a smaller factor overall.

5. Don't use an all Flash site or home page. Most spiders still have trouble with them. Provide an index page with the OPTION of visiting the Flash enhanced version.

6. Read the Web Master or Help sections at the SE's sites for most of their basic rules.

7. Good anchor text, especially if you have a site with a ton of content!

8. Content, content, content! Proof that all of these basic principles work : http://www.marznetproductions.com/computing

This site is my busiest, and is ranked well accordingly. Largely because of the content. Only basic SE techniques were used. The rankings are fluctuating at the moment because of a switch from HTML to ASP, but otherwise...the spiders love this site.

Last but not least, Volko...you could just exclude all bots except Googlebot.

P.s: Slurp has gotta be the hardest working bot in the biz, he never sleeps! Enjoy, as this is likely my longest post all year.

Happy coding,
The Martian
  #12 (permalink)  
Old 04-20-2004, 05:04 PM
ronniethedodger's Avatar
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1
Default

Quote:
P.s: Slurp has gotta be the hardest working bot in the biz, he never sleeps! Enjoy, as this is likely my longest post all year.
Maybe on Mars he is. Let me know when that thing shows up here on Earth. ;0)
  #13 (permalink)  
Old 04-20-2004, 05:16 PM
WebProWorld Pro
 
Join Date: Apr 2004
Location: Philadelphia
Posts: 212
sem-seo-pro RepRank 0
Default Meta Tags Robots

Hi

Gotta love the forum, but I do take offense that what I posted is labeled as bad advice.

The man asked for a way to do something; I gave him an answer that does work.

Thank you

Clint
__________________
Search Engine Marketing
Search Engine Optimization Blog
"The only thing not possible, is whatever you tell yourself is impossible"....
  #14 (permalink)  
Old 04-20-2004, 05:24 PM
WebProWorld Pro
 
Join Date: Sep 2003
Location: Mars
Posts: 171
alienzhavelanded RepRank 0
Default

It was bad advice because it wouldn't do what he wanted it to do. Not to mention all it does is tell the robot to leave immediately when it reads the second meta. Take offense all you want, but it's not a solution. Are you sure you're not just mad because you were wrong? LOL

Happy coding,
The Martian
  #15 (permalink)  
Old 04-21-2004, 05:25 AM
WebProWorld Pro
 
Join Date: Mar 2004
Location: Bonnie Scotland
Posts: 103
colr RepRank 0
Default

Easy tigers!
__________________
Colin Reid
East Kilbride
  #16 (permalink)  
Old 04-21-2004, 09:08 AM
WebProWorld Member
 
Join Date: Feb 2004
Location: long island, NY
Posts: 63
volkoman RepRank 0
Default find any water

Ground control to Mars man,

Roger that...the "Why not" is kinda fueling the "How to"... and the exclude/except was my first logical choice but naming ALL the bots is too messy.

Thanks for the time/answer... got some questions

1-6 Got it, do it, believe it.
7 got some, working on more

8 Your content content content is all about "The history of the computer".....I did a search for computer history & history of the computer. ??? What keyword(s) are you saying give you the great ranking? I can see you getting lots of traffic from a combination of names and other keywords, but isn't your primary target "computer history"?



Perhaps I will put the SE specific idea to rest....or at least until it comes lookin for me :)
  #17 (permalink)  
Old 04-21-2004, 02:35 PM
WebProWorld Pro
 
Join Date: Sep 2003
Location: Mars
Posts: 171
alienzhavelanded RepRank 0
Default

volkoman said:
Your content content content is all about "The history of the computer".....I did a search for computer history & history of the computer. ??? What keyword(s) are you saying give you the great ranking? I can see you getting lots of traffic from a combination of names and other keywords, but isn't your primary target "computer history"?


My "primary target" is my visitors, not SEs. I've created a site where people can get a lot of info about computer history in one place. As a secondary result, the spiders love it. Many of the subjects on the site rank well when searching for them.

It doesn't take a rocket scientist to realize that trying to rank on a popular keyword like "computer history" would likely bury you in the SERPs, but this wasn't my goal anyway and the site ranks for other terms. Try searching on those instead. I did mention the rankings are fluctuating because of a switch to ASP, but like all SEO it takes time.

Happy coding,
The Martian
  #18 (permalink)  
Old 04-21-2004, 02:57 PM
WebProWorld Pro
 
Join Date: Mar 2004
Location: India
Posts: 128
daxesh RepRank 0
Default

As is well known, content is king. If you have great content, then you just need to tweak a little to make it SEO friendly, and over time it will get you good rankings in most of the search engines.
__________________
------------------
Dax

(Online Marketing Consultant)
  #19 (permalink)  
Old 04-21-2004, 05:15 PM
ronniethedodger's Avatar
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1
Default

Quote:
Originally Posted by alienzhavelanded
My "primary target" is my visitors, not SEs. I've created a site where people can get a lot of info about computer history in one place. As a secondary result, the spiders love it. Many of the subjects on the site rank well when searching for them.

It doesn't take a rocket scientist to realize that trying to rank on a popular keyword like "computer history" would likely bury you in the SERPs, but this wasn't my goal anyway and the site ranks for other terms. Try searching on those instead. I did mention the rankings are fluctuating because of a switch to ASP, but like all SEO it takes time.
Bingo !!!

Content, content, content. The more you got, the more chances you have to be found.

Beats the heck out of tweaking your pages continually and playing SE games trying to get ranked for a two-word term. If you include that term liberally around in your content, it will bubble to the surface....and you will have more variances on that term to be found also.
  #20 (permalink)  
Old 04-21-2004, 08:40 PM
WebProWorld Member
 
Join Date: Feb 2004
Location: long island, NY
Posts: 63
volkoman RepRank 0
Default bingo

Thanks ronnie, space man and the rest...

Being a head strong, self taught newbie...I will take a deep breath.

My view point....

I "have" a well organized site built to sell something. It looks good, navigates well and has content 100% relevant to what it sells.

The content is "for" the visitor with no misleading BS.. straight up & honest. The SE isn't buyin' it, the visitor is (I hope).

Up to here is where I believe we all agree....but ya do have to be found.

An info only/content site is a wonderful thing..I search for answers too. I use 'em and thank the author/webmaster as well as the WWW for providing such a great, convenient tool.

But I'm sure that "most" info only sites, sooner or later, have an alternate use....advertising and promotion.

Maybe it's me... but most info sites provide me with great content, thank you :) AND "click me's" (optional).

Advertisements making that "for the visitor" site some cash. (I have no problem with this).

AND if the PR is high, links it must have. And if I'm not mistaken, sites are linked "for" SEs to gain a site's popularity (since SEs like that). I have no problem with this either. Link on, man!

Is my wording bad...all I read is how SEO's tweak sites to be better found "for" SE's.

Now what about a site that sells widgets. A good site with great content all about widgets. How is the site gonna be profitable if it's not found under "widgets" in any word phrase?

If the site falls off the first page on Google and sales drop 40% do you say...

oh well, we're still found under cricket matches?

At that point, does not every single SEOer work that site "for" Google?

The content is still there...everything is still there...but some tweaking was done to preserve a site's income.

I may be all over the place by now...and I do not mean to be difficult. I wish not to be confused with or labeled as some type of SE stalker.

I want to provide content content content "for" visitors wanting to learn about & hopefully buy "widgets"....AND be SEOed "for" SEs under the keyword "widgets" to do so.

That is the game, isn't it?

Who wants to be buried in SERPs? I want to be number freakin' 1. (fair & square)

Please...I may have lots of rough questions but I'm a young dog wanting to learn new tricks...easy with the newspaper on the snout.

....and bingo was his name-o
  #21 (permalink)  
Old 06-03-2004, 04:10 AM
WebProWorld Pro
 
Join Date: Mar 2004
Location: India
Posts: 128
daxesh RepRank 0
Default

Content is king. Content acts as a magnet for both spiders and humans. Invariably, good authoritative content acts as a booster for gaining lots of incoming links.




  #22 (permalink)  
Old 06-03-2004, 04:55 AM
WebProWorld Veteran
 
Join Date: May 2004
Location: London, UK
Posts: 552
pedstersplanet RepRank 0
Default

Quote:
Originally Posted by alienzhavelanded
4. [snip]...a factor. Include metas like "revisit" , "index-follow", etc where needed. Don't go overboard with metas, as they are a smaller factor overall.
Hmmm, these tags.. Some people say they're useful, but others say they're useless.... So who's right?
__________________
Regards, Peter
UK Web Hosting | Website Directory
  #23 (permalink)  
Old 06-03-2004, 07:59 AM
Mel
WebProWorld 1,000+ Club
 
Join Date: Jul 2003
Posts: 1,903
Mel RepRank 2
Default

Depends on what you want the bot to do. If you put:
<meta name="robots" content="noindex,nofollow"> all well behaved robots should leave immediately.

<meta name="robots" content="noindex,follow"> they should not index that page but should follow the links on that page to wherever they lead.

<meta name="robots" content="index,nofollow"> they should index that page but not follow any links

<meta name="robots" content="index,follow"> is a waste of time as this is the default action of the bot if he does not find a robots tag on the page.

<meta name="revisit-after" content="3 days"> is IMO a worthless tag as it is a restrictive tag which tells the spiders what not to do (thus in effect saying go away and don't come back for 3 days), and at any rate spiders seem to visit on their own schedule even with this tag present.
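
The four robots combinations above boil down to two yes/no flags that both default to yes. Here is a minimal sketch in Python of how a well-behaved bot might read the content value. The parse_robots_meta helper is hypothetical, real crawlers may handle unknown or conflicting values differently, and revisit-after is left out since it is not a robots directive.

```python
def parse_robots_meta(content=""):
    """Return (index, follow) flags for a robots meta content string."""
    index, follow = True, True  # default when no tag or directive is present
    for token in content.lower().split(","):
        token = token.strip()
        if token == "noindex":
            index = False
        elif token == "nofollow":
            follow = False
        elif token == "none":  # shorthand for noindex,nofollow
            index, follow = False, False
        # "index", "follow" and "all" restate the default;
        # anything unrecognized is simply ignored in this sketch.
    return index, follow


for content in ("noindex,nofollow", "noindex,follow",
                "index,nofollow", "index,follow", ""):
    print(repr(content), parse_robots_meta(content))
```

Note that "index,follow" and an empty value give the same result, which is why that tag adds nothing.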
__________________
Mel Nelson
Expert SEO | Cheap used cars
  #24 (permalink)  
Old 06-03-2004, 03:31 PM
WebProWorld New Member
 
Join Date: May 2004
Location: United States
Posts: 22
GoToTrafficSchool.Com RepRank 0
Default Re Visit

I think the re-visit tag actually works.
I'm not sure though.
__________________
GoToTrafficSchool.Com Traffic School Online.
TeenDrivingCourse.Com Drivers Education and Driver Ed Course.
  #25 (permalink)  
Old 06-03-2004, 05:53 PM
cbp
WebProWorld 1,000+ Club
 
Join Date: Oct 2003
Posts: 4,938
cbp RepRank 1
Default

Quote:
I think the re-visit tag actually works.
It's useless.

CBP