Greg Jarboe Talks Yahoo! Search



Garrett
03-03-2004, 12:25 PM
Yesterday I was fortunate enough to have lunch with Greg Jarboe, President and co-founder of SEO-PR (http://www.seo-pr.com/), who shared some revealing insight into the new Yahoo! Search.

Jarboe learned that the Yahoo! Slurp bot doesn’t follow links as closely as Googlebot. Instead, Slurp pays more attention to on-page factors such as keywords and title tags to determine a site’s relevance.

His theory: in an effort to get its own search solution up quickly, it’s possible that Yahoo! just rolled out the already-existing Inktomi technology with a few variations of its own. Within the next month or two, he predicts Yahoo! may also roll out AlltheWeb technology, which takes links into consideration the way Google does.

Yahoo! may be less inclined to roll out link analysis because it concerns off-page elements that are often out of the site owner’s control. These factors could lower your page rankings, and you would not necessarily be able to do anything about them.

He also has one conjecture. “If you’re standing on the grassy knoll and you look West, where do you make the money? Through paid inclusion or through links?”

Of course, the answer here is paid inclusion. He believes Yahoo! will hold off on the off-page factors of AlltheWeb until it begins to get user complaints. This is just his conspiracy theory, however, and he has no evidence to support it.

While paid inclusion offers a faster crawl rate, with a recrawl every 48 hours, it doesn't guarantee your position in the search engine results pages.

Why participate in paid inclusion then? Jarboe is quick to point out that "Yahoo! is the tweaker's dream." With its frequent crawls, the Yahoo! paid inclusion program lets you tweak on-page elements like crazy and see quickly whether your search engine rankings rise.

That is going to be a major draw for advertisers to participate in the paid inclusion program.

WebMetro
03-04-2004, 06:19 PM
I'm basically looking at Yahoo paid inclusion as Looksmart with a larger reach. The good ol' days of free Yahoo are over. Adapt to the changes and make it work for you, not against you.

Dave Hawley
03-05-2004, 04:12 AM
Has anyone seen Yahoo! Slurp in their logs? I get one that eats a lot, but AWStats has it as "Unidentified".

minstrel
03-05-2004, 04:27 AM
Last I heard it was still identified as just "slurp"...

Mel
03-05-2004, 05:10 AM
Originally posted by Dave Hawley:
Has anyone seen Yahoo! Slurp in their logs? I get one that eats a lot, but AWStats has it as "Unidentified".

AWStats is known to be a bit slow in updating its software, and I believe this also depends on the host, many of whom are not keen to disturb working systems with minor software updates. I have been seeing two unidentified spiders in AWStats, one of which I believe to be the MSN spider, and another which has just hit my robots.txt 43 times in the past four days without visiting a page. This one may be Yahoo Slurp.

Robotstxt.org has a listing for the name MSNbot, but nothing listed for Yahoo Slurp, only Inktomi Slurp.

zenfort
03-05-2004, 08:53 AM
Originally posted by Dave Hawley:
Has anyone seen Yahoo! Slurp in their logs? I get one that eats a lot, but AWStats has it as "Unidentified".

Hi, first time posting.

Yet?!! I have already had to 403 the critter via .htaccess this month. The 66.196.. spiders have already eaten 400MB of bandwidth in less than 4 days.
I first had to cut off the 66.196.. spiders on the 26th of Feb. 2004. They had consumed 2GB of bandwidth since the 1st of Feb. They had been gobbling more and more bandwidth every month for the past 56 months, first as Inktomi Slurp; now they are recognized as Yahoo Slurp.

I am running PostNuke with PostCalendar, and they will not obey no matter how I try to talk to them in robots.txt.

So I've adopted the tactic of forbidding them for periods of time - I guess.
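For anyone who wants to check which visitors a block like that would catch, here is a minimal Python sketch. It assumes "66.196.." means every address starting with 66.196, i.e. 66.196.0.0/16; that reading is a guess, and the real crawler range may be narrower or wider. The name `should_forbid` is hypothetical, the two 66.196.72.x addresses come from the log excerpt in this post, and 192.0.2.10 is just a documentation address standing in for an ordinary visitor.

```python
import ipaddress

# Assumption: "66.196.." is read as the whole 66.196.0.0/16 block.
# The actual crawler range is not stated in the thread.
ASSUMED_CRAWLER_NET = ipaddress.ip_network("66.196.0.0/16")


def should_forbid(remote_addr: str) -> bool:
    """Return True if the client address falls inside the assumed block."""
    return ipaddress.ip_address(remote_addr) in ASSUMED_CRAWLER_NET


if __name__ == "__main__":
    # Two hosts from the log excerpt, plus a placeholder outside address.
    for addr in ("66.196.72.107", "66.196.72.45", "192.0.2.10"):
        print(addr, "-> 403" if should_forbid(addr) else "-> allowed")
```

This only mirrors the decision; the actual blocking still happens in the server configuration.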

From Latest visitors:
===
Host: 66.196.72.107 Url: /modules.php?op=modload&name=News&file=index&catid=&topic=10&POSTNUKESID=1e78c773952a8f6e889cfa85b37eee44 Http Code : 403
Date: Mar 05 07:41:08 Http Version: HTTP/1.0" Size in Bytes: -
Referer: - Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

Host: 66.196.72.45 Url: /delkergrid/techinfo.html Http Code : 403
Date: Mar 05 07:40:42 Http Version: HTTP/1.0" Size in Bytes: -
Referer: - Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
===

And they even keep going after URLs that I haven't had on this server for years, for instance the Delkergrid URL from a few minutes ago. I know there are tons of links to the xenarts/xenonarts site because it's been up for so long. I've been trying to transition the name to xenarts, but I've about given up.
namaste

minstrel
03-05-2004, 11:33 AM
Originally posted by Dave Hawley:
Has anyone seen Yahoo! Slurp in their logs? I get one that eats a lot, but AWStats has it as "Unidentified".

Originally posted by Mel:
I have been seeing two unidentified spiders in AWStats, one of which I believe to be the MSN spider, and another which has just hit my robots.txt 43 times in the past four days without visiting a page. This one may be Yahoo Slurp. Robotstxt.org has a listing for the name MSNbot, but nothing listed for Yahoo Slurp, only Inktomi Slurp.

See above: at present "Inktomi slurp" IS "Yahoo slurp", at least as far as your web logs are concerned.

See this thread (http://www.webproworld.com/viewtopic.php?t=14167):

The instructions on the Yahoo! Help page (http://help.yahoo.com/help/us/ysearch/slurp) indicate that the 'bot is still called slurp:


Yahoo! Slurp obeys the Robot Exclusion Standard. Specifically, Yahoo! Slurp adheres to the 1994 Robots Exclusion Standard (RES). Yahoo! Slurp will obey the first entry in the robots.txt file with a User-Agent containing "Slurp". If there is no such record, it will obey the first entry with a User-Agent of "*".

Disallowed documents, including slash (the home page of the site), are not indexed, nor are links in those documents followed. Yahoo! Slurp does read the home page at each site and uses it internally, but if it is disallowed it is neither indexed nor followed. Example robots.txt:

User-agent: Slurp
Disallow: /cgi-bin/
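
To see what that example file permits, here is a rough Python sketch using the standard library's `urllib.robotparser`. It is only an approximation: Python's parser has its own matching rules and is not guaranteed to behave exactly like Slurp. The helper name `slurp_may_fetch` and the www.example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from the Yahoo! help text quoted above.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /cgi-bin/
"""


def slurp_may_fetch(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Ask Python's robots.txt parser whether a Slurp-named agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # "Yahoo! Slurp" contains "Slurp", so the first record applies to it.
    return parser.can_fetch("Yahoo! Slurp", url)


if __name__ == "__main__":
    # Hypothetical URLs, one inside and one outside the disallowed path.
    for url in ("http://www.example.com/cgi-bin/search",
                "http://www.example.com/index.html"):
        print(url, "allowed" if slurp_may_fetch(url) else "disallowed")
```

Note that the short name is passed rather than the full "Mozilla/5.0 (compatible; ...)" string, because Python's parser only looks at the part of the agent string before the first slash.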

Mel
03-05-2004, 12:35 PM
Hmmmm..... well, maybe, minstrel, but then what are those things that identify themselves as Yahoo! Slurp in zenfort's logs?


From Latest visitors:
===
Host: 66.196.72.107 Url: /modules.php?op=modload&name=News&file=index&catid=&topic=10&POSTNUKESID=1e78c773952a8f6e889cfa85b37eee44 Http Code : 403
Date: Mar 05 07:41:08 Http Version: HTTP/1.0" Size in Bytes: -
Referer: - Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Host: 66.196.72.45 Url: /delkergrid/techinfo.html Http Code : 403
Date: Mar 05 07:40:42 Http Version: HTTP/1.0" Size in Bytes: -
Referer: - Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
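
One way to pull those entries out of a "Latest visitors" dump is sketched below in Python. The regular expression is written against the two entries quoted in this thread only, so it is an assumption that other entries follow the same layout; `ENTRY_RE` and `slurp_hits` are hypothetical names. Any agent string containing "slurp" (in any letter case) is counted, which would cover both the Inktomi and the Yahoo! naming.

```python
import re

# Matches one entry in the layout shown in the thread:
# "Host: ... Url: ... Http Code : NNN" followed later by "Agent: ...".
ENTRY_RE = re.compile(
    r"Host:\s*(?P<host>\S+)\s+Url:\s*(?P<url>\S+)\s+Http Code\s*:\s*(?P<code>\d{3})"
    r".*?Agent:\s*(?P<agent>[^\n]*)",
    re.DOTALL,
)


def slurp_hits(log_text: str) -> list[tuple[str, str, int]]:
    """Return (host, url, status) for every entry whose agent mentions slurp."""
    hits = []
    for m in ENTRY_RE.finditer(log_text):
        if "slurp" in m.group("agent").lower():
            hits.append((m.group("host"), m.group("url"), int(m.group("code"))))
    return hits


# The two entries quoted from zenfort's log.
SAMPLE = """\
Host: 66.196.72.107 Url: /modules.php?op=modload&name=News&file=index&catid=&topic=10&POSTNUKESID=1e78c773952a8f6e889cfa85b37eee44 Http Code : 403
Date: Mar 05 07:41:08 Http Version: HTTP/1.0" Size in Bytes: -
Referer: - Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Host: 66.196.72.45 Url: /delkergrid/techinfo.html Http Code : 403
Date: Mar 05 07:40:42 Http Version: HTTP/1.0" Size in Bytes: -
Referer: - Agent: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
"""

if __name__ == "__main__":
    for host, url, code in slurp_hits(SAMPLE):
        print(host, code, url)
```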