WebProWorld
Yahoo! Discussion Forum Yahoo Search discussion. Any topic or subject specific to Yahoo should go here. You will also find a subforum dedicated to YPN & Panama.

  #1 (permalink)  
Old 07-15-2004, 10:21 AM
WebProWorld New Member
 
Join Date: Jul 2004
Location: India
Posts: 10
sadiq1133 RepRank 0
New Yahoo spider?

I found this in my log file:

Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)

Is this a new Yahoo spider? Any info on this?
  #2 (permalink)  
Old 07-15-2004, 05:33 PM
WebProWorld Pro
 
Join Date: Oct 2003
Location: Texas
Posts: 279
Elite Skills RepRank 0

The IP looks to be from Fast. I had it too. It looks like it's picking at the images on my site, so MM probably means multimedia. Yahoo image search?
__________________
[ Webmaster Education ] [ Webmaster Articles ] [ Poetry
  #3 (permalink)  
Old 07-15-2004, 05:55 PM
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Jul 2003
Location: United Kingdom
Posts: 1,643
TrafficProducer RepRank 4
Database of Web Robots

This may or may not help you out:

http://www.robotstxt.org/wc/active.html
  #4 (permalink)  
Old 07-15-2004, 06:23 PM
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1

Quote:
Originally Posted by Elite Skills
The IP looks to be from Fast. I had it too. It looks like it's picking at the images on my site, so MM probably means multimedia. Yahoo image search?

Are you sure it is from Fast, or are you just guessing?

I did a search of Internet logs and came across this one here. There are numerous entries in this log to view.

The crawler resolves to mmcrmX.search.scd.yahoo.com, where X is the number of the crawler. Here are some more sightings of the crawler.
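If you want to check your own logs against that hostname pattern, here is a minimal sketch. It assumes the X in mmcrmX is one or more digits, which the thread does not confirm, and the function names are made up for illustration. The reverse lookup needs network access and is not guaranteed to return anything for a given IP.

```python
import re
import socket
from typing import Optional

# Hostname pattern reported in this thread: mmcrmX.search.scd.yahoo.com
# Assumption: X is one or more decimal digits.
MM_CRAWLER_HOST = re.compile(r"mmcrm\d+\.search\.scd\.yahoo\.com", re.IGNORECASE)


def is_mm_crawler_host(hostname: str) -> bool:
    """Return True if the hostname matches the reported crawler pattern exactly."""
    return MM_CRAWLER_HOST.fullmatch(hostname.rstrip(".")) is not None


def reverse_lookup(ip: str) -> Optional[str]:
    """Reverse-resolve an IP from the access log; None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None
```

Matching the whole hostname, rather than searching for a substring, avoids being fooled by a name that merely contains the Yahoo domain somewhere in the middle.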

It does appear to be requesting image files (.jpg, .png, .gif) as well as the index of the directory the files reside in.

If you want to disallow access to your images directory, I would put an exclusion for it in your robots.txt file. To stop them from requesting a directory index of your images, insert a blank index file (for example index.htm) in your images directory. This will stop them from reading the directory listing and serve up a blank page instead.
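As a rough sketch of the robots.txt exclusion, the snippet below embeds a sample rule and checks it with Python's standard urllib.robotparser. The /images/ path and the example.com URLs are assumptions for illustration, and whether this crawler actually honors robots.txt is not something this thread establishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; adjust the path to your own images directory.
ROBOTS_TXT = """\
User-agent: Yahoo-MMCrawler
Disallow: /images/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("Yahoo-MMCrawler/3.x", "http://example.com/images/logo.gif"))
    print(is_allowed("Yahoo-MMCrawler/3.x", "http://example.com/index.html"))
    print(is_allowed("Googlebot/2.1", "http://example.com/images/logo.gif"))
```

The rule only names Yahoo-MMCrawler, so other robots are unaffected. Use `User-agent: *` instead if you want to keep every well-behaved robot out of that directory.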
  #5 (permalink)  
Old 07-15-2004, 06:39 PM
WebProWorld 1,000+ Club
 
Join Date: Aug 2003
Location: Central US
Posts: 1,265
ronniethedodger RepRank 1

It appears that the robot is the old AltaVista robot for images, which was identified as vscooter in the user-agent field.

It does appear to be operated by Fast, though, as Elite Skills suggested. Look here.
  #6 (permalink)  
Old 07-15-2004, 07:02 PM
WebProWorld Pro
 
Join Date: Jul 2003
Location: San Clemente, CA
Posts: 134
cooper RepRank 0
Some other spiders

I have noticed the following spiders from one of my client's access logs:

Quote:
msnbot/0.11 ( http://search.msn.com/msnbot.htm)
Googlebot/2.1 ( http://www.googlebot.com/bot.html)
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
ia_archiver
check_http/1.24.2.4 (nagios-plugins 1.3.1)
NPBot (http://www.nameprotect.com/botinfo.html)
UCmore
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; MSIECrawler)
LinkWalker
sohu-search
Yahoo-MMCrawler
Baiduspider ( http:
Gigabot/1.0
NaverBot-1.0 (NHN Corp. / 82-2-3011-1954 / nhnbot@naver.com)
mozDex
IlTrovatore-Setaccio/1.2 (Indexing; http://www.iltrovatore.it/bot.html; bot@iltrovatore.it)
JoeDog/1.00 [en] (X11; I; Siege 2.59)
QuepasaCreep ( crawler@quepasacorp.com )
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; MSIECrawler)
TAMU_CS_IRL_CRAWLER
CosmixCrawler
Szukacz

So that's two that identify themselves as Yahoo!
Most of the others I don't even recognize. There were some entries for our own WebTrends reporter, but that's the gist of it for July so far.
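
For anyone who wants to do the same count on their own logs, here is a small sketch. The list holds only a handful of the user-agent strings quoted above, and the helper name is made up for illustration. It is a plain substring match, so it would also catch anything that merely mentions Yahoo in its user-agent string.

```python
# A few of the user-agent strings quoted above (not the full list).
USER_AGENTS = [
    "msnbot/0.11 ( http://search.msn.com/msnbot.htm)",
    "Googlebot/2.1 ( http://www.googlebot.com/bot.html)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
    "ia_archiver",
    "Yahoo-MMCrawler",
    "Gigabot/1.0",
]


def agents_mentioning(name: str, agents: list) -> list:
    """Return the user-agent strings containing name, ignoring case."""
    needle = name.lower()
    return [ua for ua in agents if needle in ua.lower()]


yahoo_agents = agents_mentioning("yahoo", USER_AGENTS)
```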

I think that Elite Skills has a good guess for what "Yahoo-MMCrawler" means.
__________________
Cooper Griggs
Pure Influence
Aloha, surfing and flower stickers
http://www.pureinfluence.com/
  #7 (permalink)  
Old 07-17-2004, 02:00 AM
WebProWorld New Member
 
Join Date: Dec 2003
Location: Newark, Ohio
Posts: 7
dkginternet RepRank 0

Hello,

You might consider reserving an additional domain and hosting it on another server, containing a copy of the site that you want to back up.

Of course, it won't do much good if it's not marketed, but it would give you an option when the main site is down and the phone starts ringing.

HTH,

Danny
All times are GMT -4.


