WebProWorld Part of WebProNews.com
Web Programming Discussion Forum Working with an API? Developing a plugin? Writing a Mod or script for your favorite blog, Web 2.0 site or Forum? Welcome.

  #1 (permalink)  
Old 06-30-2004, 05:46 PM
WebProWorld New Member
 

Join Date: Jun 2004
Location: Monrovia, MD USA
Posts: 5
telecomworx RepRank 0
HTML In Splash Page to Prevent Webots/Crawlers Security HOLE

1. Do a web search (e.g. at www.dogpile.com) for the term: NBX NetSet
2. Now pick out ALL the pages that show something like this: NBX NetSet
Version: RX_X_Xx Created: Xxx xx XXXX,

Now you've succeeded in finding companies that assign a public IP address to their corporate telephone systems. Besides that being a really dumb thing for the customer to do, the manufacturer isn't too bright either.

Isn't there some simple HTML that could prevent the page you see (the NBX NetSet splash page) from being indexed by the so-called web bots or web crawlers? I know the obvious answer is not to assign a public IP to the box and to put the box behind a firewall, but the users/customers will then open port 80 (HTTP traffic) to the box... so wouldn't the web bots/crawlers still index the page?

What do you think?

thank you
__________________
"You gotta have a sense of humor else you'll never understand or get to know me"

http://www.telecomworx.com
  #2 (permalink)  
Old 06-30-2004, 06:05 PM
WebProWorld 1,000+ Club
 

Join Date: Sep 2003
Location: Texas
Posts: 1,283
flood6 RepRank 0
Exclusion

There are two common ways to prevent spiders from indexing your data: excluding them with robots.txt and excluding them with a robots meta tag. Both methods are described here.
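
To illustrate the robots.txt half of that, here is a minimal sketch in Python of how a well-behaved crawler would read a blanket exclusion. The robots.txt contents, the bot name and the 192.0.2.10 address are made up for the example; only the standard library's `urllib.robotparser` is used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, served from the root of the box's web server.
# It asks every crawler to stay away from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler would be allowed to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the address below are placeholders.
    print(may_crawl(ROBOTS_TXT, "ExampleBot", "http://192.0.2.10/index.html"))
```

Note that the file is only a request: a crawler has to fetch it and choose to obey it, so it does nothing to close port 80.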

For bots that misbehave by disregarding robots.txt and the meta instructions, you can try this method to trap them and automatically ban them.
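
The general idea behind a trap like that can be sketched as follows. This is not the linked method itself, just a generic honeypot under stated assumptions: the path `/bot-trap/` is hypothetical, it would be listed under `Disallow` in robots.txt and linked invisibly from the page, so only a bot that ignores the rules ever requests it. The `BotTrap` class and its in-memory ban list are made up for illustration.

```python
# Hypothetical trap path: disallowed in robots.txt, so polite bots never visit it.
TRAP_PATH = "/bot-trap/"


class BotTrap:
    """Ban any client that requests the trap path (in-memory sketch only)."""

    def __init__(self, trap_path: str = TRAP_PATH) -> None:
        self.trap_path = trap_path
        self.banned: set[str] = set()

    def handle(self, client_ip: str, path: str) -> int:
        """Return the HTTP status code the server would answer with."""
        if client_ip in self.banned:
            return 403  # already banned: refuse everything
        if path.startswith(self.trap_path):
            self.banned.add(client_ip)  # fell into the trap: ban from now on
            return 403
        return 200  # normal request


if __name__ == "__main__":
    trap = BotTrap()
    for path in ("/index.html", "/bot-trap/", "/index.html"):
        print(path, trap.handle("198.51.100.7", path))
```

A real setup would persist the ban list and enforce it in the web server or firewall rather than in application code.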

I hope that was what you were asking...

Good luck.
  #3 (permalink)  
Old 06-30-2004, 06:10 PM
WebProWorld Pro
 

Join Date: Aug 2003
Location: USA
Posts: 114
USALUG RepRank 0

Code:
 <meta name="robots" content="noindex,follow">
of course it only works for the "nice" bots. :)
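
Why only the "nice" bots: the tag is just text in the page, and it is up to the crawler to look for it. A rough sketch of what a polite crawler might do, using Python's standard `html.parser` (the class and function names are made up for this example), could look like this. The standard meta name is `robots` (plural), so that is the name the sketch looks for.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex,follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def may_index(html: str) -> bool:
    """Return False if the page asks compliant crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print(may_index(page))
```

A bot that never runs a check like this will index the page regardless.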
__________________
http://www.usalug.org
USA Linux Users Group
usalug.org is an online forum for Linux users.
  #4 (permalink)  
Old 07-02-2004, 08:40 AM
WebProWorld New Member
 

Join Date: Jun 2004
Location: Monrovia, MD USA
Posts: 5
telecomworx RepRank 0
Bots, Crawlers, Spiders eeeeeek !

Yes, you both have been helpful. Thank YOU.

I will pass it along.