Google Discussion Forum Google Discussion forum is for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.

  #1 (permalink)  
Old 12-07-2003, 10:51 AM
WebProWorld Veteran
 

Join Date: Jul 2003
Location: Mass, U.S.A.
Posts: 434
Conficio RepRank 0
Default Call to arms ...

Hi there,
America has a long-standing tradition of people taking matters into their own hands. Indeed, the Greek idea of citizenship expresses the same notion: people living in a state should take things into their own hands rather than wait for the king (government) to do anything.

With no offense intended to Canadian, Australian, Indian, European, Arab, British or Chinese visitors (or any other world citizens) - isn't that something most of the world can agree on? If I consider the virtual world of the Internet - with all its open software project collaborations - I assume we can agree.

As many in this community are quite angry about the changing world of Google's algorithms and ranking system, why not support an open source search engine project to help change the dynamics of Internet search?

Have you heard of Nutch? It seems quite promising, although their web-site is brief.

As many in this forum are quite knowledgeable about search engines, their algorithms and the results, there should be a way for you to contribute. How about:
  • spreading the word and talking about it?
  • contributing to the discussion of algorithms and solutions to existing problems?
  • helping out with programming, development (for Java programmers), documentation, project identity (design), testing?
  • asking/allowing someone in your business to contribute to the project using your business resources?
  • downloading the software and testing it, building a beta site for your own web-imperium (your sites, customer sites, friends' sites) just to get a feel for the quality of the results?
  • making a money (or resource) donation, so we can see a public trial soon? Raising some money through your web-site or a marketing event?
Why should you help others? Well, more than 70% of all web servers run on open source software (Apache). Not to mention the numerous scripting languages (Perl, PHP4, Python, ...) without which the business of web applications would be so much more complicated.

Take charge, do something!
  • Imagine we could have several competing search engines that are all based on the same open source software.
  • Imagine you could codify your knowledge of how to filter the most relevant content and offer it as a search algorithm/engine to the world.
  • Imagine you could collect your knowledge in a personal directory (like your very own Bookmarks) and aggregate all those well-maintained personal directories into one giant search index.
  • Imagine...
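
To make the personal-directory idea a bit more concrete, here is a minimal sketch of how bookmark collections might be merged into one search index. Everything in it is an assumption for illustration only: the names `build_index` and `search`, the `(url, description)` bookmark format, and the choice to rank a page by how many people bookmarked it under a matching word. None of this comes from Nutch.

```python
from collections import defaultdict


def build_index(directories):
    """Merge personal directories into one inverted index.

    directories: dict mapping owner -> list of (url, description) bookmarks
                 (a hypothetical format chosen for this sketch).
    Returns: term -> url -> set of owners who described that url with the term.
    """
    index = defaultdict(lambda: defaultdict(set))
    for owner, bookmarks in directories.items():
        for url, description in bookmarks:
            for term in set(description.lower().split()):
                index[term][url].add(owner)
    return index


def search(index, query):
    """Return urls ordered by how many owners vouch for them (assumed ranking)."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for url, owners in index.get(term, {}).items():
            scores[url] += len(owners)
    # Highest score first; ties broken alphabetically so output is stable.
    return sorted(scores, key=lambda u: (-scores[u], u))


directories = {
    "ann": [("http://a.example", "open source search"),
            ("http://b.example", "java search engine")],
    "bob": [("http://a.example", "search engine project")],
}
index = build_index(directories)
results = search(index, "search")
```

A page bookmarked by two people outranks a page bookmarked by one, which is the sort of "well maintained directory" signal the bullet above hints at.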

Here is your chance! What do you think of this option? Do you know other open source search engine projects? Do you think open source can help to build better search engines? How would you use an open source search engine?

To paraphrase an American president: "Don't ask what the Internet can do for you, ask what you can do for the world wide web."

I hope you find this idea worth discussing.
K<o>
  #2 (permalink)  
Old 12-07-2003, 12:17 PM
WebProWorld 1,000+ Club
 

Join Date: Jul 2003
Location: Ottawa, Canada
Posts: 3,619
minstrel RepRank 0
Default

As I recently said elsewhere, I think it's always a good sign when we see new search engines and directories entering the field. However (and no disrespect is intended to Conficio by any of my comments below), in the case of Nutch, the use of the word "brief" to describe their website is an understatement:

Quote:
Nutch is a nascent effort to implement an open-source web search engine.
translation: it doesn't actually exist yet

Quote:
Nutch provides a transparent alternative to commercial web search engines. Only open source search results can be fully trusted to be without bias. (Or at least their bias is public.) All existing major search engines have proprietary ranking formulas, and will not explain why a given page ranks as it does. (snip) Nutch, on the other hand, has nothing to hide and no motive to bias its results or its crawler in any way other than to try to give each user the best results possible.
If they don't make some attempt to disguise their "ranking formulas (sic)", my guess is that it will take about 20 minutes max for the bottom-feeders in the internet industry to find a way to distort the results in a way that gives undeserved prominence in the listings to their sites. That's the reason current SEs need to be somewhat secretive about how they do things.

Quote:
Nutch aims to enable anyone to easily and cost-effectively deploy a world-class web search engine. This is a substantial challenge. To succeed, Nutch software must be able to:
- fetch several billion pages per month
- maintain an index of these pages
- search that index up to 1000 times per second
- provide very high quality search results
- operate at minimal cost
They left out
- provide relevant search results that are not dominated by spammers, spoofers, and porn sites

Quote:
This is a challenging proposition. If you believe in the merits of this project, please help out, either as a developer or with a donation
Did someone say "donation"? They don't actually exist yet but they are hoping for "donations"? Did I mention that I still have several hundred copies left of "How To Get Rich On The 'Net" for the low low price of $50 or two for $125?
  #3 (permalink)  
Old 12-07-2003, 04:13 PM
WebProWorld Veteran
 

Join Date: Jul 2003
Location: Mass, U.S.A.
Posts: 434
Conficio RepRank 0
Default Don't dismiss the effort that fast

Quote:
Originally Posted by minstrel
If they don't make some attempt to disguise their "ranking formulas (sic)", my guess is that it will take about 20 minutes max for the bottom-feeders in the internet industry to find a way to distort the results in a way that gives undeserved prominence in the listings to their sites. That's the reason current SEs need to be somewhat secretive about how they do things.
In encryption and security circles the prevailing opinion is "If it is not a published algorithm, don't trust it". I think this could be applied to search engines as well. I don't think that secrecy is all that helpful (reverse engineering is too powerful), and if secrecy is the only way to get decent ranking systems now, let's think of a system that can't be beaten by the "bottom feeders".

How about community voting on the relevancy of sites? Make sure you identify each voter. Ask them for a small fee (like $5 for a year - small enough to be rolled into your ISP offer, yet enough to make it too expensive to "buy" votes in bulk). Limit the number of votes, so no one can spam the voting process and people value each vote they give. Reserve a certain percentage of results for new and random sites, so the not so well known have a chance too.
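
A rough sketch of that voting scheme, just to show it is not hard to express. All names and numbers here are hypothetical (`VoteLedger`, the cap of 3 votes, the 20% random share); the idea above gives no concrete figures, and this is not how Nutch or any existing engine works. It assumes each voter already has a verified ID, for example from a paid membership.

```python
import random
from collections import Counter, defaultdict

MAX_VOTES_PER_VOTER = 3   # assumed cap; no number is given in the proposal
RANDOM_SHARE = 0.2        # assumed fraction of result slots for random picks


class VoteLedger:
    """Counts relevancy votes per (query, url), capped per identified voter."""

    def __init__(self, max_votes=MAX_VOTES_PER_VOTER):
        self.max_votes = max_votes
        self.votes_used = Counter()           # voter_id -> votes cast so far
        self.tallies = defaultdict(Counter)   # query -> url -> vote count
        self.seen = set()                     # (voter_id, query, url) already cast

    def vote(self, voter_id, query, url):
        """Record one vote; return False if it is a duplicate or over the cap."""
        key = (voter_id, query, url)
        if key in self.seen or self.votes_used[voter_id] >= self.max_votes:
            return False
        self.seen.add(key)
        self.votes_used[voter_id] += 1
        self.tallies[query][url] += 1
        return True


def rank_with_random_slots(ledger, query, candidates, size, rng=None):
    """Order candidates by votes, keeping some slots for random other sites."""
    rng = rng or random.Random()
    tally = ledger.tallies.get(query, Counter())
    by_votes = sorted(candidates, key=lambda u: -tally[u])  # stable for ties
    n_random = int(size * RANDOM_SHARE)
    top = by_votes[: size - n_random]
    rest = by_votes[size - n_random:]
    picks = rng.sample(rest, min(n_random, len(rest)))
    return top + picks


ledger = VoteLedger(max_votes=2)
first = ledger.vote("alice", "nutch", "a.example")
duplicate = ledger.vote("alice", "nutch", "a.example")
second = ledger.vote("alice", "nutch", "b.example")
over_cap = ledger.vote("alice", "nutch", "c.example")
ledger.vote("bob", "nutch", "b.example")
sites = ["a.example", "b.example", "c.example", "d.example", "e.example"]
page = rank_with_random_slots(ledger, "nutch", sites, size=5, rng=random.Random(0))
```

The cap is what makes each vote worth something, and the reserved random slots are what give the not so well known sites their chance.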

Quote:
Originally Posted by minstrel
They left out
- provide relevant search results that are not dominated by spammers, spoofers, and porn sites
How about quality = relevant - spammers - spoofers - porn sites (although those are popular search keywords on the web - so exclusion would be some form of censorship)?

Minstrel, think of it as basic research. If you don't develop a syringe, you won't be able to give people medication in the most powerful way.

The ranking algorithm, while the most obvious part, is only one piece of the challenge of building a search engine. The ability to query massive and scalable databases and to collect the URLs, as well as all the data the ranking algorithm needs, is more than half of the problem.

Also, not every search engine needs to cover the world. Open source search technology allows you to build your own dedicated search engine at little cost, and that can be as powerful as the availability of a cheap web server (Apache), which let millions of people build their own web sites.

Remember, it is open source. You have the ability to change and extend it, just as the Apache web server has been extended by mod_perl or mod_php4. Make up your own ranking scheme and compete with the others.

Quote:
Originally Posted by minstrel
They don't actually exist yet but they are hoping for "donations"?
Yes! That is the way things can be done. Thousands of people have donated a small amount to their favorite candidate for president in the USA in recent months. The candidates aren't actually presidents yet, but they hope for donations in order to pay for their campaigns. What is wrong with that?

By the way, they don't ask for money to enrich themselves. They ask for money to do a world wide community service - developing an open source search engine. They can and do donate their own time and effort, but they also need a server farm and a lot of bandwidth if they want to demonstrate their work publicly.

As frustration with current search engines runs high, a small donation - let's say $10 or $20 - should do no harm to your wealth, but could go a long way to give us change and a better search engine future.

K<O>
  #4 (permalink)  
Old 12-07-2003, 04:31 PM
WebProWorld 1,000+ Club
 

Join Date: Jul 2003
Location: Ottawa, Canada
Posts: 3,619
minstrel RepRank 0
Default

Well, I wish Nutch luck but I won't be sending them any money any time soon. For one thing, there are other directories and search engines that are relatively new and relatively small and they aren't asking for donations. For another, I'm not personally a huge fan of open source projects. And for another, I'm not unhappy with the way free enterprise/capitalism has served us to date - I may be in a minority in these forums in that respect although I'm not convinced of that.